Fourth Joint Sheffield Conference
on Chemoinformatics - a Report
June 18-20, University of Sheffield, UK
Dr. Wendy Warr,
Editorial Advisor and columnist of QSAR World, reports on the recently
concluded Fourth Joint Sheffield Conference on Chemoinformatics from the
University of Sheffield, UK. Over to Wendy...
The conference, sponsored by the Chemical Structure Association Trust and the Molecular Graphics and Modelling Society and organized by the University of Sheffield chemoinformatics research group, takes place every three years in the year preceding the Noordwijkerhout International Chemical Structures Conference.
This fourth conference was attended by 230 delegates, the number being
set by the size of the dining facilities at Chatsworth House, the site
of the superb conference outing where we were shown some of the state
rooms and enjoyed an excellent dinner. It is clear that the meeting has
become very popular since all places were taken by the end of the early
registration period, with delegates coming from Australia, Austria,
Belgium, Cyprus, Denmark, France, Germany, Hungary, India, Iran,
Ireland, Italy, Netherlands, Poland, Serbia, Spain, Sweden,
Switzerland, Ukraine, the United Kingdom and the United States.
Twenty-four papers were presented, in
sessions entitled structure-based design, new algorithms and
techniques, deriving structure-activity relationships, clustering, and
QSAR and ADMET. More than sixty posters were presented. In this report
I am summarizing only the QSAR-related papers, which means I am obliged
to omit some of the material that I myself found most interesting. It
is a shame to have to ignore, for example, the excellent paper by Andy
Good on the defects of enrichment studies in the comparison of virtual
screening (i.e. docking) tools. It is unfortunate that I have to gloss
over controversial comments from Anthony Nicholls ("docking sucks", and
"you cannot calculate binding energy") and his attack on Richards and
Ballester's Ultrafast Shape Recognition. Indeed, Nicholls' own paper
was controversial in itself.
Nicola Richmond of GlaxoSmithKline presented
a fast, novel, graph-matching algorithm, based on the comparison of
distance degree sequences. The algorithm matches pairs of nodes, one
from each graph, by solving the linear assignment problem. The graph
similarity is then given by the minimum cost associated with the
optimal set of matching pairs of nodes. By representing molecules as 2D
topological pharmacophores, Richmond has adapted the algorithm to rank
a corporate collection against a query molecule of interest, and to
cluster the ranked list into groups of compounds that have identical
chemical graphs. The clustering component has a useful visualization
facility. The highest ranked compounds correspond to the analogues of
the query; families of "lead hops" follow. This unsupervised approach
is not a substitute for substructure search but it is fast and it may
produce a new template around which a chemist can search. It can follow
GSK's automated high-throughput screening (HTS) process to recover not only
families of compounds on which to build structure-activity
relationships, but also hits missed by HTS.
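
To make the matching step concrete, here is a minimal sketch of the idea rather than Richmond's implementation. It assumes plain unlabelled graphs stored as adjacency lists (not the 2D topological pharmacophores used at GSK), takes the cost of pairing two nodes to be the sum of absolute differences between their distance degree sequences, and pads the smaller graph with dummy nodes; all three choices are assumptions made for illustration. The function names are hypothetical. For brevity the linear assignment problem is solved by exhaustive search, which is exact but only practical for very small graphs.

```python
from collections import deque
from itertools import permutations


def distance_degree_sequence(adj, node):
    """Return [n1, n2, ...], where nk is the number of nodes at distance k from `node`."""
    dist = {node: 0}
    queue = deque([node])
    while queue:  # breadth-first search gives shortest path lengths
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    counts = [0] * max(dist.values())
    for d in dist.values():
        if d > 0:
            counts[d - 1] += 1
    return counts


def node_cost(seq_a, seq_b):
    """Assumed cost: sum of absolute differences of the zero-padded sequences."""
    n = max(len(seq_a), len(seq_b))
    a = seq_a + [0] * (n - len(seq_a))
    b = seq_b + [0] * (n - len(seq_b))
    return sum(abs(x - y) for x, y in zip(a, b))


def matching_cost(adj_a, adj_b):
    """Minimum total cost of pairing every node of one graph with a node of the other."""
    seqs_a = [distance_degree_sequence(adj_a, n) for n in adj_a]
    seqs_b = [distance_degree_sequence(adj_b, n) for n in adj_b]
    # Assumption: pad the smaller graph with dummy nodes that have empty sequences.
    size = max(len(seqs_a), len(seqs_b))
    seqs_a += [[]] * (size - len(seqs_a))
    seqs_b += [[]] * (size - len(seqs_b))
    cost = [[node_cost(a, b) for b in seqs_b] for a in seqs_a]
    # Exhaustive linear assignment: try every one-to-one pairing of nodes.
    return min(
        sum(cost[i][p[i]] for i in range(size))
        for p in permutations(range(size))
    )


# Two toy graphs: a three-node path and a triangle.
path = {0: [1], 1: [0, 2], 2: [1]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

print(matching_cost(path, path))      # identical graphs give zero cost
print(matching_cost(path, triangle))  # differing graphs give a positive cost
```

A lower cost means more similar graphs, so ranking a collection against a query amounts to sorting by this value. For graphs of realistic size the exhaustive search would be replaced by a polynomial-time assignment solver such as `scipy.optimize.linear_sum_assignment`.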
Enriched scaffolds in HTS data sets can be
identified by clustering on substructure and then extracting the
maximal common substructure (MCS) for each cluster. However, if
clustering is performed without reference to the assay data, the
resulting scaffolds are unlikely to show optimal enrichment for the
assay in question. Martin Packer of AstraZeneca has developed a method
for locating scaffolds with high enrichment factors, using a
hierarchical search strategy. Molecules encoded by substructure are
partitioned into N clusters and for each cluster, M hierarchical
clusters are generated. The MCS is extracted from each cluster and an
enrichment factor is computed for it. The procedure is iterated by
setting M := M - 1. The method was applied to a 540,000-compound
in-house kinase data set and 6,737 actives were partitioned into 200
clusters. The AstraZeneca collection contains many kinase series, so a
Bonferroni correction was applied to reduce the chance of generating a
spurious result. The hierarchical nature of the search means that
structure-activity relationships emerge for the most enriched
scaffolds. Emergent SAR was found for a quinazoline scaffold:
substitution at the 7-position enhanced enrichment.
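
The two statistical ingredients of this approach can be sketched as follows. This is an illustration under stated assumptions, not Packer's code. The enrichment factor is taken to be the usual ratio of the hit rate among compounds containing a scaffold to the hit rate of the whole data set, and the Bonferroni correction is applied in its simplest form, dividing the significance level by the number of scaffolds tested. The function names, the cluster counts and the p-values are hypothetical; only the data set size (540,000) and the number of actives (6,737) come from the talk.

```python
def enrichment_factor(scaffold_actives, scaffold_size, total_actives, total_compounds):
    """Hit rate among compounds with the scaffold divided by the overall hit rate."""
    scaffold_rate = scaffold_actives / scaffold_size
    baseline_rate = total_actives / total_compounds
    return scaffold_rate / baseline_rate


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def significant_scaffolds(p_values, alpha=0.05):
    """Names of scaffolds whose p-value passes the Bonferroni-corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [name for name, p in p_values.items() if p < threshold]


# Hypothetical scaffold: 100 compounds, 50 of them active, in the reported data set.
ef = enrichment_factor(50, 100, 6737, 540000)
print(round(ef, 1))

# Hypothetical p-values for three scaffolds; the corrected threshold is 0.05 / 3.
p_values = {"scaffold_A": 1e-6, "scaffold_B": 0.03, "scaffold_C": 0.0004}
print(significant_scaffolds(p_values))
```

The correction matters because every scaffold examined is a separate chance of a false positive: the more candidates the hierarchical search generates, the stricter each individual test has to be.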