A Report on the Fourth Joint Sheffield Conference on Chemoinformatics, June 18-20, University of Sheffield, UK
Bob Clark of Tripos Discovery Informatics has been looking for answers to "the alignment problem". When no protein structure is available, researchers must fall back on pharmacophore matching or comparative molecular field analysis (CoMFA). Unfortunately, even when a structure exists, ligand binding often induces structural changes that significantly reduce the usefulness of apoprotein structures for docking and scoring. In such cases it is often better to dock into the binding site of a ligand-protein complex from which the ligand has been extracted in silico. Even when a naïve protein structure is suitable for docking, ligands can provide critical information about the location of the relevant binding site. Moreover, interactions with specific binding site residues, as revealed by bound ligands, have been used successfully to direct docking and to tailor scoring functions to specific target proteins. An extreme version of this is the use of docking to align molecules for CoMFA. Clark displayed many q² values and models docked with Surflex, but I found it hard to extract a take-home message from this talk.
Ansgar Schuffenhauer and his colleagues at Novartis have published a Pareto analysis of methods for classification of chemical structures by scaffold[1]. Rule-based methods such as that of Bemis and Murcko[2] scale linearly with the number of structures, since each molecule is classified individually, and they allow incremental updates. The classes created by such methods are more intuitive to chemists than those produced by clustering and other methods. Schuffenhauer described a variation on Bemis and Murcko’s molecular frameworks. His hierarchical classification method[3] uses molecular frameworks as the leaf nodes of a scaffold tree. Scaffolds forming the higher levels of the tree are obtained by iterative removal of rings. Prioritization rules ensure that less characteristic, peripheral rings are removed first, e.g., in order of precedence:
- Keep macrocycles with at least twelve atoms
- Choose the parent scaffold having the smallest number of acyclic linker bonds
- Retain bridged rings, spiro rings, and nonlinear ring fusion patterns in preference
- Remove rings of sizes 3, 5, and 6 first
- Remove rings with the least number of heteroatoms first.
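The precedence order above can be pictured as a lexicographic sort key. The following is a minimal Python sketch, not the published implementation: it covers only the five rules listed in this report, and the `Ring` record, its field names, and the example values are hypothetical. Ring properties are assumed to have been computed elsewhere, for instance by a cheminformatics toolkit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ring:
    """Hypothetical, precomputed description of one removable ring."""
    name: str
    size: int                 # number of ring atoms
    heteroatoms: int          # number of heteroatoms in the ring
    linker_bonds_after: int   # acyclic linker bonds left in the parent scaffold
    complex_fusion: bool      # part of a bridged, spiro, or nonlinear fusion


def removal_key(ring):
    """Sort key: rings with the smallest key are removed first.

    Tuple positions follow the order of precedence given in the report.
    False sorts before True, so a 'keep' condition is encoded as True.
    """
    return (
        ring.size >= 12,             # keep macrocycles with at least twelve atoms
        ring.linker_bonds_after,     # prefer the parent with fewest linker bonds
        ring.complex_fusion,         # retain bridged/spiro/nonlinear fusions
        ring.size not in (3, 5, 6),  # remove rings of sizes 3, 5, and 6 first
        ring.heteroatoms,            # remove rings with fewest heteroatoms first
    )


def next_ring_to_remove(rings):
    """Return the ring that the listed rules would strip first."""
    return min(rings, key=removal_key)


# Invented example values, for illustration only.
rings = [
    Ring("macrocycle", 14, 2, 0, False),
    Ring("cyclobutane", 4, 0, 0, False),
    Ring("benzene", 6, 0, 0, False),
    Ring("pyridine", 6, 1, 0, False),
]
order = [r.name for r in sorted(rings, key=removal_key)]
print(order)  # ['benzene', 'pyridine', 'cyclobutane', 'macrocycle']
```

In a real scaffold tree, properties such as the linker bond count would have to be recomputed after every removal; the sketch ranks the rings once to keep the precedence logic visible.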
Highlighting by color intensity is used to show the fraction of active compounds containing a scaffold: this immediately identifies those branches of the scaffold tree which contain active molecules. Schuffenhauer concluded that chemical series are not always equivalent to biological activity classes and what is actually desirable is continuous change in biological activity with the chemical variation in a chemical series.
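The quantity behind the color coding can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Novartis implementation: it reads "fraction of active compounds" as the share of compounds containing a scaffold that are active, and it assumes each compound has already been annotated with its molecular framework and all ancestor scaffolds. The function name and the example data are hypothetical.

```python
from collections import defaultdict


def active_fraction_by_scaffold(compounds):
    """Map each scaffold to the fraction of its compounds that are active.

    `compounds` is an iterable of (scaffolds, is_active) pairs, where
    `scaffolds` is the set of scaffold identifiers a compound contains
    (its own framework plus every ancestor in the scaffold tree).
    """
    totals = defaultdict(int)
    actives = defaultdict(int)
    for scaffolds, is_active in compounds:
        for scaffold in scaffolds:
            totals[scaffold] += 1
            if is_active:
                actives[scaffold] += 1
    return {s: actives[s] / totals[s] for s in totals}


# Invented screening results, for illustration only.
compounds = [
    ({"benzimidazole", "imidazole"}, True),
    ({"benzimidazole", "imidazole"}, True),
    ({"imidazole"}, False),
    ({"imidazole"}, False),
    ({"quinoline", "pyridine"}, False),
]
fractions = active_fraction_by_scaffold(compounds)
for scaffold in sorted(fractions):
    print(scaffold, fractions[scaffold])
```

A viewer could then map each fraction to a color intensity, so that branches rich in actives stand out.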
Evotec have applied a spectral clustering method to 2D structures[4] and have found it particularly useful in the analysis of screening data. It provides a means to quantify the degree of intermolecular similarity within a cluster and the contribution that the features of a molecule make to a cluster. These two criteria can be used to arrange molecules into clusters of chemically related molecules and quantify inter-cluster relationships so that the resultant classification scheme appears intuitive from a medicinal chemistry perspective. Mark Brewer presented applications of the method to, for example, a data set of 125 COX-2 inhibitors.
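To make the idea of a spectral approach more concrete, here is a generic Python sketch. It is not the Evotec algorithm, whose details are not given in this report. It only shows one common way in which the dominant eigenvalue of a similarity matrix can serve as a cluster-level cohesion measure while the eigenvector entries indicate how strongly each molecule belongs to that cluster. The similarity values, the membership threshold, and the function name are assumptions.

```python
import math


def dominant_eigenpair(matrix, iterations=200):
    """Estimate the dominant eigenvalue and eigenvector by power iteration.

    Assumes a symmetric matrix with non-negative entries, such as a
    pairwise similarity matrix with ones on the diagonal.
    """
    n = len(matrix)
    vector = [1.0 / math.sqrt(n)] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(row[j] * vector[j] for j in range(n)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in product))
        vector = [x / norm for x in product]
        eigenvalue = norm  # length of A*v for unit v approaches the eigenvalue
    return eigenvalue, vector


# Invented similarities: molecules 0-2 resemble each other, as do 3-4.
similarity = [
    [1.0, 0.8, 0.8, 0.0, 0.0],
    [0.8, 1.0, 0.8, 0.0, 0.0],
    [0.8, 0.8, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.6],
    [0.0, 0.0, 0.0, 0.6, 1.0],
]
cohesion, weights = dominant_eigenpair(similarity)

# Molecules with a clearly non-zero weight form the dominant cluster.
members = [i for i, w in enumerate(weights) if w > 0.1]
print(round(cohesion, 3), members)
```

In this toy case the dominant eigenvector picks out the larger, tighter group. Further clusters could be found by removing those molecules and repeating the step on what remains, although whether the published method proceeds this way is not stated here.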