New Horizons in Toxicity Prediction.
Lhasa Limited Symposium Event in Collaboration with
the University of Cambridge - February 2009
A Report by Wendy A. Warr, wendy@warr.com, http://www.warr.com
In silico tools and guidance developed by the Joint Research Centre
Andrew Worth, European Commission Joint Research Centre (JRC)
Under the Registration, Evaluation, Authorisation and Restriction of
Chemicals (REACH) regulation, information on the intrinsic properties of
substances may be generated by means other than tests, provided that
certain conditions are met. Animal testing can therefore be reduced or
avoided by replacing traditional test data with predictions or
equivalent data.
Integrated testing strategies (ITS), including in vitro assays, QSARs,
and “read-across”, can be used in a combined “non-testing” strategy,
i.e., as an alternative to the use of animals. In read-across, known
information on a property of one substance is used to predict the same
property for another substance that is considered similar. This avoids
the need to test
every substance for every endpoint, but there are conditions. QSARs are
allowed under REACH if the method is scientifically valid, the domain
is applicable, the endpoint is relevant, and adequate documentation is
provided.
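The read-across idea described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each substance is represented by a set of integer fingerprint bits, that similarity is measured with the Tanimoto coefficient, and that the prediction is a similarity-weighted mean over the nearest analogues. The names `tanimoto` and `read_across`, and the `k` and `min_similarity` parameters, are hypothetical and are not taken from Toxmatch or any JRC tool.

```python
from typing import FrozenSet, List, Optional, Tuple

# Hypothetical representation: a fingerprint is the set of "on" bit positions.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared bits divided by bits present in either set."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def read_across(
    query: Fingerprint,
    analogues: List[Tuple[Fingerprint, float]],
    k: int = 3,
    min_similarity: float = 0.5,
) -> Optional[float]:
    """Predict a property for `query` from its most similar tested analogues.

    Returns None when no analogue is similar enough, i.e. when the query
    falls outside the (assumed) applicability domain.
    """
    scored = sorted(
        ((tanimoto(query, fp), value) for fp, value in analogues),
        key=lambda pair: pair[0],
        reverse=True,
    )
    neighbours = [(s, v) for s, v in scored[:k] if s >= min_similarity]
    if not neighbours:
        return None
    total = sum(s for s, _ in neighbours)
    # Similarity-weighted mean of the analogues' measured values.
    return sum(s * v for s, v in neighbours) / total
```

Returning `None` rather than a number when no analogue passes the similarity cut-off mirrors the condition that a prediction is only acceptable within an applicable domain.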
Step 1 in the tiered ITS approach is information collection: the
European chemical Substances Information System (ESIS)4 has been
developed, together with some more specific databases. Step 2 is the
preliminary assessment of reactivity and fate. Commercial software and
databases are available, but the JRC has chosen to develop some freely
available and open-source software:5 CRAFT (Chemical Reactivity & Fate
Tool), START (Structural Alerts in Toxtree6) and the OECD Toolbox.7
CRAFT and START are being developed in collaboration with
Molecular Networks of Germany. A bewildering array of SAR and expert
system tools could be used, but again JRC has concentrated on freely
available and open-source software such as Toxtree6 and the OECD
Toolbox. Toxtree is an application which is able to classify chemicals
into modes of action and estimate toxic hazard by applying decision
tree approaches. It is being developed in collaboration with
Ideaconsult of Bulgaria. DART (Decision Analysis by Ranking Techniques)
is a flexible, user-friendly, open-source application which is able to
rank and group chemicals according to properties of concern. It is being
developed in conjunction with Talete of Italy. Toxmatch8 is a chemical
similarity tool which supports chemical grouping and read-across. In
the interests of international collaboration and harmonization, the JRC
is also contributing to the development of the OECD QSAR Toolbox.
Finally, the JRC QSAR Model Database is an inventory of information on
(Q)SAR models (also developed in collaboration with Ideaconsult). This
can be searched in various ways including substructure and similarity
search. Further guidance is needed on how to assess the adequacy of
non-testing data by weight-of-evidence approaches.
Modeling and informatics support for safety and metabolism studies in early drug discovery
Scott Boyer, AstraZeneca
Drug candidates may fail because of target pharmacology, off-target
pharmacology, or chemically related toxicity. As a generalization,
on-target pharmacology (efficacy) is easy; the other two areas (safety)
are hard. A pharmacologist’s view of cyclooxygenase 2 (COX-2) is simple;
a toxicologist’s view is complicated. The scientist must ensure that the
“obvious” compound liabilities (cardiac arrhythmias, genetic toxicity,
hepatotoxicity) are addressed, and must use hypothesis generation when
things go wrong.
Human Ether-a-go-go Related Gene (hERG) encodes an ion channel,
abnormalities in which may lead to either long or short QT syndrome,
both of them potentially fatal cardiac arrhythmias. In the in silico
prediction of hERG activity in drug discovery, the models become more
sophisticated as the pipeline is traversed. As a general strategy, most
models are tuned to enhance the negative prediction rate, since false
negatives in safety are expensive; predicted positives are tested if
they are real compounds and reprioritized if they are virtual ones.
Because the interactions in hERG mechanisms are diverse, chemical
descriptors must be diverse: a docking score for size/shape
complementarity, pharmacophore features (correct spatial orientation of
features), and traditional descriptors such as physicochemical
properties. AstraZeneca gets consistently better results from a
consensus prediction using all three.
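As a rough illustration of a consensus prediction tuned towards few false negatives, the sketch below averages the scores of three hypothetical models (docking-based, pharmacophore-based, and physicochemical) and flags a compound as a likely hERG blocker when the mean reaches a deliberately low threshold. The averaging rule, the threshold of 0.3, and all function names are assumptions made for the example; the report does not say how AstraZeneca combines its models.

```python
from statistics import mean
from typing import Sequence


def consensus_score(scores: Sequence[float]) -> float:
    """Combine per-model blocker probabilities by simple averaging (an assumption)."""
    return mean(scores)


def flag_blocker(scores: Sequence[float], threshold: float = 0.3) -> bool:
    """Flag a compound as a likely blocker.

    A low threshold trades extra false positives for fewer false negatives,
    which is the costly error type in safety prediction.
    """
    return consensus_score(scores) >= threshold


def negative_predictive_value(
    predicted: Sequence[bool], actual: Sequence[bool]
) -> float:
    """Fraction of compounds predicted 'safe' that really are safe."""
    true_neg = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    false_neg = sum(1 for p, a in zip(predicted, actual) if not p and a)
    if true_neg + false_neg == 0:
        raise ValueError("no negative predictions to evaluate")
    return true_neg / (true_neg + false_neg)
```

In practice the threshold would be chosen on a validation set by lowering it until the negative predictive value is acceptable, at the price of sending more predicted positives for testing.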
Local QSAR models are validated to make sure that they can predict the
future, but models lose their accuracy over time: as the chemical space
expands, the quality of prediction degrades. At AstraZeneca machine
learning is automated and QSAR models are used by chemists in library
design. It is very important that the system is user-friendly or the
model will not be used. The system could have predicted that the
antihistamine Allegra (fexofenadine) would be “safe” and Seldane
(terfenadine) “unsafe”. (Seldane is thought to have been involved in
more than 10 hERG-related deaths.) Results such as these are recorded in
the AstraZeneca system with a link to the full text of the original
publication to give the chemists evidence they can believe. In 2003 more
than a quarter of compounds in the company’s compound collection were
predicted to be hERG blockers. This trend has been reversed since 2004,
when multiple computational and experimental hERG methods were
introduced.
AstraZeneca’s Genetox database has non-validated data from the Chemical
Carcinogenesis Research Information System (CCRIS), FDA-approved data
from MCASE, the quality of which is roughly known, and data of known
quality generated in-house. The Ames risk assessment system runs
automatically and by “inverse QSAR” shows the chemist which substructure
is most significant for a negative or positive prediction.
There are more than 10 different pathologies for hepatotoxicity.
Reactive metabolites should be avoided if possible. AstraZeneca uses
essentially the same procedure for structural warnings as it uses for
hERG.
Glen, Boyer and colleagues have shown how predictive metabolism methods
in drug discovery projects can be used to enhance the understanding of
structure-metabolism relationships.9 In the SPORCalc system the Symyx
Metabolite database was mined to exploit biotransformation data.
Reaction center fingerprints were derived from a comparison of reactants
and products to give two fingerprint databases: all atoms in all
reactants and all reacting centers. The metabolic reaction data are then
mined by submitting a new molecule and searching for fingerprint matches
to every atom in the new molecule in both databases. A normalized
occurrence ratio derived from the fingerprint matches enables the search
results to be rank-ordered as a measure of the relative frequency of a
reaction occurring at a specific site within the submitted molecule.
Boyer has also worked with Mestres’s team on biological fingerprinting
using SHED molecular descriptors.10
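The site-ranking step in SPORCalc can be pictured with the following sketch. It is not the published algorithm: it assumes that each atom environment is reduced to a hashable fingerprint key, that the two databases are simple match counters, and that the occurrence ratio is "matches among reacting centers divided by matches among all reactant atoms", normalized by the largest ratio in the molecule. All names and the normalization choice are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, List, Tuple


def occurrence_ratios(
    atom_fingerprints: Dict[int, Hashable],
    all_atoms_db: Counter,
    reacting_centres_db: Counter,
) -> Dict[int, float]:
    """Return a normalized occurrence ratio for each atom index.

    raw ratio  = count in reacting-centre database / count in all-atoms database
    normalized = raw ratio / largest raw ratio in the molecule (assumed scheme)
    """
    raw: Dict[int, float] = {}
    for index, fingerprint in atom_fingerprints.items():
        seen = all_atoms_db[fingerprint]  # Counter returns 0 for unseen keys
        raw[index] = reacting_centres_db[fingerprint] / seen if seen else 0.0

    top = max(raw.values(), default=0.0)
    if top == 0.0:
        return {index: 0.0 for index in raw}
    return {index: ratio / top for index, ratio in raw.items()}


def rank_sites(ratios: Dict[int, float]) -> List[Tuple[int, float]]:
    """Order atom indices from most to least likely site of metabolism."""
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)
```

Dividing by the count in the all-atoms database corrects for environments that are simply common in the database, so a frequently seen but rarely reacting atom type is not ranked highly.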
Hypothesis generation is critical for rapid problem solving.
Boyer’s final comments concerned physicochemical properties. In work as
yet unpublished, he and Tudor Oprea have used the maximum recommended
therapeutic dose (MRTD) data and “classes” from Matthews et al. (2004),
now available in DSSTox11. The MRTD classes were defined as low
(active), medium (marginal) and high (inactive). The classes were
compared in terms of logP and volume of distribution. Low MRTD was
indicative of toleration problems. Low MRTD drugs are more lipophilic,
interact with more targets, and are more widely distributed.
Optimization of ligand efficiency is important in lead selection.
In summary, QSARs should be as accurate as the data will allow, should
reflect a testable endpoint, and should be supported by interpretations
and past experience. Data mining should reflect summary data in terms of
structure, and help develop focused hypotheses and experiments. Control
of physicochemical properties is critical. In the discussion session
after his talk, Boyer remarked that logP estimation is pretty good, but
pKa estimation is pretty poor. Unfortunately, logD, which is what
matters, depends on pKa.
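The dependence of logD on pKa can be made concrete with the standard textbook relationship for a monoprotic compound, under the simplifying assumption that only the neutral species partitions into octanol: for an acid, logD = logP - log10(1 + 10^(pH - pKa)), and for a base the exponent is (pKa - pH). The function name `log_d` and its parameters are illustrative and are not from the talk.

```python
import math


def log_d(log_p: float, pka: float, ph: float = 7.4, acid: bool = True) -> float:
    """Estimate logD at a given pH for a monoprotic acid or base.

    Assumes only the neutral form partitions into the organic phase,
    so ionization can only lower logD relative to logP.
    """
    exponent = (ph - pka) if acid else (pka - ph)
    return log_p - math.log10(1.0 + 10.0 ** exponent)


if __name__ == "__main__":
    # Same logP, pKa estimates one unit apart: compare the resulting logD values.
    for estimated_pka in (4.4, 5.4):
        print(estimated_pka, round(log_d(3.0, estimated_pka), 3))
```

For a compound that is mostly ionized at the pH of interest, an error of one unit in the estimated pKa carries through as roughly one unit of error in logD, which is why a poor pKa estimate undermines an otherwise good logP estimate.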