New Horizons in Toxicity Prediction.
Lhasa Limited Symposium Event in Collaboration with
the University of Cambridge - February 2009
A Report by Wendy A. Warr, wendy@warr.com, http://www.warr.com

Models and databases for genetic/carcinogenic toxicity
Romualdo Benigni, Istituto Superiore di Sanità (ISS), Rome, Italy
In the framework of a collaboration between the ISS and the European
Chemicals Bureau (ECB), a series of non-commercial (Q)SARs for
mutagenicity and carcinogenicity has been evaluated [23]. These include
structure alerts and QSARs for congeneric classes of chemicals.
Structure alerts are a coarse-grained approach to SAR, whereas QSARs
are fine-tuned.
Knowledge about mechanisms of action, as exemplified by structure
alerts, is routinely used in SAR assessment in a regulatory context. In
addition, alerts form the basis of popular commercial systems such as
Derek for Windows. Benigni and co-workers identified four structural
alert models as particularly promising [24-27]. The four did not differ
greatly in their performance. In general databases of chemicals, the
alerts appear to agree with rodent carcinogenicity data for around 65%
of chemicals and with Salmonella mutagenicity data for around 75%.
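The agreement figures quoted above are concordance values: the share of
chemicals for which the alert call (fired or not fired) matches the
experimental call (active or inactive). A minimal Python sketch of that
calculation follows. The function name and the eight-chemical data set
are hypothetical and purely illustrative; they are not taken from the
survey.

```python
def concordance(alert_fired, observed_active):
    """Return the fraction of chemicals where the alert call matches the assay call."""
    if len(alert_fired) != len(observed_active):
        raise ValueError("both lists must describe the same chemicals")
    if not alert_fired:
        raise ValueError("at least one chemical is required")
    matches = sum(1 for a, o in zip(alert_fired, observed_active) if a == o)
    return matches / len(alert_fired)


# Hypothetical toy data, one entry per chemical (not from the survey).
alert_fired = [True, True, False, False, True, False, True, False]
observed_active = [True, False, False, True, True, False, True, False]

agreement = concordance(alert_fired, observed_active)
print(f"agreement: {agreement:.0%}")
```

Concordance alone does not show whether the disagreements are missed
actives or false alarms, so in practice it is usually reported together
with sensitivity and specificity.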
The alert-based models do not seem to discriminate equally well between
active and inactive chemicals within individual chemical classes. Thus,
their main role is in preliminary or large-scale screening. They are
excellent tools for coarse-grained characterization of chemicals, for
example description of sets of chemicals, preliminary hazard
characterization, category formation, and priority setting
(enrichment). A priority for future research is the expansion of
structural alerts to include alerts for nongenotoxic carcinogens.
Based on the experience gathered from the above survey of structure
alerts, a rule base for mutagens and carcinogens has been designed and
implemented in Toxtree 1.50 [6]. It uses a structure-based approach
consisting of a new compilation of structure alerts, for both
genotoxicity and nongenotoxicity. It also offers three mechanistically
based QSARs for congeneric classes (aromatic amines and aldehydes).
In the same survey, local QSARs for congeneric classes were shortlisted
based on the following criteria: interpretability from a scientific
(mechanistic) point of view, good internal statistics, and a defined
applicability domain. A crucial point is that of “validation”. It is
generally accepted that the gold standard is to test a model on a set of
chemicals not used in its derivation; in practice, however, many
investigators use statistical procedures to generate artificial test
sets, for example by splitting the available chemicals into training and
test sets. In this survey, by contrast, the shortlisted QSARs were
challenged to predict the activity of external sets of chemicals that
the models' authors had never considered.
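The difference between goodness of fit and external prediction can be
illustrated with a small Python sketch. It fits a one-descriptor
least-squares line on a training set and then scores the same line on
chemicals that played no part in the fit. All names and numbers are
hypothetical toy values chosen for illustration; they do not reproduce
any model from the survey.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(y_true, y_pred):
    """Coefficient of determination of predictions against observations."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical training set: one descriptor, one potency value per chemical.
train_x = [1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [1.1, 1.9, 3.2, 3.9, 5.1]

# Hypothetical external set: chemicals never used to derive the model.
ext_x = [1.5, 2.5, 4.5]
ext_y = [2.5, 2.0, 3.5]

slope, intercept = fit_line(train_x, train_y)
r2_train = r_squared(train_y, [slope * x + intercept for x in train_x])
r2_external = r_squared(ext_y, [slope * x + intercept for x in ext_x])

print(f"fit on training set:  R2 = {r2_train:.2f}")
print(f"external prediction:  R2 = {r2_external:.2f}")
```

In this toy case the line fits the training chemicals closely yet
predicts the external chemicals poorly, which is the situation an
external challenge is designed to expose.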
Benigni presented tables summarizing the external prediction outcomes
for regression-based models (i.e., QSAR models for potency) and for
discriminant models (i.e., QSAR models for activity). The two tables
also reported parameters for goodness of fit and the results of
different internal validations on the training set. In summary, all the
shortlisted local QSARs are scientifically interpretable and have good
internal statistics, but they vary in their external predictivity. In
QSARs for potency the predictions are 30–70% correct, and in QSARs for
activity the predictions are 70–100% correct. Estimating intervals is
more reliable than estimating points. In addition, it appears that
internal validation measures do not correlate with external
predictivity [28].
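One way to read the remark about intervals and points is sketched below
in Python. The same set of continuous potency predictions is scored
twice: once as point estimates (correct if within a tolerance of the
observed value) and once as class assignments (correct if on the same
side of an activity threshold). The scoring rules, the tolerance, the
threshold and the data are all assumptions made for this illustration;
the report does not state how "correct" was defined in the survey.

```python
def fraction_within_tolerance(observed, predicted, tolerance):
    """Share of predictions that land within +/- tolerance of the observed value."""
    hits = sum(1 for o, p in zip(observed, predicted) if abs(o - p) <= tolerance)
    return hits / len(observed)


def fraction_same_class(observed, predicted, threshold):
    """Share of predictions that fall on the same side of the threshold as observed."""
    hits = sum(
        1 for o, p in zip(observed, predicted)
        if (o >= threshold) == (p >= threshold)
    )
    return hits / len(observed)


# Hypothetical potency values (arbitrary units), not taken from the survey.
observed = [1.2, 2.8, 0.4, 3.5, 2.1, 0.9]
predicted = [1.9, 2.2, 0.8, 2.6, 2.9, 1.6]

point_score = fraction_within_tolerance(observed, predicted, tolerance=0.5)
class_score = fraction_same_class(observed, predicted, threshold=2.0)

print(f"correct as point estimates: {point_score:.0%}")
print(f"correct as activity classes: {class_score:.0%}")
```

With these toy numbers most point estimates miss the tolerance while
every class assignment is right, because a class is a wide interval and
tolerates errors that a point estimate does not.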
Mechanistically based models should be preferred, since they give a
common ground for modelers, toxicologists and regulators, provide an
additional tool for minimizing chance correlations, and yield
intelligible information for synthesizing safer chemicals.
Unfortunately, existing local, mechanistic QSARs are limited in number,
and a mechanistic understanding of many human health effects is not
possible at this time. In many instances there is no alternative to
models for noncongeneric chemicals that aim to model “all” chemical
classes simultaneously. There are many commercial systems of this type.
Often they use non-mechanistically based descriptors and offer no
mechanistic interpretation. They are mostly validated through internal
statistics alone. Independent external validation studies of these
models have pointed to great variability in their predictivity across
different regions of chemical space.
The recent progress in the technology and availability of chemical
relational databases provides new opportunities for QSAR modeling [29].
New fine-tuned QSARs can be created by intelligent interrogation of
databases. For example, Benigni proposed a published QSAR model for the
mutagenicity of α,β-unsaturated aldehydes to the European Food Safety
Authority’s FLAVIS group for their priority setting of α,β-unsaturated
carbonyls [30]. Since ketones were not considered in the paper,
databases were interrogated and data on their mutagenicity were
retrieved. This permitted the generation of a new mechanistically based
QSAR model for the mutagenicity of the α,β-unsaturated ketones
(Benigni, unpublished).
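Interrogating a relational database for one chemical class, as
described above, amounts to a filtered query. The Python sketch below
uses an in-memory SQLite database to show the idea. The table layout,
column names, class labels and compound records are hypothetical
placeholders; they do not describe the databases actually used or any
real assay results.

```python
import sqlite3


def fetch_class_records(conn, chem_class):
    """Return (compound, ames_positive) rows for one chemical class, sorted by name."""
    cursor = conn.execute(
        "SELECT compound, ames_positive FROM mutagenicity "
        "WHERE chem_class = ? ORDER BY compound",
        (chem_class,),
    )
    return cursor.fetchall()


# Build a tiny in-memory database with a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mutagenicity ("
    "compound TEXT, chem_class TEXT, ames_positive INTEGER)"
)
conn.executemany(
    "INSERT INTO mutagenicity VALUES (?, ?, ?)",
    [
        # Placeholder records, not real assay outcomes.
        ("aldehyde_A", "unsaturated_aldehyde", 1),
        ("ketone_B", "unsaturated_ketone", 0),
        ("ketone_A", "unsaturated_ketone", 1),
        ("amine_A", "aromatic_amine", 1),
    ],
)

ketones = fetch_class_records(conn, "unsaturated_ketone")
for compound, ames_positive in ketones:
    print(compound, "positive" if ames_positive else "negative")
```

The retrieved rows would then serve as the training set for a
class-specific model. A real chemical database would normally select
the class by substructure search rather than by a stored text label.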