New Horizons in Toxicity Prediction.
Lhasa Limited Symposium Event in Collaboration with
the University of Cambridge - February 2009
A Report by Wendy A. Warr, wendy@warr.com, http://www.warr.com
Recent developments in toxico-cheminformatics: supporting a new paradigm
for predictive toxicology
Ann Richard, EPA
“A major focus for the future of computational toxicology will be
integration and analysis of large data sets. The current state of
toxicity databases is something of a mess. There are a number of
databases, each with differing content, architecture, and
searchability, that makes the task of integration extremely
difficult.” Lawrence Marnett, editorial in Chemical Research in
Toxicology.
The Distributed Structure Searchable Toxicity (DSSTox) public database
and website [11] provide a public forum for publishing downloadable,
structure-searchable, standardized chemical structure files associated
with toxicity data. The data are put into a standardized data model that
makes them easier to manipulate. Data are deposited in PubChem: 11
DSSTox “bioassays” are already in PubChem. Structure search is possible
in DSSTox and there are links out to other resources: ChemSpider,
PubChem, the EPA Aggregated Computational Toxicology Resource (ACToR),
Lazar in silico tox [12], the National Toxicology
Program (NTP), the National Center for Biotechnology Information
(NCBI), and the European Bioinformatics Institute Outstation of the
European Molecular Biology Laboratory (EMBL-EBI). EPA is linking people
to information, with chemical structure as the key, and is working
toward a public toxico-chemogenomics capability by chemical indexing of
EMBL-EBI and links to NCBI Gene Expression Omnibus (GEO). Structure and
similarity searches can be used to produce a meta data set for a given
chemical.
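As a rough illustration of how a similarity search can pull together
records for related chemicals, the sketch below ranks a small library
against a query using the Tanimoto coefficient. It is a minimal example
under stated assumptions, not the DSSTox implementation: the
fingerprints are represented as plain sets of feature indices, and the
chemical names, bit values, and the 0.5 threshold are all hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two sets of fingerprint features."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def similarity_search(query, library, threshold=0.5):
    """Return (name, score) pairs scoring at or above threshold, best first."""
    scored = ((name, tanimoto(query, fp)) for name, fp in library.items())
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical fingerprints: each integer stands for one structural feature.
library = {
    "chem_A": frozenset({1, 4, 7, 9}),
    "chem_B": frozenset({1, 4, 7, 12}),
    "chem_C": frozenset({2, 3, 5}),
}
query = frozenset({1, 4, 7, 9})

hits = similarity_search(query, library, threshold=0.5)
for name, score in hits:
    print(f"{name}: {score:.2f}")
```

In practice the names returned by such a search would serve as keys for
collecting the associated toxicity records from each linked resource.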
A National Academy of Sciences (NAS) panel has called for a major shift
in how EPA assesses the toxicity of chemicals [1]. In 2007, EPA launched
the ToxCast program [13] to develop a cost-effective approach for
prioritizing the toxicity testing of large numbers of chemicals in a
short period of time. Using data from state-of-the-art high-throughput
screening (HTS) bioassays developed in the pharmaceutical industry,
ToxCast is building computational models to forecast the potential
human toxicity of chemicals. The goal is to derive “signatures” from
in vitro and in silico assays to predict in vivo endpoints.
In its first phase, ToxCast is profiling over 300 well-characterized
chemicals (primarily pesticides) in over 400 HTS endpoints. Various
chemical classes and diverse mechanisms of action are included. In vivo
data have been extracted from PDF files, TIF files, etc., and put into a
relational database, ToxRefDB. ToxCast will have millions of dollars'
worth of in vivo chronic and cancer bioassay effects and endpoints.
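To make the idea of a relational store for in vivo results concrete, the
sketch below builds a tiny in-memory database and queries it by organ.
The three-table layout, the column names, and every row of data are
hypothetical placeholders chosen for illustration; the report does not
describe the actual ToxRefDB schema.

```python
import sqlite3

# Hypothetical layout: chemical -> study -> effect. Not the ToxRefDB schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE chemical (
        chem_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    );
    CREATE TABLE study (
        study_id   INTEGER PRIMARY KEY,
        chem_id    INTEGER NOT NULL REFERENCES chemical(chem_id),
        study_type TEXT NOT NULL
    );
    CREATE TABLE effect (
        effect_id INTEGER PRIMARY KEY,
        study_id  INTEGER NOT NULL REFERENCES study(study_id),
        organ     TEXT NOT NULL,
        finding   TEXT NOT NULL
    );
""")

# Placeholder rows, invented for the example.
conn.executemany("INSERT INTO chemical VALUES (?, ?)",
                 [(1, "pesticide_X"), (2, "pesticide_Y")])
conn.executemany("INSERT INTO study VALUES (?, ?, ?)",
                 [(10, 1, "chronic"), (11, 2, "cancer")])
conn.executemany("INSERT INTO effect VALUES (?, ?, ?, ?)",
                 [(100, 10, "liver", "increased organ weight"),
                  (101, 10, "liver", "non-neoplastic histopathology"),
                  (102, 11, "kidney", "increased organ weight")])


def chemicals_with_effect(connection, organ):
    """List the distinct chemicals with any recorded effect in the organ."""
    rows = connection.execute("""
        SELECT DISTINCT c.name
        FROM chemical AS c
        JOIN study  AS s ON s.chem_id  = c.chem_id
        JOIN effect AS e ON e.study_id = s.study_id
        WHERE e.organ = ?
        ORDER BY c.name
    """, (organ,))
    return [row[0] for row in rows]


liver_chemicals = chemicals_with_effect(conn, "liver")
print(liver_chemicals)
```

Once study results are in tables like these, a question such as "which
chemicals show liver effects" becomes a single join rather than a search
through scanned documents.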
ToxRefDB has been used in profiling of liver effects for pesticides.
Liver non-neoplastic histopathology and increased organ weight are
often associated with tumors and cancer. The activity profile of a
compound is the refined “endpoint” for structure-activity relationship
(SAR) modeling [14]. Nine EPA contracts provide chemical procurement;
hundreds of biochemical, cellular, tissue, and genomic assays; model
organisms; and the capacity to screen up to 10,000 chemicals.
SAR concepts are being incorporated into ToxCast. The system holds
chemical structures, HTS data (“fast biology”), and bioassay (in vivo)
data (“slow biology”). We have to use “fast biology” to begin to
address the backlog of untested chemicals.
One SAR approach to toxicity prediction is global modeling and another
is chemical class-based modeling. A third approach, using the
bioactivity profile of a structure class, provides richer information.
Chemical structure classes are identified by clustering according to
activity and mechanism. Differences in activity profiles can
discriminate within a structure class; a bioactivity profile class can
be projected onto multiple chemical classes. This gives potentially
broader coverage of chemical space and implies mechanistic similarity.
HTS assay data, positive or negative, are incorporated as biological
“descriptors”. In vivo activity clusters can also be used.
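The use of assay results as biological “descriptors” can be sketched as
follows: each chemical gets a binary hit/no-hit profile across a set of
assays, and chemicals are grouped when their profiles are similar
enough. This is a minimal sketch under stated assumptions, not the
ToxCast method: the chemical names, the five-assay profiles, the
Tanimoto measure, the single-linkage grouping, and the 0.6 threshold
are all hypothetical choices made for illustration.

```python
from itertools import combinations


def profile_similarity(p, q):
    """Tanimoto similarity between two binary assay-hit profiles."""
    both = sum(1 for x, y in zip(p, q) if x and y)
    either = sum(1 for x, y in zip(p, q) if x or y)
    return both / either if either else 0.0


def cluster_profiles(profiles, threshold=0.6):
    """Single-linkage grouping: link chemicals whose similarity >= threshold."""
    parent = {name: name for name in profiles}

    def find(name):
        # Follow parent links to the group representative, compressing paths.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(profiles, 2):
        if profile_similarity(profiles[a], profiles[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in profiles:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical hit (1) / no-hit (0) calls in five assays per chemical.
profiles = {
    "conazole_1":   (1, 1, 0, 1, 0),
    "conazole_2":   (1, 1, 0, 1, 1),
    "conazole_3":   (0, 0, 1, 0, 0),
    "pyrethroid_1": (0, 0, 1, 0, 0),
}

clusters = cluster_profiles(profiles)
for members in clusters:
    print(members)
```

In this toy data one member of the first structure class falls into a
different activity group from its two relatives, and that group also
contains a chemical from a second structure class, which mirrors the two
points made above about discriminating within a class and projecting a
profile class across classes.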
It could be that biology predicts chemical similarity better than
chemistry predicts biological similarity.
The 320 pesticides in ToxCast have been deposited in PubChem. As you
move away from the 320, there are fewer and fewer in vivo data. Phase I
of ToxCast is a proof of concept. Later phases will produce an affordable
science-based system for categorizing chemicals. There will be
increasing confidence as the database grows. ToxCast will identify
potential mechanisms of action, and refine and reduce animal use for
hazard identification and risk assessment.