QSAR WORLD

Expectations from a good QSAR tool in Drug Discovery Applications, December 2008

QSAR is evidently beneficial at every stage of the drug discovery pipeline, yet in its current state its reliability is limited. This commentary reviews the applications of QSAR at the drug discovery stages of compound library design, virtual screening, and lead optimization. The features required, performance expectations, and design constraints for an effective QSAR application vary significantly from stage to stage; however, certain requirements are common to all stages.

A QSAR software application could comprise three functionally distinct modules: (i) ‘Model Building Software Tools’, (ii) ‘QSAR Models’ and (iii) ‘Model Deployment and Prediction Systems’.

(i) Model Building Software Tools:
 
A QSAR model building software toolset is expected to handle common molecular structure format representations, perform structure optimizations, compute or import descriptors and property values for the input compounds, contain a set of machine learning algorithms for building QSAR Models as well as methods to validate them.  
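As a rough illustration of the "build, then validate" part of such a toolset, the sketch below fits a one-descriptor linear model and scores it with leave-one-out cross-validation. It is a minimal sketch under stated assumptions: the descriptor and activity values are invented toy data, and the function names `fit_line` and `loo_q2` are hypothetical, not taken from any QSAR product.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b with a single descriptor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def loo_q2(xs, ys):
    """Leave-one-out cross-validated q^2 = 1 - PRESS / SS_total."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without compound i, then predict compound i.
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a * xs[i] + b)) ** 2
    my = mean(ys)
    ss_total = sum((y - my) ** 2 for y in ys)
    return 1.0 - press / ss_total

# Hypothetical toy data: one descriptor value and one activity per compound.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]

slope, intercept = fit_line(descriptor, activity)
q2 = loo_q2(descriptor, activity)
print(f"slope={slope:.3f} intercept={intercept:.3f} q2={q2:.3f}")
```

A real toolset would compute many descriptors from the structures and offer several learning algorithms; the point here is only that validation refits the model on data that excludes the compound being predicted.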

While QSAR modelers use a collection of statistical and computational chemistry software tools to achieve the above functions, very sophisticated specialty QSAR modeling software products are now available. These products provide a broad selection of features useful at all stages of model building. By implementing current best practices, intelligent wizards, and guided modeling workflows, they enable modelers of all skill levels to build good models of their data. Building the best models of any data is possible only by running the data through a wide range of statistical methods and model building algorithms over a wide range of parameter sets, and by using a variety of methods to validate the models and assess their robustness over the intended ranges. Further, QSAR modeling software provides a rich interactive graphical interface for visual examination of data and results at all stages.
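The idea of running data over a range of parameter sets and keeping the best validated model can be sketched as a small parameter sweep. The example below is an assumption-laden illustration: it uses a one-descriptor k-nearest-neighbour regressor as a stand-in for any model building algorithm, invented toy data, and hypothetical function names (`knn_predict`, `loo_rmse`, `sweep`).

```python
def knn_predict(train, query, k):
    """Predict as the mean activity of the k nearest training descriptors."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - query))[:k]
    return sum(y for _, y in nearest) / k

def loo_rmse(data, k):
    """Leave-one-out root-mean-square error for a given k."""
    sq_err = 0.0
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]
        sq_err += (y - knn_predict(train, x, k)) ** 2
    return (sq_err / len(data)) ** 0.5

def sweep(data, k_values):
    """Score every candidate k and return (best_k, all scores)."""
    scores = {k: loo_rmse(data, k) for k in k_values}
    return min(scores, key=scores.get), scores

# Hypothetical toy data: (descriptor, activity) pairs.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1), (6.0, 12.0)]
best_k, scores = sweep(data, [1, 2, 3, 4])
print(best_k, scores)
```

The same loop structure extends to sweeping over algorithms as well as parameters, provided every candidate is scored with the same validation scheme.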


(ii) QSAR Models:
 
QSAR models can be categorized in a few ways. Depending on the type of end-point they are meant to predict, models can be activity, ADME, or toxicity models. Models are also either global or local: local models are designed to predict over a small chemical space, such as a target-focused library, a therapeutic class, or a certain range of end-point values, while global models are expected to cover a wider range of chemical space.

There are several ‘Pre-built’ QSAR models for ADME and Toxicity predictions available commercially and in the public domain. Most of the pre-built models available are ‘black boxes’ with little information about the applicability domain and the prediction confidence metrics available to the users of the models. There are some model providers, though, that provide abundant information about the models, such as the training compounds, range and distribution of end-point values used, the descriptor features used in building the model, the algorithms and parameter settings employed, and so on. When the training data set is packaged with the pre-built models, it allows modelers to “localize” or “globalize” them by sub-setting or adding new or in-house data and retraining these models.
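The data preparation behind "localizing" or "globalizing" a packaged model might look like the sketch below. It assumes, purely for illustration, that each training record is a dictionary with hypothetical `id` and `endpoint` keys; the retraining step itself is omitted because it depends on the modeling tool in use.

```python
def localize(training_set, low, high):
    """Keep only training records whose end-point value lies in [low, high]."""
    return [rec for rec in training_set if low <= rec["endpoint"] <= high]

def globalize(training_set, in_house):
    """Add in-house records, skipping compounds already in the training set."""
    known = {rec["id"] for rec in training_set}
    return training_set + [rec for rec in in_house if rec["id"] not in known]

# Hypothetical packaged training data and in-house measurements.
packaged = [
    {"id": "T1", "endpoint": 4.2},
    {"id": "T2", "endpoint": 6.8},
    {"id": "T3", "endpoint": 8.1},
]
in_house = [
    {"id": "T2", "endpoint": 6.9},  # duplicate compound, not added again
    {"id": "H1", "endpoint": 7.4},
]

local_set = localize(packaged, 6.0, 9.0)   # narrower end-point range
wide_set = globalize(packaged, in_house)   # wider compound coverage
print([r["id"] for r in local_set], [r["id"] for r in wide_set])
```

Sub-setting by end-point range is only one option; a therapeutic class or a structural similarity cutoff could serve as the filter in the same way.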


(iii) Model Deployment and Prediction System:

An effective model deployment system must supply information that allows users to attach confidence to predictions, such as the similarity of the input compounds to the model's training compounds in chemical and descriptor space. The system should also let users visually examine how variations on the compounds, such as R-group enumerations, affect the predictions. More often than not, the users of models are less sophisticated users of computer programs than the modelers, so greater attention to ease of use and intuitiveness is essential in designing model deployment and prediction systems.
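One common way to express similarity to the training compounds is the Tanimoto coefficient on structural fingerprints. The sketch below assumes fingerprints are already available as sets of on-bit indices (how they are generated is outside its scope); the bit sets, the 0.5 threshold, and the function names are illustrative assumptions, not recommendations.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def domain_check(query_fp, training_fps, threshold=0.5):
    """Return (max similarity to any training compound, inside-domain flag)."""
    best = max(tanimoto(query_fp, fp) for fp in training_fps)
    return best, best >= threshold

# Hypothetical fingerprints as sets of on-bit indices.
training_fps = [{1, 4, 7, 9}, {1, 4, 8}, {2, 4, 7, 9, 11}]
similar_query = {1, 4, 7}
novel_query = {3, 5, 6, 12}

print(domain_check(similar_query, training_fps))  # (0.75, True)
print(domain_check(novel_query, training_fps))    # (0.0, False)
```

A deployment system could show the similarity value next to each prediction so that users can see when a compound falls far from anything the model was trained on.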

QSAR models are commonly built and “thrown over the wall” to users, and proper ways to collect information on the performance and usage of these models seldom receive attention. Such feedback is vital for model builders to continually improve the predictive performance of the models. It also allows organizations to assess the value that QSAR applications add to their research efficiency. An effective model deployment system should focus on keeping the models updated. New data, especially data on compounds for which upstream decisions were based on QSAR model predictions, should be used to tune and improve the models as soon as it becomes available from the labs.
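A feedback loop of this kind could be as simple as logging each prediction and attaching the measured value when the lab reports it. The following is a minimal sketch assuming an in-memory log; the class names, method names, and compound identifiers are all hypothetical.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Dict, Optional

@dataclass
class PredictionRecord:
    compound_id: str
    predicted: float
    measured: Optional[float] = None  # filled in when lab data arrives

class FeedbackLog:
    """In-memory record of predictions and later measurements."""

    def __init__(self) -> None:
        self.records: Dict[str, PredictionRecord] = {}

    def log_prediction(self, compound_id: str, predicted: float) -> None:
        self.records[compound_id] = PredictionRecord(compound_id, predicted)

    def add_measurement(self, compound_id: str, measured: float) -> None:
        self.records[compound_id].measured = measured

    def rmse(self) -> Optional[float]:
        """RMSE over compounds that have both a prediction and a measurement."""
        pairs = [(r.predicted, r.measured)
                 for r in self.records.values() if r.measured is not None]
        if not pairs:
            return None
        return sqrt(sum((p - m) ** 2 for p, m in pairs) / len(pairs))

    def retraining_set(self):
        """Measured compounds that can be fed back into model building."""
        return [(r.compound_id, r.measured)
                for r in self.records.values() if r.measured is not None]

log = FeedbackLog()
log.log_prediction("CPD-001", 6.0)
log.log_prediction("CPD-002", 7.0)
log.log_prediction("CPD-003", 5.5)   # no lab result yet
log.add_measurement("CPD-001", 6.5)
log.add_measurement("CPD-002", 6.5)
print(log.rmse(), log.retraining_set())
```

Tracking the error of deployed predictions over time gives model builders a signal for when retraining is due, and gives the organization a basis for judging how much the models contribute.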

Facilitated by Strand Life Sciences Pvt. Ltd