Accuracy of Prediction
Kubinyi offers another simple explanation of the prediction paradox [1].
Even in the absence of real outliers, external prediction will be worse
than the fit because the model tries to 'fit the errors' and attempts to
explain them. Accordingly, external predictions contain both the model
error and the experimental error. When variable selection is carried out,
it is not repeated independently in the cross-validation runs;
consequently, variables that were included to 'explain the error' remain
in the model and cause wrong predictions [1]. The higher the number of
descriptors relative to the number of compounds, the greater the chance
of selecting descriptors that give high q2 values [8]. Other reasons for
overestimating q2 are redundancy in the training set or, in the case of
non-linear methods, the existence of multiple minima [8].
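The effect of leaving variable selection outside the cross-validation runs can be illustrated with a small sketch. It is not taken from the cited works; it assumes a one-descriptor linear model, leave-one-out cross-validation, the common definition q2 = 1 - PRESS/SS, and purely random (hypothetical) descriptors and activities. All function names are illustrative. With many noise descriptors and few compounds, selecting the descriptor once on all compounds would be expected to give a more optimistic q2 than repeating the selection in every run.

```python
import random
from statistics import mean


def fit_line(x, y):
    """Least-squares slope and intercept for a single descriptor."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx if sxx else 0.0
    return slope, my - slope * mx


def squared_corr(x, y):
    """Squared Pearson correlation; 0.0 if either vector is constant."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy) if sxx and syy else 0.0


def best_descriptor(X, y, rows):
    """Column index whose values correlate best with y on the given rows."""
    ys = [y[i] for i in rows]
    return max(range(len(X[0])),
               key=lambda j: squared_corr([X[i][j] for i in rows], ys))


def loo_q2(X, y, reselect):
    """Leave-one-out q2 = 1 - PRESS / SS for a one-descriptor model."""
    all_rows = list(range(len(y)))
    fixed = best_descriptor(X, y, all_rows)  # selection on ALL compounds
    press = 0.0
    for left_out in all_rows:
        train = [i for i in all_rows if i != left_out]
        # Either repeat the selection without the left-out compound,
        # or keep the descriptor that was chosen with it included.
        j = best_descriptor(X, y, train) if reselect else fixed
        slope, intercept = fit_line([X[i][j] for i in train],
                                    [y[i] for i in train])
        press += (y[left_out] - (slope * X[left_out][j] + intercept)) ** 2
    y_mean = mean(y)
    ss = sum((v - y_mean) ** 2 for v in y)
    return 1.0 - press / ss


if __name__ == "__main__":
    rng = random.Random(0)
    n_compounds, n_descriptors = 20, 200  # many descriptors, few compounds
    X = [[rng.gauss(0, 1) for _ in range(n_descriptors)]
         for _ in range(n_compounds)]
    y = [rng.gauss(0, 1) for _ in range(n_compounds)]  # pure noise
    print("q2, selection outside CV:", round(loo_q2(X, y, reselect=False), 3))
    print("q2, selection inside CV: ", round(loo_q2(X, y, reselect=True), 3))
```

Because the activities here are noise, any clearly positive q2 reflects the selection step rather than a real structure-activity relationship.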
Arthur Doweyko has also published on the
elusive nature of 3D QSAR predictions [10], but concludes: "Predictions
can be enhanced when the test set is bounded by the descriptor space
represented in the training set. Interpretation of significant
interaction regions becomes more meaningful when alignment is
constrained by a binding site."
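One simple way to read "bounded by the descriptor space represented in the training set" is a per-descriptor range check. The sketch below is an assumption for illustration, not Doweyko's procedure: it flags a test compound as inside the training space only if each of its descriptor values lies between the training minimum and maximum. The descriptor values are hypothetical.

```python
def descriptor_ranges(training_set):
    """Per-descriptor (min, max) over the training compounds."""
    return [(min(col), max(col)) for col in zip(*training_set)]


def inside_training_space(compound, ranges):
    """True if every descriptor value lies within the training range."""
    return all(lo <= value <= hi for value, (lo, hi) in zip(compound, ranges))


# Hypothetical descriptor vectors, e.g. (logP, molecular weight).
training = [[1.2, 250.0], [2.5, 310.0], [3.1, 280.0]]
test = [[2.0, 300.0], [4.0, 290.0]]

ranges = descriptor_ranges(training)
flags = [inside_training_space(c, ranges) for c in test]
print(flags)  # [True, False]: the second compound exceeds the logP range
```

A range check is deliberately crude: it ignores correlations between descriptors, so distance-based or leverage-based measures may be preferable in practice.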
At a workshop held in Setubal, Portugal in
2002, a set of principles was proposed to define the validity and
applicability domain of QSAR models. These then evolved into the OECD
principles in 2004 [11]. Paola Gramatica discusses three of these
principles in a recent publication [12], and in particular, emphasizes
the need for external validation using at least 20% of the data.
Gramatica, Tropsha and others believe that validation is absolutely
essential for the successful application and interpretation of QSAR
models [3, 13].
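Setting aside an external test set of at least 20% of the data can be sketched as below. This is a minimal illustration assuming a simple random split; the function name, seed and data set size are hypothetical, and other splitting schemes (for example, ones that balance the activity range) are not shown.

```python
import math
import random


def external_split(n_compounds, test_fraction=0.2, seed=0):
    """Randomly assign compound indices to a training and an external test set."""
    indices = list(range(n_compounds))
    random.Random(seed).shuffle(indices)
    # Round up so the test set holds at least the requested fraction.
    n_test = math.ceil(test_fraction * n_compounds)
    return sorted(indices[n_test:]), sorted(indices[:n_test])


train_idx, test_idx = external_split(48)
print(len(train_idx), len(test_idx))  # 38 10
```

The test compounds should be excluded from every modelling step, including variable selection, and used only once to assess the final model.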
The necessity for validation has been
accepted by leading journals. The policy of J. Chem. Inf. Model. on
QSAR manuscripts [14] has been adopted by other journals such as J.
Med. Chem. [15] and ChemMedChem. In part, it states: "If a new
method/theory is being reported in the paper, it should be compared and
'validated' against at least one other common data set for which a
published study exists, using at least one other method/approach and
preferably a method/approach that has been widely used in the field.
The data set should not be small... Evidence that any reported
QSAR/QSPR model has been properly validated using data not in the
training set must be provided."