Training
Training means learning a model from a dataset with known values, using some algorithm. From the training data, the algorithm derives the weights, coefficients, or other parameters that minimize a pre-determined error function and so yield the most predictive model. Depending on the algorithm, it may or may not also select the descriptors most relevant to the property of interest. If the algorithm does not select descriptors itself, one may need to pre-select the most relevant descriptors and supply them to the algorithm in order to learn statistically appropriate and interpretable models.
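As a minimal sketch of what deriving coefficients that minimize an error function can look like, the following fits a linear model by ordinary least squares, where the error function is the sum of squared differences between predicted and known values. The function names, the synthetic data, and the choice of a linear model are assumptions made for illustration only; they do not come from any particular QSAR package.

```python
import numpy as np

def train_linear_model(X, y):
    """Return coefficients (intercept last) that minimize the sum of squared errors."""
    # Append a column of ones so the intercept is learned as an extra coefficient.
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply the learned coefficients to new descriptor values."""
    A = np.column_stack([X, np.ones(len(X))])
    return A @ coef

# Hypothetical training set: two descriptors per compound and a known property value.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 2))
y_train = 3.0 * X_train[:, 0] - 2.0 * X_train[:, 1] + 0.5  # noise-free for clarity

coef = train_linear_model(X_train, y_train)
```

Because the synthetic data here is noise-free, the learned coefficients recover the values used to generate it; with real measurements they would only approximate the underlying relationship.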
A model that fits the training dataset very well may still predict new data points poorly. Such over-fitting usually yields a model that does not generalize and is therefore of little use. Learning models under ‘cross-validation’, where part of the data is held out from fitting and used only to check predictions, helps to detect and limit this pitfall.
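The contrast between training error and held-out error can be sketched as follows. This is an illustrative example under stated assumptions: the data is synthetic (a noisy straight line), the models are polynomials fitted with NumPy, and the helper names and the choice of five folds are hypothetical. A more flexible model never has a higher training error than a simpler one nested inside it, but its cross-validated error is typically worse when the extra flexibility only fits noise.

```python
import numpy as np

def training_mse(x, y, degree):
    """Mean squared error on the same points used for fitting."""
    coef = np.polyfit(x, y, degree)
    return float(np.mean((np.polyval(coef, x) - y) ** 2))

def cross_validated_mse(x, y, degree, k=5, seed=0):
    """Mean squared error on held-out points, averaged over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), k)
    fold_errors = []
    for held_out in folds:
        # Fit on every point except the held-out fold.
        train = np.setdiff1d(np.arange(len(x)), held_out)
        coef = np.polyfit(x[train], y[train], degree)
        # Score only on points the model did not see during fitting.
        residuals = np.polyval(coef, x[held_out]) - y[held_out]
        fold_errors.append(np.mean(residuals ** 2))
    return float(np.mean(fold_errors))

# Hypothetical data: a straight line plus measurement noise.
rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 30)
y = 2.0 * x + rng.normal(scale=0.3, size=x.size)

for degree in (1, 9):
    print(degree, training_mse(x, y, degree), cross_validated_mse(x, y, degree))
```

Comparing the two printed columns for each degree shows why training error alone is a poor guide to model choice: only the cross-validated figure reflects how the model behaves on data it was not fitted to.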
See Also:
Cross Validation
Cite This As:
Dogra, Shaillay K., "Training" From QSARWorld--A Strand Life Sciences Web Resource.
http://www.qsarworld.com/qsar-ml-training.php