Most computed descriptors lie on different scales, so one may want to normalize them before proceeding with further statistical analysis. As the first step of normalization, the mean of each descriptor (the mean of all values in its column) is subtracted from every value in that column. The new mean of every descriptor is therefore 0, so the mean, as a measure of the central location of the distribution of values, is now the same for all descriptors. The 'spread' or 'variation' of the data about the mean, however, is unchanged from the original data. This is handled next by scaling the values by the standard deviation.
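As a minimal sketch of the column-wise shift described above, assume the descriptors are stored as a list of rows with one column per descriptor. The function name `mean_shift_columns` and the sample values are hypothetical and serve only as illustration.

```python
from statistics import mean

def mean_shift_columns(rows):
    """Subtract each column's mean from every value in that column."""
    columns = list(zip(*rows))               # transpose: one tuple per descriptor
    means = [mean(col) for col in columns]   # mean of each descriptor
    return [[value - m for value, m in zip(row, means)] for row in rows]

# Hypothetical table: 3 compounds x 2 descriptors on very different scales.
descriptors = [
    [100.0, 0.1],
    [200.0, 0.2],
    [300.0, 0.6],
]

shifted = mean_shift_columns(descriptors)

# After the shift, every descriptor (column) has a mean of 0,
# up to floating-point rounding.
column_means = [mean(col) for col in zip(*shifted)]
```

Only the location of each column changes here. The differences between values within a column, and hence the spread, are left as they were.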
This is best illustrated with an example. Consider the numbers 1, 2, 3, 4, and 5. Their total is 15 and their mean is 3. Subtracting the mean from each value gives the transformed numbers -2, -1, 0, 1, and 2. The new total is 0, and thus the new mean is 0. Note, however, that the standard deviation is still the same as in the original data (√2, using the population formula that divides by the number of values). Dividing each value by the standard deviation then makes the new standard deviation 1.
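The arithmetic in this example can be checked with a short sketch using Python's standard `statistics` module. The variable names are illustrative. The population standard deviation (`pstdev`) is assumed because it is the one that yields √2 for these numbers; the sample formula (`stdev`) would give √2.5 instead.

```python
from statistics import mean, pstdev

values = [1, 2, 3, 4, 5]

mu = mean(values)                       # mean of the original values
shifted = [v - mu for v in values]      # mean-shifted values

# Shifting moves the mean to 0 but leaves the spread untouched.
sigma_original = pstdev(values)         # population standard deviation
sigma_shifted = pstdev(shifted)

# Dividing by the standard deviation brings the spread to 1.
scaled = [v / sigma_shifted for v in shifted]

print(shifted)
print(sigma_original, sigma_shifted)
print(mean(scaled), pstdev(scaled))
```

Which standard deviation to divide by (population or sample) is a choice for the analyst. Either is acceptable as long as it is applied consistently to all descriptors.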
Cite This As:
Dogra, Shaillay K., "Mean Shifting." From QSARWorld--A Strand Life Sciences Web Resource.