R & Chemometrics: September 2016

28 Sep 2016

Looking at the external validation statistics (SEV and SEV(C))

Whenever we transfer a calibration database from one instrument to a different type of instrument, we use some form of standardization: we scan the same samples on both instruments, apply the standardization to the database, and develop a new calibration. We then need to validate this equation with new samples scanned on the new instrument to check that the transfer was correct. However, the number of terms used in the PLS model can leave the equation underfitted or overfitted, so in some cases the validation will show a bias or a higher SEP than expected.

It is a good idea to develop the equation again, using the new spectra with their lab values as an external validation set, in order to decide how many terms to use and so prevent the calibration from being underfitted or overfitted.

Just look at the SECV and the SEV (the SEP for the external validation set) for each number of terms and make your decision.
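One possible way to turn that comparison into a rule is sketched below in Python. It is only an illustration, not what Win ISI does: the function name `smallest_adequate_terms`, the 5 % tolerance, and the SECV and SEV numbers are all assumptions made up for the example. The idea is to take the smallest number of terms whose error is within the tolerance of the minimum, once for the SECV list and once for the SEV list, and then compare the two answers.

```python
def smallest_adequate_terms(errors, tolerance=0.05):
    """Return the smallest number of terms whose error is within
    `tolerance` (as a fraction) of the lowest error in the list.

    errors[i] is the statistic (SECV or SEV) obtained with i + 1 terms.
    The 5 % default tolerance is an arbitrary, illustrative choice.
    """
    if not errors:
        raise ValueError("errors must contain at least one value")
    limit = min(errors) * (1.0 + tolerance)
    for index, value in enumerate(errors):
        if value <= limit:
            return index + 1  # terms are counted from 1
    return len(errors)  # not reached: the minimum always passes


# Illustrative numbers only (not taken from a real calibration).
secv = [0.95, 0.62, 0.50, 0.44, 0.40, 0.38, 0.37]  # cross validation
sev = [0.90, 0.60, 0.45, 0.40, 0.41, 0.43, 0.46]   # external validation

terms_cv = smallest_adequate_terms(secv)
terms_ev = smallest_adequate_terms(sev)
print("terms suggested by SECV:", terms_cv)
print("terms suggested by SEV: ", terms_ev)
```

When the SEV suggests clearly fewer terms than the SECV, as in these made-up numbers, that is a hint that the cross validation is too optimistic for the new instrument and that the shorter model is the safer choice.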

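It is important to look at the SEV and the SEV(C) (the SEV corrected for bias) at the same time, to check that there is no bias in the predictions for the validation set. If the SEV is clearly larger than the SEV(C), part of the prediction error is bias.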
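To make the relation between these statistics concrete, here is a minimal Python sketch. The definitions are assumptions based on common chemometrics usage, not a copy of the Win ISI formulas: the bias is the mean residual (lab value minus prediction; some software uses the opposite sign), the SEV is the root mean square of the residuals, and the SEV(C) is the standard deviation of the residuals after removing the bias, with n - 1 in the denominator. The function name `validation_stats` and the sample values are hypothetical.

```python
import math


def validation_stats(lab, pred):
    """Return (bias, sev, sev_c) for paired lab values and predictions.

    Assumed definitions (check them against your own software):
      bias   = mean of (lab - pred)
      sev    = sqrt(sum(residual^2) / n)
      sev_c  = sqrt(sum((residual - bias)^2) / (n - 1))
    """
    n = len(lab)
    if n != len(pred) or n < 2:
        raise ValueError("need two lists of equal length with n >= 2")
    residuals = [l - p for l, p in zip(lab, pred)]
    bias = sum(residuals) / n
    sev = math.sqrt(sum(r * r for r in residuals) / n)
    sev_c = math.sqrt(sum((r - bias) ** 2 for r in residuals) / (n - 1))
    return bias, sev, sev_c


# Hypothetical moisture values: every prediction is 0.5 too low.
lab_values = [10.0, 11.0, 12.0, 13.0]
predictions = [9.5, 10.5, 11.5, 12.5]

bias, sev, sev_c = validation_stats(lab_values, predictions)
print("bias:", bias, "SEV:", sev, "SEV(C):", sev_c)
```

With a constant offset like this one, the SEV equals the size of the bias (0.5) and the SEV(C) is zero, which is the extreme case of an error that is all bias. In real data both will be non-zero, and the gap between them shows how much the bias contributes.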

In the statistics list, Win ISI recommends 14 terms for a moisture equation, but we can clearly see that this is too many, so we can decide to use fewer. What about four? Just try it.