22 Jan. 2017

External validation to make our model more robust

As we know, cross-validation helps us in Win ISI to select the number of terms to use in the equation. But, as we saw in other posts, we can also use an external validation set. This external validation set can be designed in different ways:
  • We can split the data set into a calibration set and a validation set, and use the validation set as an external validation set. There are different ways to split the data.
  • We can use a validation set with samples from other providers, scanned on a different instrument, and with lab data from a different laboratory.
This second option is interesting because it helps us to select the number of terms so that our equation is useful for our instrument and for others, making our model more robust.
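For the first option, a random split is probably the simplest approach. The following is only a minimal Python sketch, done outside Win ISI. The function name, the 20 % validation fraction and the sample numbering are assumptions made for illustration, not something taken from the software.

```python
import random

def split_calibration_validation(sample_ids, validation_fraction=0.2, seed=0):
    """Randomly split sample IDs into a calibration and a validation set.

    Hypothetical helper: the 20 % default is an assumption, not a rule.
    """
    ids = list(sample_ids)
    rng = random.Random(seed)      # fixed seed so the split is repeatable
    rng.shuffle(ids)
    n_val = max(1, round(len(ids) * validation_fraction))
    validation = sorted(ids[:n_val])
    calibration = sorted(ids[n_val:])
    return calibration, validation

# Example with 100 made-up sample numbers (1 to 100)
calibration, validation = split_calibration_validation(range(1, 101))
print(len(calibration), len(validation))
```

A random split is only one of the possible ways. Whatever method is used, the validation samples must stay out of the calibration.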
If we export the results from the model calculation to Excel, we can see graphically which option is best.
I have done this with cake meal data (moisture), using a large validation set from another instrument, a different provider and a different laboratory. I have used two different math treatments, so I can also check which performs better:

We can see that the external validation follows a strange pattern at first, but it becomes stable from a certain number of terms. These graphs help us a lot to select a robust calibration for these two instruments, and we can do the same with other validation sets with spectra and lab data from different instruments and labs.
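The idea of "becoming stable from a certain number of terms" can also be checked numerically once the errors are exported. The sketch below is one possible rule, not the criterion used by Win ISI: it looks for the smallest number of terms from which every later external validation error stays within a relative tolerance of the error at that point. The function name, the 5 % tolerance and the error values are all hypothetical; they are not the cake meal results.

```python
def first_stable_term(errors_by_terms, tol=0.05):
    """Return the smallest number of terms (1-based) from which all later
    errors stay within `tol` (relative) of the error at that number of terms.

    Hypothetical rule for illustration; the tolerance is an assumption.
    """
    for k, ref in enumerate(errors_by_terms):
        if all(abs(e - ref) <= tol * ref for e in errors_by_terms[k:]):
            return k + 1
    return None  # only reached when the list is empty

# Made-up external validation errors for 1 to 10 terms
sep_external = [0.95, 0.60, 0.82, 0.55, 0.41, 0.40, 0.40, 0.41, 0.40, 0.41]
print(first_stable_term(sep_external))  # 5 for these made-up values
```

Running the same check for each math treatment gives a number to compare alongside the graph, but the plot should still be inspected: an error that is stable yet high is not a good choice.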