15 Mar. 2015

Selecting samples for lab analysis (part 4)

After the last post, we have a training set of 30 samples to develop a calibration and a test set of another 30 samples. With so few samples, this exercise is only a study, meant as a basis for building a model with more samples in the future. The test set is useful for tuning the model as well as we can (math treatments, number of terms, ...).
We try this with the protein parameter only: we develop a PLSR model with the training set and then use it to predict the test set. The predicted values for the test set are compared with the test set lab values to obtain the performance statistics.
The validation statistics are then compared with the regression (calibration) statistics.
Finally, a model is developed with the training and test sets combined (60 samples).
All the statistics are examined with a monitor function.
See the script and follow the tutorial on GitHub.
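As a rough illustration of the kind of validation statistics involved, here is a minimal Python sketch that compares predicted values with lab values. It is not the monitor function from the tutorial script: the function name `validation_stats`, the choice of statistics (RMSEP, bias, bias-corrected SEP, and RSQ), the sign convention for the bias (predicted minus lab), and the protein values are all assumptions made for this example.

```python
import math


def validation_stats(lab, pred):
    """Compare predicted values with lab values (hypothetical helper).

    Returns a dict with:
      bias  - mean of (predicted - lab); sign convention is an assumption
      rmsep - root mean square error of prediction
      sep   - standard error of prediction, corrected for bias
      rsq   - squared correlation between lab and predicted values
    """
    n = len(lab)
    if n != len(pred) or n < 2:
        raise ValueError("lab and pred must have the same length (>= 2)")

    residuals = [p - y for y, p in zip(lab, pred)]
    bias = sum(residuals) / n
    rmsep = math.sqrt(sum(r * r for r in residuals) / n)
    sep = math.sqrt(sum((r - bias) ** 2 for r in residuals) / (n - 1))

    mean_lab = sum(lab) / n
    mean_pred = sum(pred) / n
    sxy = sum((y - mean_lab) * (p - mean_pred) for y, p in zip(lab, pred))
    sxx = sum((y - mean_lab) ** 2 for y in lab)
    syy = sum((p - mean_pred) ** 2 for p in pred)
    rsq = (sxy * sxy) / (sxx * syy)

    return {"bias": bias, "rmsep": rmsep, "sep": sep, "rsq": rsq}


if __name__ == "__main__":
    # Made-up protein values (%), for illustration only.
    lab_values = [11.2, 12.8, 13.5, 14.1, 15.0]
    predicted = [11.5, 12.6, 13.9, 14.0, 15.4]
    for name, value in validation_stats(lab_values, predicted).items():
        print(f"{name}: {value:.3f}")
```

Running the same function on the calibration fit (training set predictions against training set lab values) gives regression statistics in the same form, which makes the comparison with the validation statistics straightforward.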
Details of the plots in the monitor function:

