21 mar 2019

Overplotting residual spectra of Training and Test sets (Good Product)

After we have developed a prediction model with a certain number of principal components (PCs), there is always a matrix of residual spectra containing the variation not explained by the model. Of course we can increase or reduce the number of PCs, but then we risk overfitting the model (bringing noise into it) or underfitting it (leaving variance of interest in the residual matrix).
 
This residual matrix is normally called "E".
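
As a rough illustration of where "E" comes from, here is a minimal sketch in Python with NumPy. It assumes a plain PCA decomposition (E = X - T P', with the mean and loadings taken from the training set); the post does not say which algorithm or software was used, so the function name `pca_residuals` and the random data are hypothetical placeholders, not the actual spectra.

```python
import numpy as np

def pca_residuals(X_train, X_new, n_components):
    """Residual matrix E of X_new for a PCA model fitted on X_train."""
    mean = X_train.mean(axis=0)
    # Loadings P: first right singular vectors of the centered training data
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    P = Vt[:n_components].T        # shape (n_variables, n_components)
    Xc = X_new - mean              # center with the training mean
    T = Xc @ P                     # scores of the new samples
    return Xc - T @ P.T            # E = X - T P'

# Synthetic stand-in data (assumption): 30 training and 8 validation "spectra"
rng = np.random.default_rng(0)
X_train = rng.normal(size=(30, 50))
X_val = rng.normal(size=(8, 50))

E_train = pca_residuals(X_train, X_train, n_components=5)
E_val = pca_residuals(X_train, X_val, n_components=5)
```

Each row of `E_train` or `E_val` is one residual spectrum, so these are the rows that would be overplotted.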
 
It is interesting to look at this matrix, especially for detecting adulterants, mistakes in the proportions of a mixture, or any other difference between the residuals of the validation samples (in this case, in theory, bad samples) and those of the training matrix.
 
In this case I overplot both sets of residuals for a model with 5 PCs (the validation residual spectra in red and the training residual spectra in blue).
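
A plot of this kind could be drawn along the following lines. This is only a sketch with Matplotlib, not the code used for the figure in this post: the helper `overplot_residuals` is a hypothetical name, and the two arrays are random placeholders standing in for the real training and validation residual matrices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

def overplot_residuals(E_train, E_val, x=None):
    """Overplot residual spectra: training in blue, validation in red."""
    if x is None:
        x = np.arange(E_train.shape[1])   # variable index if no wavelengths given
    fig, ax = plt.subplots()
    # Transpose so that each sample (row) becomes one line
    ax.plot(x, E_train.T, color="blue", linewidth=0.8)
    ax.plot(x, E_val.T, color="red", linewidth=0.8)
    ax.set_xlabel("Variable index (e.g. wavelength)")
    ax.set_ylabel("Residual")
    ax.set_title("Residual spectra: training (blue) vs validation (red)")
    return ax

# Placeholder residuals (assumption): 5 training and 3 validation samples
rng = np.random.default_rng(1)
E_train_demo = rng.normal(scale=0.01, size=(5, 100))
E_val_demo = rng.normal(scale=0.02, size=(3, 100))

ax = overplot_residuals(E_train_demo, E_val_demo)
ax.figure.savefig("residuals_overplot.png")
```

Drawing the training residuals first keeps the validation lines on top, which makes it easier to see where the red spectra leave the blue band.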


We can see interesting patterns that we must study in more detail to answer some questions: whether the model is underfitted; whether the patterns are clear enough to determine that the validation samples are adulterated or have changes in the concentrations of the mixture ingredients; and whether, for some reason, there are samples in the model that should have been treated as outliers and removed.

