In a previous post, we developed the calibration with the Training Set from
Instrument 1 without any math treatment, knowing that this was not the best
choice, and we looked at the LOO (leave-one-out) cross-validation errors to
check the performance.
Now we develop the regressions with some anti-scatter math treatments, compare
their cross-validation errors and decide which of them performs best. The
shootout also supplies a test set, so validating with it will give us a better
idea of how the calibration performs with independent data.
The first thing to do is to convert the “X1. Training” and “X1. Test” matrices
with the math treatments we want to use: SNV (Standard Normal Variate),
Detrend, SNV + Detrend and MSC. Each treated training matrix then goes into a
data frame together with Y:
> nir.train1_snv <- data.frame(X = I(X1_snv), Y = I(Y))
> nir.train1_detrend <- data.frame(X = I(t(X1_detrend)), Y = I(Y))
> nir.train1_snvdt <- data.frame(X = I(t(X1_snvdt)), Y = I(Y))
> nir.train1_msc <- data.frame(X = I(X1_msc), Y = I(Y))
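The post does not show how the treated matrices (X1_snv, X1_detrend, X1_snvdt, X1_msc) were computed. As a rough sketch only, and not necessarily the way they were obtained here, the four treatments can be written in base R as below. The toy spectra, the wavelength vector `wl` and the function names `snv`, `detrend` and `msc_simple` are assumptions made for illustration; samples are kept in rows, so no transposition is needed afterwards.

```r
# Toy spectra: 5 samples (rows) x 50 wavelengths (columns); a stand-in for X1.
set.seed(1)
wl   <- seq(1100, 2498, length.out = 50)
base <- exp(-((wl - 1700) / 200)^2) + 0.5 * exp(-((wl - 2200) / 150)^2)
X    <- t(sapply(1:5, function(i) runif(1, 0, 0.3) + runif(1, 0.8, 1.2) * base)) +
        matrix(rnorm(5 * 50, sd = 0.002), 5, 50)

# SNV: centre and scale each spectrum (row) by its own mean and standard deviation.
snv <- function(X) t(scale(t(X)))

# Detrend: remove a second-order polynomial in wavelength from each spectrum.
detrend <- function(X, wl) t(residuals(lm(t(X) ~ poly(wl, 2))))

# MSC: regress each spectrum on a reference (here the mean spectrum),
# then remove the fitted offset and divide by the fitted slope.
msc_simple <- function(X, ref = colMeans(X)) {
  t(apply(X, 1, function(s) {
    b <- coef(lm(s ~ ref))
    (s - b[1]) / b[2]
  }))
}

X_snv     <- snv(X)
X_detrend <- detrend(X, wl)
X_snvdt   <- detrend(snv(X), wl)   # SNV first, then Detrend
X_msc     <- msc_simple(X)
```

For the test set, MSC would normally reuse the mean spectrum of the training set as `ref`, so that both sets are corrected against the same reference.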
Now we can develop the PLS regressions:
> library(pls)
> mod1_snv <- plsr(Y ~ X, data = nir.train1_snv, ncomp = 10, validation = "LOO")
> mod1_detrend <- plsr(Y ~ X, data = nir.train1_detrend, ncomp = 10, validation = "LOO")
> mod1_snvdt <- plsr(Y ~ X, data = nir.train1_snvdt, ncomp = 10, validation = "LOO")
> mod1_msc <- plsr(Y ~ X, data = nir.train1_msc, ncomp = 10, validation = "LOO")
We can plot the RMSEP values versus the number of components (or terms) to get
a better idea of the performance of the models. The black line is the model
without math treatment (raw spectra), the green line is with Detrend only, and the rest (SNV, SNV + DT and MSC) almost overlap.
To see this in more detail, we have to look at the numbers provided by the summary of the models. I have marked the smallest values in yellow.
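As a hedged sketch of how such a comparison can be produced, the code below overlays the cross-validation RMSEP curves of two models and looks up the smallest value of each. It uses the `gasoline` data set shipped with the pls package as a stand-in for the shootout data, and the names `gas_raw`, `gas_snv` and `cv_rmsep` are made up for the example.

```r
library(pls)
data(gasoline)   # stand-in data from the pls package, not the shootout data

# Same data-frame pattern as in the post: spectra as one matrix column.
snv     <- function(X) t(scale(t(X)))
nir     <- unclass(gasoline$NIR)
gas_raw <- data.frame(octane = gasoline$octane, NIR = I(nir))
gas_snv <- data.frame(octane = gasoline$octane, NIR = I(snv(nir)))

m_raw <- plsr(octane ~ NIR, data = gas_raw, ncomp = 10, validation = "LOO")
m_snv <- plsr(octane ~ NIR, data = gas_snv, ncomp = 10, validation = "LOO")

# CV RMSEP for 0 (intercept only) to 10 components, as a plain vector.
cv_rmsep <- function(mod) drop(RMSEP(mod, estimate = "CV")$val)
curves   <- cbind(raw = cv_rmsep(m_raw), snv = cv_rmsep(m_snv))

# Overlay the curves in one plot.
matplot(0:10, curves, type = "l", lty = 1, col = c("black", "red"),
        xlab = "number of components", ylab = "RMSEP (LOO CV)")
legend("topright", legend = colnames(curves), lty = 1, col = c("black", "red"))

# Smallest CV RMSEP per model and the number of components where it occurs
# (row 1 is the intercept-only model, so it is dropped first).
best_ncomp <- apply(curves[-1, ], 2, which.min)
best_rmsep <- apply(curves[-1, ], 2, min)
print(rbind(best_ncomp, best_rmsep))
```

The same numbers are also printed by `summary(m_raw)` and `summary(m_snv)`; extracting them into one matrix simply makes the models easier to compare side by side.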
But the question is: do we have to choose the number of components that gives
the smallest RMSEP?
We will answer this question soon.
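While that answer is pending, here is one heuristic that the pls package itself offers, shown only as an illustration and not as the rule this series will settle on. `selectNcomp()` with `method = "onesigma"` suggests the smallest number of components whose cross-validation RMSEP lies within one standard error of the absolute minimum. The `gasoline` data are again a stand-in, and `n_min` and `n_1se` are names made up for the example.

```r
library(pls)
data(gasoline)   # stand-in data from the pls package, not the shootout data

mod <- plsr(octane ~ NIR, data = gasoline, ncomp = 10, validation = "LOO")

# Number of components with the smallest CV RMSEP
# (the intercept-only model is dropped before searching).
cv    <- drop(RMSEP(mod, estimate = "CV")$val)[-1]
n_min <- unname(which.min(cv))

# "One sigma" suggestion: the most parsimonious model that is not
# clearly worse than the absolute minimum.
n_1se <- selectNcomp(mod, method = "onesigma", plot = FALSE)

cat("minimum CV RMSEP at", n_min, "components;",
    "one-sigma rule suggests", n_1se, "\n")
```

When the two numbers differ, the smaller model gives up a little cross-validation error in exchange for fewer terms, which is often the safer choice against overfitting.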
In a future post we will do the same with derivatives combined with scatter corrections to see whether we get better RMSEP values, and we will check them with an external validation (don't forget that the RMSEP values shown here are from cross-validation).
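To make the difference between the two kinds of RMSEP concrete, here is a small sketch of an external validation next to the cross-validation figure. It is an assumption-laden example: the `gasoline` data from the pls package replace the shootout sets, the 50/10 split is arbitrary, and `k = 3` is a hypothetical choice of components.

```r
library(pls)
data(gasoline)   # stand-in data from the pls package, not the shootout data

# Arbitrary split: first 50 samples for training, last 10 as an independent test set.
train <- gasoline[1:50, ]
test  <- gasoline[51:60, ]

mod <- plsr(octane ~ NIR, data = train, ncomp = 10, validation = "LOO")

k <- 3   # hypothetical number of components

# Cross-validation RMSEP, computed on the training samples only
# (index k + 1 because the first entry is the intercept-only model).
rmsep_cv <- RMSEP(mod, estimate = "CV")$val[1, 1, k + 1]

# External validation: predict the test samples and compare with the reference values.
pred       <- drop(predict(mod, ncomp = k, newdata = test))
rmsep_test <- sqrt(mean((test$octane - pred)^2))

cat("RMSEP (LOO CV):  ", rmsep_cv, "\n")
cat("RMSEP (test set):", rmsep_test, "\n")
```

Any math treatment applied to the training spectra has to be applied to the test spectra in the same way before predicting.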