14 Mar 2012

NIT: Fatty acids study in R - Part 007

Once we have chosen the model, we can continue acquiring spectra of new samples. The spectra are exported to a txt or csv file and imported into R to be processed.
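As a rough sketch of the import step: the file layout below (one row per sample, one column per wavelength) and the names "w850", "new_samples" and "NIT" are assumptions for illustration, not the real export format of the instrument.

```r
# Hypothetical export: first column is the sample ID, the rest are absorbances.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "sample,w850,w852,w854",
  "220,0.512,0.518,0.530",
  "221,0.498,0.505,0.517"
), tmp)

raw <- read.csv(tmp, row.names = 1)

# The pls functions expect the spectra as one matrix-valued column
# inside a data frame, so we build the data frame around the matrix.
new_samples <- data.frame(row.names = rownames(raw))
new_samples$NIT <- as.matrix(raw)

dim(new_samples$NIT)
```

Keeping the spectra as a single matrix column means the same formula used for the calibration can be reused for the new samples.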
We use the function “predict” from the “pls” package. I have done this with 20 new samples. First we need to apply the appropriate math treatment to them (the same one used in the model). I call this prediction sample set “fatty2ac_val”, after applying the “msc” math treatment.
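To make "the same math treatment" concrete, here is a hand-written sketch of multiplicative scatter correction. The helper "msc_apply" and the simulated spectra are hypothetical. In practice the msc() function of the pls package does this job, and this version is only meant to show the arithmetic. The point is that the new samples are corrected against the reference spectrum of the calibration set, not against their own mean.

```r
# Illustrative MSC: regress each spectrum on the reference, then remove
# the fitted offset (a) and slope (b):  corrected = (x - a) / b
msc_apply <- function(spectra, reference) {
  corrected <- t(apply(spectra, 1, function(x) {
    fit <- lm.fit(cbind(1, reference), x)
    (x - fit$coefficients[1]) / fit$coefficients[2]
  }))
  dimnames(corrected) <- dimnames(spectra)
  corrected
}

# Simulated spectra: one base curve with a random offset and slope per sample.
set.seed(1)
base_spectrum <- sin(seq(0, 3, length.out = 50)) + 2
simulate <- function(n) {
  t(sapply(seq_len(n), function(i) {
    runif(1, -0.1, 0.1) + runif(1, 0.9, 1.1) * base_spectrum
  }))
}
calib_spectra <- simulate(10)
new_spectra   <- simulate(3)

reference <- colMeans(calib_spectra)               # from the calibration set
calib_msc <- msc_apply(calib_spectra, reference)
new_msc   <- msc_apply(new_spectra, reference)     # same reference reused
```

Because the simulated samples differ only by offset and slope, the corrected spectra collapse onto the reference, which is what the treatment is meant to do with scatter effects.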
So, let's see the predictions:
> predict(C16_0,ncomp=12,newdata=fatty2ac_val)
, , 12 comps
    C16_0
220   22.01807
221   20.44803
222   19.79991
223   21.64232
224   20.29058
225   20.20099
226   21.52053
227   19.83305
228   18.95492
229   21.39239
230   21.11044
231   20.67454
232   19.28662
233   20.97292
234   21.70614
235   20.27464
236   19.70897
237   21.30686
238   20.21069
239   19.21576
I have used the model with 12 terms.
If this data set contains the lab values as well as the spectra, we can also validate the model and check the number of terms to use.
> RMSEP(C16_0,newdata=fatty2ac_val)
(Intercept)      1 comps      2 comps      3 comps      4 comps      5 comps 
     1.6280       1.5987       1.2554       1.1071       1.3447       0.9122 
    6 comps      7 comps      8 comps      9 comps     10 comps     11 comps 
     0.8754       0.8413       0.6597       0.6154       0.5669       0.5791 
   12 comps     13 comps     14 comps     15 comps     16 comps 
     0.5935       0.6261       0.5994       0.6315       0.5787

We can see our RMSEP and compare it with the RMSEP obtained in the PLSR “LOO validation statistics”, which was 0.5733. We can also see that we would get an even lower validation error with fewer components (10).
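The same workflow can be tried on the “gasoline” data bundled with the pls package (not the fatty acid data of this post). This is only a sketch: the 50/10 split and the choice of 10 components are arbitrary assumptions made for illustration.

```r
library(pls)

data(gasoline)                       # example data shipped with pls
calib <- gasoline[1:50, ]
valid <- gasoline[51:60, ]

model <- plsr(octane ~ NIR, ncomp = 10, data = calib, validation = "LOO")

# predict() returns a 3-D array: samples x responses x model sizes.
pred <- predict(model, ncomp = 3, newdata = valid)
dim(pred)

# With newdata supplied, RMSEP() reports the test-set error for every
# model size, starting with the intercept-only model.
rmsep_val <- RMSEP(model, newdata = valid)
errors <- drop(rmsep_val$val)
best <- names(errors)[which.min(errors)]
best
```

Picking the minimum on a small validation set is a guide rather than a rule. When two model sizes give nearly the same error, the smaller model is usually the safer choice.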
It seems that the model is working almost as expected, but let's have a look at the plots:
> predplot(C16_0,ncomp=7:15,newdata=fatty2ac_val,
+ asp=1,line=TRUE)
I can see a bias of almost 0.50.
The error corrected for the bias would be 0.34.
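For reference, this is how bias, RMSEP and the bias-corrected error (SEP) relate to each other. The lab values and predictions below are invented for illustration. The last lines assume that the 0.34 above is an SEP computed with n - 1 in the denominator and that it refers to the 12-component model (RMSEP 0.5935, 20 samples), which the post does not state explicitly.

```r
# Hypothetical lab values and predictions, NOT the data of this post.
lab  <- c(21.4, 20.1, 19.2, 21.0, 19.9, 19.8)
pred <- c(22.0, 20.4, 19.8, 21.6, 20.3, 20.2)

res   <- pred - lab
n     <- length(res)
bias  <- mean(res)                              # systematic error
rmsep <- sqrt(mean(res^2))                      # total error
sep   <- sqrt(sum((res - bias)^2) / (n - 1))    # error after removing the bias

# The three are linked by: RMSEP^2 = bias^2 + (n - 1) / n * SEP^2
all.equal(rmsep^2, bias^2 + (n - 1) / n * sep^2)

# Consistency check with the figures quoted in the post.
implied_bias <- sqrt(0.5935^2 - (20 - 1) / 20 * 0.34^2)
implied_bias   # roughly 0.49, in line with "almost 0.50"
```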
Bias can be due to different reasons (temperature, sample presentation, particle size, optical path, state of the instrument, ...).
These samples are fat ground into a paste, placed in a Petri dish, and measured in an instrument in transmittance.
The next step would be to add this data to the database and recalibrate.
Divide the data into a calibration set and a validation set. Be sure that the validation set includes some samples from this last set.
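One possible way to make such a split is sketched below. The sample numbering (1 to 219 for the existing database, 220 to 239 for the new batch) and the proportions are assumptions made for illustration only.

```r
set.seed(42)

# Hypothetical IDs: 1-219 existing database, 220-239 the new batch.
ids    <- 1:239
is_new <- ids >= 220

# Validation set: 5 of the new samples plus a random 20% of the old ones.
val_ids <- c(
  sample(ids[is_new], 5),
  sample(ids[!is_new], round(0.2 * sum(!is_new)))
)
cal_ids <- setdiff(ids, val_ids)

length(cal_ids)
length(val_ids)
```

Sampling the two groups separately guarantees that both the calibration and the validation set contain samples from the new batch, which a single random draw over all IDs would not.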
The idea behind this series of posts on this data set was to work with data I use almost daily in my job and to see how to proceed in R. Once you get used to it, the results are very good for understanding chemometrics for multivariate data. I will continue exporting some of my data sets to work in R. The analysis of fatty acids by NIR/NIT can give good results for some of them (C16:0, C18:0, C18:1, ...).

