16 Feb 2020

Tidyverse and Chemometrics (Part 13): Monitoring Fish2 predictions

Now that we have the 17 selected samples with their lab values, it is time to run their spectra through the model, compare the predicted values with the reference values, and obtain a series of statistics that will tell us whether the model is performing adequately.

In the histograms we saw that the model will have to extrapolate to cover the extended range of some of the selected Fish2 samples. We know that 40 samples is not a large number, but it can serve as a starter calibration that we will improve as we add more samples through a structured procedure: selecting the samples that add variability to the model, so that we reduce the number of samples we send to the laboratory.

To predict the samples, their spectra must first be converted with the same math treatment used for the samples in the model (the 40 samples of Fish1). After that, for the PLS models, we use the function "predict" to get the results for the four constituents: "Dry Matter", "Protein", "Fat" and "Ash".
 
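# Predict each constituent for the selected Fish2 samples with the Fish1 PLS models.
# ncomp is the number of PLS terms used for each model;
# newdata is the matrix of math-treated spectra.
fish2_pls_prot_pred <- predict(fish1_prot_plsr,
                               ncomp = 8,
                               newdata = fish2_sel_df$nir_1d)
fish2_pls_dm_pred <- predict(fish1_DM_plsr,
                             ncomp = 3,
                             newdata = fish2_sel_df$nir_1d)
fish2_pls_fat_pred <- predict(fish1_FAT_plsr,
                              ncomp = 4,
                              newdata = fish2_sel_df$nir_1d)
fish2_pls_ash_pred <- predict(fish1_ASH_plsr,
                              ncomp = 9,
                              newdata = fish2_sel_df$nir_1d)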
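It can help to look at the shape of what "predict" returns before we start binding columns. The sketch below does not use the fish data: it assumes the "yarn" example set that ships with the pls package, and the names yarn_plsr, yarn_pred and yarn_pred_vec are only illustrative. With a single ncomp value, the result is a three-way array (samples x responses x models), and drop() turns it into a plain vector.

```r
library(pls)

# Example NIR data shipped with pls (NOT the fish data of this post)
data(yarn)
yarn_train <- yarn[yarn$train, ]
yarn_test  <- yarn[!yarn$train, ]

# Fit a PLS model on the training samples
yarn_plsr <- plsr(density ~ NIR, ncomp = 6, data = yarn_train)

# Predict the test samples with a 4-term model
yarn_pred <- predict(yarn_plsr, ncomp = 4, newdata = yarn_test)

# Three-way array: samples x responses x models
dim(yarn_pred)

# Collapse the two length-1 dimensions to get a plain numeric vector
yarn_pred_vec <- drop(yarn_pred)
```

Since cbind treats such an array as a vector, the step that follows still gives one column per constituent.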
 
Now we create a table with the four columns of predictions:
 
fish2_sel_pred<-cbind(fish2_pls_dm_pred,
                fish2_pls_prot_pred,
                fish2_pls_fat_pred,
                fish2_pls_ash_pred)
 
Then we bind it to the data frame that contains the sample numbers and laboratory reference values for these samples:
 
fish2_selected_monitor<-cbind(fish2_selected,fish2_sel_pred)
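Note that cbind pairs the rows by position, so it only works if both tables keep the samples in the same order. If that is ever in doubt, a merge by sample identifier is safer. This is a minimal sketch with made-up values; the data frames lab and pred_tbl and the column names sample_id, protein_ref and protein_pred are hypothetical, not the ones in the fish tables.

```r
# Hypothetical reference table (note the different row order)
lab <- data.frame(sample_id   = c("F2_03", "F2_01", "F2_02"),
                  protein_ref = c(61.2, 58.4, 63.0))

# Hypothetical prediction table
pred_tbl <- data.frame(sample_id    = c("F2_01", "F2_02", "F2_03"),
                       protein_pred = c(58.9, 62.1, 60.7))

# merge() pairs the rows by the key column, not by position
monitor_tbl <- merge(lab, pred_tbl, by = "sample_id")
monitor_tbl
```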
 
Now we can use the Monitor package to study the statistics. The Monitor package is not publicly available at the moment. I have been developing it, and I am currently writing the documentation to see whether it can be accepted by R; if that is not possible, I will upload it to GitHub for anyone interested in it.
 
Apart from many statistics, the Monitor package creates a series of plots. Here I include the XY plots for the predictions of the 17 Fish2 samples with the Fish1 model. As expected, we have a bias, which indicates that the model is not robust yet, but the SEP (the error corrected for the bias) is quite good. The idea is that we now have 57 samples to develop a new calibration, and we will look for another validation set in the future to test it.
 
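library(Monitor)
# Each call takes three columns of the monitor table: presumably the
# sample number (1), the lab value (4 to 7) and the prediction (8 to 11).

# For Dry Matter
monitor_xyplot(fish2_selected_monitor[,c(1,4,8)])

# For Protein
monitor_xyplot(fish2_selected_monitor[,c(1,5,9)])

# For Fat
monitor_xyplot(fish2_selected_monitor[,c(1,6,10)])

# For Ash
monitor_xyplot(fish2_selected_monitor[,c(1,7,11)])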
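Since the Monitor package is not available yet, the three basic statistics mentioned above can be sketched in base R. The values below are made up for illustration (they are not the fish results), the sign convention of the residuals (predicted minus reference) is an assumption, and this is not necessarily how Monitor computes them.

```r
# Made-up reference and predicted values, for illustration only
ref  <- c(60.1, 62.4, 58.9, 64.0, 61.5)
pred <- c(61.0, 63.1, 60.2, 64.8, 62.9)

res <- pred - ref            # residuals (sign convention assumed)
n   <- length(res)

bias  <- mean(res)                            # systematic difference
rmsep <- sqrt(mean(res^2))                    # total prediction error
sep   <- sqrt(sum((res - bias)^2) / (n - 1))  # error corrected for the bias

c(bias = bias, rmsep = rmsep, sep = sep)
```

With these definitions RMSEP^2 = bias^2 + ((n - 1) / n) * SEP^2, so a good SEP together with a large RMSEP points to a bias problem rather than to random error.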
 
 
 
