R & Chemometrics: PCR vs. PLS (Part 11)

5 Mar 2018

PCR vs. PLS (Part 11) - Independent Validation

One of the interesting parts of developing a calibration for a particular application (meat meal in this case) is installing it for routine use and then waiting long enough to collect a validation set that is truly independent, so that we get a realistic estimate of the prediction error for the application.

If you have followed the previous posts, the meat meal calibration was developed on a DA1650. We used only the odd-numbered samples from the sample set for training and left the even-numbered samples for validation, so the calibration contains almost 650 of the 1300 available samples. The constituent we want to calibrate is ash, although the others (protein, fat and moisture) are of interest too.
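The odd/even split described above can be sketched in a few lines of base R. This is only an illustration: the data frame `samples` and its simulated `Ash` values are assumptions, not the real meat meal data.

```r
# Hypothetical stand-in for the sample set (values are simulated, not real data).
set.seed(1)
n <- 1300
samples <- data.frame(id = seq_len(n), Ash = runif(n, 5, 45))

# Odd-numbered rows go to the training set, even-numbered rows to validation.
odd_rows <- seq(1, n, by = 2)
even_rows <- seq(2, n, by = 2)
train_set <- samples[odd_rows, ]
val_set <- samples[even_rows, ]

nrow(train_set)  # 650
nrow(val_set)    # 650
```

Because both halves come from the same sample population, this kind of split tends to give an optimistic error estimate compared with samples collected later in routine.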

We are trying to develop a robust regression, so the number of terms matters and we don't want to remove too many outlier samples.

The range is quite wide because the samples come from different sources (instruments, species, lab reference values, ...).

In the previous post we saw the statistics, which might seem quite high, but let's look at the validation.

The validation set comes from one of the sources (one of the DA instruments, mixed meat meal, and one of the labs). The training samples from this source predate 2017, while the validation samples were acquired during 2017 and the first months of 2018, so this is an interesting set to check.
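A date-based selection like this one could look roughly as follows. The sample log, its column names (`source`, `acquired`) and the source labels are all hypothetical, used only to show the idea of keeping later samples from one source as the independent set.

```r
# Hypothetical sample log; column names, labels and dates are assumptions.
sample_log <- data.frame(
  id = 1:6,
  source = c("DA-A", "DA-A", "DA-B", "DA-A", "DA-A", "DA-B"),
  acquired = as.Date(c("2016-05-10", "2016-11-02", "2017-03-15",
                       "2017-06-21", "2018-01-30", "2018-02-12"))
)

cutoff <- as.Date("2017-01-01")
is_target_source <- sample_log$source == "DA-A"

# Samples acquired before the cutoff were available when the calibration was built.
calibration_candidates <- sample_log[sample_log$acquired < cutoff, ]

# Independent validation: one source only, acquired on or after the cutoff.
independent_val <- sample_log[is_target_source & sample_log$acquired >= cutoff, ]

independent_val$id  # 4 5
```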

As mentioned, we will use the calibration from the previous post, which uses 15 terms.

In grey we see the XY plot of the cross-validation for the training set, and in blue how the new validation samples fit the regression line.

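library(pls)  # provides the predict() and plot() methods for the fitted model

# Predict the independent validation set with the 15-term model from the previous post.
gm_mult_val_plspred <- as.matrix(predict(mm_da_odd_pls, ncomp = 15,
                                         newdata = gm_mult_val$NIR))
# Prediction error of the validation set; rms() is not defined in this post.
rms(gm_mult_val$Ash, gm_mult_val_plspred)

# Cross-validation predictions of the training set in grey (x = measured, y = predicted).
plot(mm_da_odd_pls, "prediction", ncomp = 15, col = "grey",
     xlim = c(0, 50), ylim = c(0, 50))
# Overlay the validation samples in blue, using the same axis orientation.
points(gm_mult_val$Ash, gm_mult_val_plspred, col = "blue")
legend("topleft",
       legend = c("Odd 15 terms", "VAL-2017-MIX",
                  "VAL-2017-MIX-RMSEP 15t=1.47"),
       col = c("grey", "blue", "blue"), pch = c(1, 1, 1), cex = 1)
abline(0, 1)  # 1:1 reference line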
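Since the `rms()` call above is not defined in this post, here is a minimal sketch of how the validation statistics could be computed by hand. The helper `validation_stats()` is hypothetical (it is not the function used above), and the numbers are made up for illustration.

```r
# Hypothetical helper; not the rms() function used in the post.
validation_stats <- function(reference, predicted) {
  res <- as.numeric(predicted) - as.numeric(reference)
  n <- length(res)
  bias <- mean(res)                            # mean difference
  rmsep <- sqrt(mean(res^2))                   # root mean square error of prediction
  sep <- sqrt(sum((res - bias)^2) / (n - 1))   # bias-corrected standard error
  c(n = n, bias = bias, RMSEP = rmsep, SEP = sep)
}

# Made-up reference and predicted values, for illustration only.
reference <- c(10, 20, 30, 40)
predicted <- c(11, 19, 32, 40)
validation_stats(reference, predicted)
```

Reporting bias and SEP next to RMSEP helps to tell a systematic offset apart from random scatter, which is useful when a validation set comes from a single instrument and lab.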

It is working well in routine, and I am quite happy with the result.
