6 Sept. 2012

Unscrambler (Jam Exercise) - 003

In the Jam exercise we have 3 groups of variables:
Preference: 114 representative consumers tasted the 12 jam samples and scored them on a scale from 1 to 9. The data for this variable is the mean value for each sample. This is the profiling of jam quality.
Sensory: Trained sensory panelists judged the 12 jam samples, scoring each one on 12 variables.
Instrumental: Measurements of 6 chemical and instrumental variables. This is the cheapest method.
In the post “Unscrambler (Jam Exercise) - 001” we developed a PCR using Sensory as the “X” matrix and Preference as the “Y” (constituent) matrix.
In the post “Unscrambler (Jam Exercise) - 002” we developed a PLS1 using Sensory as the “X” matrix and Preference as the “Y” (constituent) matrix.
Another alternative would be to use Instrumental as “X” and Preference as “Y”.
Now we are going to develop a PLS2 regression using Instrumental as the “X” matrix and Sensory as the “Y” matrix.
PLS2 allows several variables in the “Y” matrix at the same time.
Which of the Y variables (expensive sensory method) can be determined from X (cheapest instrumental method)?
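For readers without Unscrambler, here is a minimal sketch of what a PLS2 calibration does, written with NumPy only. It is not the Unscrambler implementation: the function names `pls2_fit` and `pls2_predict` are hypothetical, the data are random numbers with the same shape as the Jam exercise (12 samples, 6 X variables, 12 Y variables), and autoscaling of both matrices is an assumption. The weight vector of each component is taken as the dominant singular vector of X'Y, which is one common way to write the PLS2 step.

```python
import numpy as np

def pls2_fit(X, Y, n_components):
    """Fit a PLS2 model (several Y columns at once) on autoscaled data."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    x_std, y_std = X.std(axis=0, ddof=1), Y.std(axis=0, ddof=1)
    E = (X - x_mean) / x_std          # X residuals, start as autoscaled X
    F = (Y - y_mean) / y_std          # Y residuals, start as autoscaled Y
    W, P, Q = [], [], []
    for _ in range(n_components):
        # Weight vector: direction in X with maximum covariance with all of Y
        w = np.linalg.svd(E.T @ F, full_matrices=False)[0][:, 0]
        t = E @ w                     # scores of this component
        tt = t @ t
        p = E.T @ t / tt              # X loadings
        q = F.T @ t / tt              # Y loadings
        E = E - np.outer(t, p)        # deflate X
        F = F - np.outer(t, q)        # deflate Y
        W.append(w)
        P.append(p)
        Q.append(q)
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q).T
    # Regression coefficients in autoscaled units: B = W (P'W)^-1 Q'
    B = W @ np.linalg.solve(P.T @ W, Q.T)
    return {"x_mean": x_mean, "x_std": x_std,
            "y_mean": y_mean, "y_std": y_std, "B": B}

def pls2_predict(model, X):
    """Predict every Y column from X with a fitted PLS2 model."""
    Z = (X - model["x_mean"]) / model["x_std"]
    return Z @ model["B"] * model["y_std"] + model["y_mean"]

# Synthetic stand-in for the Jam data (NOT the real measurements)
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 6))                                  # "Instrumental"
Y = X @ rng.normal(size=(6, 12)) + 0.3 * rng.normal(size=(12, 12))  # "Sensory"

model = pls2_fit(X, Y, n_components=2)
Y_hat = pls2_predict(model, X)
print(model["B"].shape, Y_hat.shape)
```

A single coefficient matrix with one column per sensory variable comes out of one calibration, which is the practical difference from running a separate PLS1 for each Y variable.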
When developing the PLS2 regression we obtain this overview plot:

In the upper left plot we see that the first component, PC1, explains most of the variability due to harvest time.
The lower left plot gives us the explained variance for every Y variable. We don't want to use too many components, to avoid overfitting, so if we look at this plot carefully:
We see that we mainly explain (at PC2) Sweetness, Redness, Colour and Thickness.
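The explained variance per Y variable can be reproduced outside Unscrambler with a short sketch like the one below. The function name `explained_y_variance` is hypothetical, the data are synthetic (the first 4 Y columns follow a single latent factor shared with X, the other 8 are pure noise), and the figures are calibration values on autoscaled data, not the cross-validated values a validation plot would show.

```python
import numpy as np

def explained_y_variance(X, Y, n_components):
    """Cumulative calibration explained variance (%) for each Y column.

    Returns an array of shape (n_components, n_y_variables).
    """
    E = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # autoscaled X
    F = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)   # autoscaled Y
    total = (F ** 2).sum(axis=0)                       # total sum of squares per Y column
    rows = []
    for _ in range(n_components):
        # PLS2 weight vector: dominant left singular vector of E'F
        w = np.linalg.svd(E.T @ F, full_matrices=False)[0][:, 0]
        t = E @ w
        tt = t @ t
        E = E - np.outer(t, E.T @ t) / tt              # deflate X
        F = F - np.outer(t, F.T @ t) / tt              # deflate Y
        rows.append(100.0 * (1.0 - (F ** 2).sum(axis=0) / total))
    return np.array(rows)

# Synthetic example (NOT the Jam data): one latent factor drives X and Y[:, :4]
rng = np.random.default_rng(0)
latent = rng.normal(size=12)
X = np.outer(latent, rng.uniform(0.5, 1.5, size=6)) + 0.05 * rng.normal(size=(12, 6))
Y = rng.normal(size=(12, 12))
Y[:, :4] = np.outer(latent, rng.uniform(0.5, 1.5, size=4)) + 0.05 * rng.normal(size=(12, 4))

ev = explained_y_variance(X, Y, n_components=3)
for j in range(Y.shape[1]):
    print(f"Y{j + 1}: " + "  ".join(f"{v:5.1f}%" for v in ev[:, j]))
```

Y columns whose curve stays low as components are added are the ones X cannot determine. Calibration variance can only go up with more components, so the number of components should be chosen from validation results.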
The reason for this is seen in the loading plot.
We should add more variables to X, from other instruments or chemical analyses, in order to see if we can explain some of the other Y variables.
