12 sept 2022

NIT Spectroscopic Tutorial with Caret (part 3)

To follow the tutorial, you can first see:
We used two pretreatments in the previous posts to develop the Principal Component Analysis, but we can add another one to remove the skewness of the predictor variables. We could check the skewness by plotting a histogram of the absorbances at every wavelength, but that would mean looking at 100 histograms, so a better way to check it is a boxplot of the spectra (centred and scaled):

boxplot(train_scaled, main = "preProcess(Center + Scale)")

In the plot we can see that the distributions are skewed to the right, and that there are some outliers.
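As a numeric complement to the boxplot, the skewness can also be computed per wavelength. The following is only a minimal sketch in base R: the helper `col_skewness` and the simulated matrix `fake_absorp` are hypothetical names used for illustration, standing in for the real training spectra.

```r
# Moment-based sample skewness (m3 / m2^1.5) for every column of a matrix.
# Positive values indicate a tail to the right.
col_skewness <- function(x) {
  apply(x, 2, function(v) {
    d <- v - mean(v)
    mean(d^3) / mean(d^2)^1.5
  })
}

# Simulated stand-in for the training spectra (assumption, not real data):
# 200 samples x 5 "wavelengths", log-normal so the columns are right-skewed.
set.seed(1)
fake_absorp <- matrix(rlnorm(200 * 5), nrow = 200, ncol = 5)

skew <- col_skewness(fake_absorp)
print(round(skew, 2))
```

With the real data, the same function could be applied to the training matrix to see which wavelengths are the most skewed.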

Now we can apply the BoxCox algorithm together with Center and Scale and look at the boxplot of the spectra:

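library(caret)  # provides preProcess()
# preProcess() only estimates the parameters (pre_bc is an illustrative name)
pre_bc <- preProcess(absorp_train,
                     method = c("BoxCox", "center", "scale"))
# predict() applies the estimated transformations to the spectra
train_scaled_2 <- predict(pre_bc, absorp_train)

boxplot(train_scaled_2, 
        main = "preProcess(BoxCox + Center + Scale)")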

The result shows that the skewness is removed. Of course, the PCA calculation will give different score values, but the first two components will still explain almost all the variance.
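To see why this transformation reduces the skewness, here is a small sketch of the Box-Cox formula itself: (x^lambda - 1) / lambda when lambda is not zero, and log(x) when lambda is zero. In caret, lambda is estimated for each predictor from the training data; in this sketch lambda is simply fixed by hand, and `box_cox` and `skewness` are hypothetical helper names.

```r
# Box-Cox transform for strictly positive values and a given lambda.
box_cox <- function(x, lambda) {
  stopifnot(all(x > 0))
  if (abs(lambda) < 1e-8) log(x) else (x^lambda - 1) / lambda
}

# Moment-based sample skewness, used only to compare before and after.
skewness <- function(v) {
  d <- v - mean(v)
  mean(d^3) / mean(d^2)^1.5
}

# Simulated right-skewed values (assumption, not real absorbances).
set.seed(1)
x <- rlnorm(200)

skew_before <- skewness(x)
skew_after  <- skewness(box_cox(x, lambda = 0))  # lambda = 0 is the log transform
print(c(before = skew_before, after = skew_after))
```

For log-normal data the log transform (lambda = 0) is the natural choice, so the skewness after the transformation should be much closer to zero than before.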

Looking at the spectra (treated either with “Center” and “Scale”, or with “BoxCox”, “Center” and “Scale”), we see that there is a baseline shift. It would be convenient for a pretreatment to remove it, so that we can see clearly the variability due to the fat, protein or moisture content.
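As an illustration of what removing a baseline shift means, the sketch below subtracts from each spectrum its own mean, which cancels a constant additive offset. This is only one simple option and not necessarily the pretreatment used later in this series; `remove_offset` and the simulated spectra are hypothetical.

```r
# Remove an additive baseline offset by subtracting each spectrum's own mean.
# Rows are samples (spectra), columns are wavelengths.
remove_offset <- function(spectra) {
  sweep(spectra, 1, rowMeans(spectra), "-")
}

# Simulated example (assumption): the same spectral shape with three offsets.
shape   <- sin(seq(0, pi, length.out = 10))
offsets <- c(0, 0.5, 1.2)
spectra <- t(sapply(offsets, function(o) shape + o))  # 3 spectra x 10 wavelengths

corrected <- remove_offset(spectra)

# After the correction the three spectra should overlap.
matplot(t(corrected), type = "l", main = "Offset removed (simulated)")
```

Note that this row-wise operation acts on each spectrum, while “Center” in preProcess works column-wise, on each wavelength across samples.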
