I continue with the exercise on the Tecator data from Chapter 6, "Linear Regression and Its Cousins", of the book Applied Predictive Modeling.
In this exercise we have to develop different types of regression models and decide which one performs best.
For this exercise I apply math treatments to the spectra to remove the scatter, in particular SNV + DT (standard normal variate followed by detrend) with the package "prospectr".
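A minimal sketch of how this pretreatment could look in R, assuming the Tecator data are loaded from caret (which provides the matrices "absorp" and "endpoints"). The wavelength vector "wav" is my assumption (100 equally spaced channels between 850 and 1050 nm); the object names are only illustrative.

```r
library(caret)      # provides the tecator data: 'absorp' and 'endpoints'
library(prospectr)  # standardNormalVariate() and detrend()

data(tecator)

# Assumption: equally spaced channels from 850 to 1050 nm
wav <- seq(850, 1050, length.out = ncol(absorp))

# SNV alone: each spectrum (row) is centred and scaled by its own mean and sd
absorp_snv <- standardNormalVariate(X = absorp)

# SNV + DT: detrend() applies SNV and then removes a second-order
# polynomial trend fitted along the wavelength axis of each spectrum
absorp_snvdt <- detrend(X = absorp, wav = wav)

# Quick look at the treated spectra
matplot(wav, t(absorp_snvdt), type = "l", lty = 1,
        xlab = "Wavelength (nm)", ylab = "SNV + DT absorbance")
```

Both treatments work spectrum by spectrum, so applying them to the whole matrix before splitting or resampling does not pass information between samples.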
Then I use the "train" function from caret to develop two regressions (one with PCR and the other with PLS) for the protein constituent.
Now the best way to decide is a plot showing the RMSE for each number of components or terms:
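A sketch of how the two models and such a plot could be produced. The resampling scheme (10-fold cross-validation repeated 5 times), the grid of 1 to 20 components, the seed and the wavelength vector are my assumptions, not necessarily the settings behind the results discussed here. The "pcr" and "pls" methods in caret need the package "pls" installed.

```r
library(caret)
library(prospectr)

data(tecator)

# Assumption: equally spaced channels from 850 to 1050 nm
wav <- seq(850, 1050, length.out = ncol(absorp))

# SNV + DT pretreatment; train() requires column names on x
X <- detrend(X = absorp, wav = wav)
colnames(X) <- paste0("w", seq_len(ncol(X)))

# Columns of 'endpoints' are moisture, fat and protein
protein <- endpoints[, 3]

# Assumed settings: repeated 10-fold CV and 1 to 20 components
ctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 5)
grid <- expand.grid(ncomp = 1:20)

# Same seed before each call so both models see the same resamples
set.seed(1)
pcr_fit <- train(x = X, y = protein, method = "pcr",
                 tuneGrid = grid, trControl = ctrl)
set.seed(1)
pls_fit <- train(x = X, y = protein, method = "pls",
                 tuneGrid = grid, trControl = ctrl)

# Cross-validated RMSE against the number of components for both models
rmse_range <- range(c(pcr_fit$results$RMSE, pls_fit$results$RMSE))
plot(pcr_fit$results$ncomp, pcr_fit$results$RMSE, type = "b", pch = 1,
     ylim = rmse_range,
     xlab = "Number of components", ylab = "RMSE (cross-validation)")
lines(pls_fit$results$ncomp, pls_fit$results$RMSE, type = "b", pch = 16)
legend("topright", legend = c("PCR", "PLS"), pch = c(1, 16))
```

The number of components selected by caret for each model is stored in pcr_fit$bestTune and pls_fit$bestTune, which can be compared with the choice made by eye from the plot.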
Which one do you think performs better?
How many terms would you choose?
I will compare these types of regression with others in coming posts on the Tecator data.