14 oct 2021

Modelling complex spectral data (soil) with the resemble package (IV)

Continuing with the vignette “Modelling complex spectral data with the resemble package” (Leonardo Ramirez-Lopez and Alexandre M.J.-C. Wadoux).

Now we will use PCA with the “opc” method to find the optimal number of components. Its rationale is that if two spectra are close in the X space (near neighbors), their constituent values should be close as well. The optimal number of components is therefore the one that minimizes the RMSD (root mean square difference) between the constituent value of each sample and that of its nearest neighbor.

For more details, see the paper by the developers of this algorithm: L. Ramirez-Lopez, Behrens, Schmidt, Stevens, et al. (2013).
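To make this rationale concrete, here is a minimal base-R sketch on synthetic data. It is not the resemble implementation: the simulated spectra, the use of `prcomp`, the Euclidean distance on the scores and the helper name `rmsd_for_k` are all assumptions made for illustration only.

```r
# Minimal sketch of the "opc" idea on synthetic data (illustration only,
# not the resemble code; Euclidean distance on PCA scores is an assumption).
set.seed(1)
n <- 120
p <- 50

# Three hypothetical latent factors drive both the "spectra" and the response
latent   <- matrix(rnorm(n * 3), n, 3)
loadings <- matrix(rnorm(3 * p), 3, p)
X <- latent %*% loadings + matrix(rnorm(n * p, sd = 0.3), n, p)
y <- as.vector(latent %*% c(2, -1, 0.5)) + rnorm(n, sd = 0.1)

# PCA scores of the synthetic spectra
scores <- prcomp(X, center = TRUE, scale. = FALSE)$x

# RMSD between each sample's y and the y of its nearest neighbor,
# where neighbors are searched using only the first k components
rmsd_for_k <- function(k, scores, y) {
  d <- as.matrix(dist(scores[, seq_len(k), drop = FALSE]))
  diag(d) <- Inf                 # a sample cannot be its own neighbor
  nn <- apply(d, 1, which.min)   # index of the closest other sample
  sqrt(mean((y - y[nn])^2))
}

max_k  <- 15
rmsd   <- vapply(seq_len(max_k), rmsd_for_k, numeric(1), scores = scores, y = y)
best_k <- which.min(rmsd)        # number of components with the lowest RMSD
```

Here `best_k` plays the role of the number of components that “opc” would retain, and `rmsd` holds the curve that is minimized.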

optimal_sel <- list(method = "opc", value = 40)

pca_tr_opc <- ortho_projection(Xr = training$spc_p,
                               Yr = training$Ciso,
                               method = "pca",
                               pc_selection = optimal_sel)

pca_tr_opc     # to obtain details of the PCA calculations.

 

We specified a maximum of 40 components, and the “opc” method estimates that 11 is the best option. Plotting the result shows the reason graphically.

The vignette includes an interesting piece of code: if you run it, you get an XY plot of the reference Ciso value of every spectrum against the reference Ciso value of its closest neighbor. The two are highly correlated, which is really the idea behind the “opc” method.
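The following is a rough base-R sketch of that kind of comparison, not the vignette's own code. The helper `nearest_neighbor_pairs` is hypothetical, the data are synthetic stand-ins for the projected spectra and the Ciso values, and the Euclidean distance is an assumption.

```r
# Hypothetical helper: pair each sample's response with the response of its
# nearest neighbor in the score space (Euclidean distance is an assumption).
nearest_neighbor_pairs <- function(scores, y) {
  d <- as.matrix(dist(scores))
  diag(d) <- Inf                 # exclude each sample from its own search
  nn <- apply(d, 1, which.min)   # index of the closest other sample
  data.frame(y = y, y_neighbor = y[nn])
}

# Synthetic stand-ins for the PCA scores and the reference values
set.seed(42)
scores <- matrix(rnorm(200 * 4), ncol = 4)
y <- as.vector(scores %*% c(1.5, -1, 0.5, 0.2)) + rnorm(200, sd = 0.1)

nn_pairs <- nearest_neighbor_pairs(scores, y)
r <- cor(nn_pairs$y, nn_pairs$y_neighbor)

# Draw the XY plot only in an interactive session
if (interactive()) {
  plot(nn_pairs$y_neighbor, nn_pairs$y,
       xlab = "Value of the nearest neighbor", ylab = "Reference value")
  abline(0, 1)
}
```

With the objects from this post, the same helper could presumably be called as `nearest_neighbor_pairs(pca_tr_opc$scores, training$Ciso)`, assuming there are no missing Ciso values.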

I find this option very interesting, so we will continue exploring the vignette, which will surely help with the purpose stated in its title.


