3 May 2021

Working with Soilspec data (part 5)

Before proceeding with the principal component analysis or the regressions, we have to know our data as well as we can, which is why we continue organizing and looking at the data. One important check is the correlation between the constituents (Clay, Sand, Silt and Total Carbon in this case), and for that purpose we use the "cor" function:

# Correlation matrix of the constituents
cor <- cor(parameters)

Now we use the "corrplot" package to see the correlation between the constituents graphically:
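
As a minimal sketch of how such a plot can be produced, the block below builds a small simulated data frame (the values and the column name `total_carbon` are assumptions for illustration, not the Soilspec data), computes the correlation matrix and draws it with `corrplot()`. The name `cor_mat` is also only illustrative, and the plot options are one possible choice among many.

```r
# Simulated texture data, for illustration only (not the Soilspec values)
set.seed(1)
n <- 50
clay <- runif(n, 5, 60)
silt <- runif(n, 5, 30)
sand <- 100 - clay - silt            # texture fractions add up to 100 %
total_carbon <- runif(n, 0.1, 5)     # hypothetical column name
parameters <- data.frame(clay, sand, silt, total_carbon)

# Pearson correlation matrix of the constituents
cor_mat <- cor(parameters)

# Draw the matrix only if the corrplot package is installed
if (requireNamespace("corrplot", quietly = TRUE)) {
  corrplot::corrplot(cor_mat, method = "circle", type = "upper",
                     addCoef.col = "black", tl.col = "black")
}
```

With the real data we would pass our own correlation matrix instead of `cor_mat`.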



As we can see, there is a high inverse correlation between sand and clay. Silt and total carbon have the lowest correlations with all the rest. It is important to take this into account during the calibration development.

We can sort the spectra in descending order to look at the spectra with the highest values for sand, clay, silt or total carbon. That way, we can find patterns that could help us to interpret the loadings, coefficients, and other things during the calibration process (this is not easy due to the math treatments or interactions). Here we do it for sand and clay, and we look at the "top 5":

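#### SORT THE SPECTRA BY CONSTITUENTS
library(dplyr)  # provides %>%, arrange() and desc()
## By Sand
sand_sort <- datsoilspc %>%
  arrange(desc(sand))
# Plot the 5 spectra with the highest sand content
matplot(wavelengths, t(sand_sort$spc[1:5, ]), type = "l",
        xlab = "wavelength", ylab = "Reflectance")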

 


 
## By Clay
clay_sort <- datsoilspc %>%
  arrange(desc(clay))
# Plot the 5 spectra with the highest clay content
matplot(wavelengths, t(clay_sort$spc[1:5, ]), type = "l",
        xlab = "wavelength", ylab = "Reflectance")


 

As we can see, the samples with a high concentration of clay show much sharper peaks than the samples with a high concentration of sand.
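
To make this kind of comparison easier, the two groups can be drawn side by side with the same y-axis range. The block below is only a sketch under some assumptions: `toy` is a simulated stand-in for `datsoilspc` (random values, so its plots show no real pattern), `top_spectra()` is a hypothetical helper, and it uses base R `order()` instead of dplyr so it runs without extra packages.

```r
# Simulated stand-in for datsoilspc: constituents plus a matrix column `spc`
set.seed(42)
wavelengths <- seq(350, 2500, by = 50)
n <- 20
toy <- data.frame(sand = runif(n, 10, 90), clay = runif(n, 5, 60))
toy$spc <- matrix(runif(n * length(wavelengths), 0.1, 0.6), nrow = n)

# Hypothetical helper: rows with the n highest values of one constituent
top_spectra <- function(data, constituent, n = 5) {
  idx <- order(data[[constituent]], decreasing = TRUE)
  data[head(idx, n), ]
}

top_sand <- top_spectra(toy, "sand")
top_clay <- top_spectra(toy, "clay")

# Two panels with a shared y-axis range, so the shapes can be compared
op <- par(mfrow = c(1, 2))
ylim <- range(toy$spc)
matplot(wavelengths, t(top_sand$spc), type = "l", ylim = ylim,
        xlab = "wavelength", ylab = "Reflectance", main = "Top 5 sand")
matplot(wavelengths, t(top_clay$spc), type = "l", ylim = ylim,
        xlab = "wavelength", ylab = "Reflectance", main = "Top 5 clay")
par(op)  # restore the previous plot layout
```

The same helper could be called with any other constituent column, for example silt or total carbon, if it exists in the data frame.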

All these strategies can help us to better understand the data set.

 

