This is post number 8 on the vignette "Modelling complex spectral data with the resemble package" (Leonardo Ramirez-Lopez and Alexandre M.J.-C. Wadoux), where we try to understand how the resemble package works and use its functions to analyze complex products such as soil.
We have seen how to calculate the dissimilarity matrix for a sample set in the orthogonal space using the Mahalanobis distance, but there are other calculation methods for dissimilarities.
The simplest is probably the correlation method, where we calculate the correlation between each sample (spectrum) of a set and all the other spectra in it, for example within a training set. We can also calculate the correlation of every sample of the test set against the samples of the training set, or even of a new unknown spectrum against all the spectra of the training set. This way we can define a threshold and select the samples above a certain correlation value to do something special with them (for example, build a quantitative model).
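To make the idea concrete, here is a minimal base R sketch that does not use resemble at all. The matrices `X_train` and `X_new`, the simulated values and the `threshold` are all hypothetical, invented only for illustration. The conversion `(1 - r) / 2` is one common way to turn a correlation into a dissimilarity between 0 and 1; it is an assumption here, so check the package documentation for the exact definition it uses.

```r
# Simulated spectra (hypothetical data): rows are samples, columns are wavelengths
set.seed(1)
X_train <- matrix(rnorm(5 * 20), nrow = 5)  # 5 training spectra
X_new   <- matrix(rnorm(2 * 20), nrow = 2)  # 2 new spectra

# cor() correlates columns, so we transpose to correlate spectra (rows).
# The result has one row per new spectrum and one column per training spectrum.
r <- cor(t(X_new), t(X_train))

# Assumed conversion of correlation to a dissimilarity in [0, 1]
d <- (1 - r) / 2

# Hypothetical threshold: training samples correlated above it
# with the first new spectrum
threshold <- 0.2
selected <- which(r[1, ] > threshold)
```

With real spectra the correlations are usually very high, so the threshold would sit much closer to 1 than in this simulated case.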
The vignette shows the code to calculate the dissimilarity matrix for the training set:
cd_tr <- dissimilarity(Xr = training$spc_p, diss_method = "cor")
dim(cd_tr$dissimilarity)
cd_tr$dissimilarity
cd_mw <- dissimilarity(Xr = training$spc_p,