I normally use data frames to manage NIR data. In my case they are usually composed of an X or spectra matrix (dataframe$X) and a Y or constituent matrix (dataframe$Y). But when we want to manage and understand plots, such as score plots, it is useful to classify the samples with some categorical variables.
These categorical variables can be: "location", "type", "customer", "product", ...
In the case of the shoot-out data, the samples can be classified by their content of the main parameter as:
"Low" (160 mg or less)
"Medium" (between 160 and 221 mg)
"High" (221 mg or more)
Let's create the variable in the data frame of the training set for instrument 1:
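# Y is assumed to be the vector with the main parameter (mg) of these samples
nir.training1$type <- NA_character_   # create the column first
nir.training1$type[Y <= 160] <- "Low"
nir.training1$type[Y > 160 & Y < 221] <- "Medium"
nir.training1$type[Y >= 221] <- "High"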
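The same classification can be sketched on a small toy data frame, so it can be run without the shoot-out data. The data frame "toy" and its values are made up for illustration; only the 160 and 221 mg limits come from the text above. Storing the result as a factor with ordered levels is my own choice here, it keeps "Low", "Medium", "High" in a sensible order in tables and plot legends.

```r
set.seed(1)

# Toy data frame with the same layout as the NIR ones: a Y matrix and an X matrix
# (values are invented; 4 "wavelengths" instead of a real spectrum)
toy <- data.frame(Y = I(matrix(c(150, 160, 190, 221, 240), ncol = 1)))
toy$X <- I(matrix(rnorm(5 * 4), nrow = 5))

# Main parameter as a plain numeric vector
y <- as.numeric(toy$Y[, 1])

# Same limits as above: <= 160 is "Low", >= 221 is "High", the rest "Medium"
toy$type <- factor(
  ifelse(y <= 160, "Low", ifelse(y < 221, "Medium", "High")),
  levels = c("Low", "Medium", "High")
)

# Number of samples in each class
table(toy$type)
```

With nested ifelse() every sample gets a class in a single step, so there is no need to create the column beforehand.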
Now we have a new variable in the data frame called "type".
Check it with, for example:
names(nir.training1)
and apart from X and Y we have type.
We proceed in the same way for the other data frames.
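To avoid repeating the three assignments for every data frame, the step could be wrapped in a small function. This is only a sketch: "add_type" is a hypothetical helper (it does not come from any package), and "toy.test" and "toy.valid" are made-up stand-ins for the other data frames.

```r
# Hypothetical helper: adds a "type" factor to a data frame, given the
# vector y with the main parameter (mg) of the same samples
add_type <- function(df, y, low = 160, high = 221) {
  stopifnot(length(y) == nrow(df))
  df$type <- factor(
    ifelse(y <= low, "Low", ifelse(y < high, "Medium", "High")),
    levels = c("Low", "Medium", "High")
  )
  df
}

# Toy stand-ins for two of the other data frames (invented values)
toy.test <- data.frame(Y = c(155, 200, 230))
toy.valid <- data.frame(Y = c(221, 160))

toy.test <- add_type(toy.test, toy.test$Y)
toy.valid <- add_type(toy.valid, toy.valid$Y)

toy.test$type
toy.valid$type
```

Keeping the limits as arguments makes it easy to try other cut-off values without touching the function body.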
Another option is to create one big data frame with all the spectra from the different instruments and sets, with one categorical variable for the instrument (A and B) and another for the set (Training, Test and Validation).
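A minimal sketch of that idea, using toy data instead of the real sets. The function "make_toy" and all the object names ("toy.A.train", "nir.all", ...) are assumptions for the example, and only three of the six possible instrument/set combinations are built to keep it short.

```r
set.seed(1)

# Toy data frame with n samples: a Y matrix (1 column) and an X matrix (3 columns)
make_toy <- function(n) {
  data.frame(Y = I(matrix(runif(n, 150, 240), ncol = 1)),
             X = I(matrix(rnorm(n * 3), nrow = n)))
}
toy.A.train <- make_toy(3)
toy.A.test <- make_toy(2)
toy.B.train <- make_toy(3)

sets <- list(toy.A.train, toy.A.test, toy.B.train)
n <- sapply(sets, nrow)  # number of samples in each piece

# One label per sample for the instrument and for the set
nir.all <- data.frame(
  instrument = factor(rep(c("A", "A", "B"), times = n)),
  set = factor(rep(c("Training", "Test", "Training"), times = n))
)

# Stack the Y and X matrices in the same order as the labels
nir.all$Y <- I(do.call(rbind, lapply(sets, function(d) unclass(d$Y))))
nir.all$X <- I(do.call(rbind, lapply(sets, function(d) unclass(d$X))))

# Samples per instrument and set
table(nir.all$instrument, nir.all$set)
```

The matrices are stacked explicitly and then added to the data frame, so each of X and Y stays as a single matrix column, the same layout as dataframe$X and dataframe$Y above.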