7 oct 2014

Adding Category Variables to a Data Frame in R

I normally use data frames to manage NIR data. In my case they usually consist of an X or spectra matrix (dataframe$X) and a Y or constituent matrix (dataframe$Y). But when we want to read and understand plots, such as score plots, it is useful to classify the samples with some category variables.
These category variables can be: "location", "type", "customer", "product", and so on.
In the case of the shoot-out data, the samples can be classified by their content of the main parameter as:
"Low"             (if the sample has less than 160 mg)
"Medium"          (between 160 and 221 mg)
"High"            (more than 221 mg)

Let's create the variable in the data frame of the training set for instrument 1:
# Y is assumed to be a numeric vector with the main parameter (mg), one value per sample
nir.training1$type[Y <= 160] <- "Low"
nir.training1$type[Y > 160 & Y < 221] <- "Medium"
nir.training1$type[Y >= 221] <- "High"
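
The same classification can also be written in a single step. The following is only a sketch on made-up values (the data frame "toy" and its numbers are assumptions for illustration, not shoot-out data), using nested ifelse() calls with the same limits as the three assignments above and storing the result as a factor so the levels keep a sensible order in plots:

```r
# Made-up constituent values, chosen to include the two limits (160 and 221)
toy <- data.frame(Y = c(150, 160, 175, 220, 221, 240))

# Same limits as the three assignments above:
# Low: Y <= 160, Medium: 160 < Y < 221, High: Y >= 221
toy$type <- ifelse(toy$Y <= 160, "Low",
                   ifelse(toy$Y < 221, "Medium", "High"))

# Store as a factor with an explicit level order (Low, Medium, High)
toy$type <- factor(toy$type, levels = c("Low", "Medium", "High"))

# Count the samples in each category
table(toy$type)
```

A factor is convenient here because plotting functions can use it directly to colour or group the samples.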

Now we have a new variable in the data frame called "type".
Check it with:

names(nir.training1)

Apart from X and Y, we now have type.

We proceed in the same way for the other data frames.

We can also build one big data frame with all the spectra from the different instruments and sets, and create one category variable for the instrument (A and B) and another for the set (Training, Test and Validation).
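
One possible way to build such a combined data frame is sketched here. Everything in it is an assumption for illustration: make_toy() is a hypothetical helper that fills X and Y with random numbers, and the names nir.all, "instrument" and "set" are not from the shoot-out files. The idea is to stack the X and Y matrices with rbind() and repeat each label once per sample:

```r
set.seed(1)

# Hypothetical helper: a small stand-in for one NIR data frame,
# with a spectra matrix X and a constituent matrix Y (random values)
make_toy <- function(n, p = 5) {
  df <- data.frame(id = seq_len(n))
  df$X <- matrix(rnorm(n * p), nrow = n)
  df$Y <- matrix(runif(n, 140, 240), nrow = n)
  df
}

# Each entry keeps the data together with its instrument and set labels
sets <- list(
  list(instrument = "A", set = "Training",   data = make_toy(4)),
  list(instrument = "A", set = "Test",       data = make_toy(2)),
  list(instrument = "B", set = "Training",   data = make_toy(4)),
  list(instrument = "B", set = "Validation", data = make_toy(3))
)

# Number of samples in each data frame
n <- sapply(sets, function(s) nrow(s$data))

# Repeat each label once per sample, then stack the matrices by rows
nir.all <- data.frame(
  instrument = rep(sapply(sets, function(s) s$instrument), times = n),
  set        = rep(sapply(sets, function(s) s$set), times = n)
)
nir.all$X <- do.call(rbind, lapply(sets, function(s) s$data$X))
nir.all$Y <- do.call(rbind, lapply(sets, function(s) s$data$Y))

# Cross-tabulate the two category variables
table(nir.all$instrument, nir.all$set)
```

With the real data, the entries of the list would be the existing data frames (for example nir.training1) instead of the toy ones.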
