23 May 2022

Using Random Forest models in Soil Near Infrared Analysis (part 2)

In this post I pick up where I left off in "Using Random Forest models in Soil Near Infrared Analysis (part 1)", where we developed a Random Forest model for carbonate (CaCO3) in soil. Once we have the models and predictions for all the folds, we can plot the actual versus predicted values for each fold:



Looking at the cross-validation plots can help us see whether we have possible outliers.
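A plot like the one described above could be drawn along these lines. This is only a minimal sketch: the data frame cv_pred and its columns (fold, actual, predicted) are hypothetical names, and the values are simulated, not the soil data from part 1. Points that sit far from the dashed 1:1 line are the candidates for outliers.

```r
library(ggplot2)

set.seed(1)

# Simulated stand-in for the cross-validated predictions (NOT the soil data):
# 5 folds with 20 validation samples each
cv_pred <- data.frame(
  fold   = rep(paste("Fold", 1:5), each = 20),
  actual = runif(100, min = 0, max = 40)
)
cv_pred$predicted <- cv_pred$actual + rnorm(100, sd = 3)

# One panel per fold, with a dashed 1:1 reference line
p <- ggplot(cv_pred, aes(x = actual, y = predicted)) +
  geom_point(alpha = 0.7) +
  geom_abline(slope = 1, intercept = 0, linetype = "dashed") +
  facet_wrap(~ fold) +
  labs(x = "Actual CaCO3", y = "Predicted CaCO3")

print(p)
```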

Can we improve the model? Yes, why not! We can use a batch process to tune the hyper-parameters (in this case the "mtry" argument, the number of predictor variables randomly sampled as candidates at each split). Let's build a sequence of mtry values from 2 to 30:

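library(tidyverse)  # provides %>% and crossing()
# Pair every cross-validation fold in cv_data with each candidate mtry value (2 to 30)
cv_tune <- cv_data %>%
  crossing(mtry = 2:30)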
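To see what crossing() does here, a small toy example may help. The folds table below is a hypothetical stand-in for cv_data (which in practice also carries the train and validate sets for each fold); I assume 5 folds just for illustration.

```r
library(tidyr)
library(tibble)

# Hypothetical stand-in for cv_data: one row per cross-validation fold
folds <- tibble(fold = 1:5)

# Every fold is paired with every candidate mtry value
grid <- folds %>% crossing(mtry = 2:30)

# 5 folds x 29 mtry values = 145 rows, in 2 columns (fold, mtry)
dim(grid)
```

Each row of the resulting table is one model to fit, so the grid grows quickly with the number of folds and candidate values.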

Fitting the models, we get these mean RMSE values:
The smallest value is for mtry = 28; after that, the mean RMSE starts increasing.
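The tuning loop behind a table like this can be sketched as follows. Please take it as an illustration under assumptions, not as the exact code used for the post: I assume the ranger package for the Random Forest, simulated data with 10 predictors instead of the NIR spectra, a manual 5-fold split, and a shorter mtry range (2 to 6) so it runs quickly. The names sim, tune_grid, rmse and fit_and_score are hypothetical.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(ranger)

set.seed(42)

# Simulated stand-in for the spectra (NOT the soil data): 200 samples, 10 predictors
n <- 200
sim <- as.data.frame(matrix(rnorm(n * 10), ncol = 10))
names(sim) <- paste0("x", 1:10)
sim$caco3 <- 2 * sim$x1 - sim$x2 + rnorm(n, sd = 0.5)

# Assign every sample to one of 5 folds
sim$fold <- sample(rep(1:5, length.out = n))

# One row per (fold, mtry) combination
tune_grid <- tibble(fold = 1:5) %>% crossing(mtry = 2:6)

# Root mean squared error
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Fit on all folds except fold_id, then score on the held-out fold
fit_and_score <- function(fold_id, mtry) {
  train <- sim %>% filter(fold != fold_id) %>% select(-fold)
  valid <- sim %>% filter(fold == fold_id) %>% select(-fold)
  model <- ranger(caco3 ~ ., data = train, mtry = mtry,
                  num.trees = 100, seed = 42)
  rmse(valid$caco3, predict(model, data = valid)$predictions)
}

# Validation RMSE per fold and mtry, then averaged over the folds
tune_results <- tune_grid %>%
  mutate(rmse = map2_dbl(fold, mtry, fit_and_score)) %>%
  group_by(mtry) %>%
  summarise(mean_rmse = mean(rmse), .groups = "drop")

# The mtry value(s) with the smallest mean RMSE
best <- tune_results %>% slice_min(mean_rmse, n = 1)

print(tune_results)
print(best)
```

Averaging the RMSE over the folds before picking the minimum keeps the choice of mtry from depending on a single lucky or unlucky validation split.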
