In this post I continue from where I left off in "Using Random Forest models in Soil Near Infrared Analysis (part 1)", where we developed a Random Forest model for carbonate (CaCO3) in soil. Once we have the models and predictions for all the folds, we can plot the actual versus predicted values for each fold:
Looking at the plots, the cross-validation can help us see
whether we have possible outliers.
Can we improve the model? Yes, why not! We can use a batch process to tune the hyper-parameters (in this case the "mtry" argument, the number of predictors tried at each split). Let's build a batch sequence from 2 to 30:
cv_tune <- cv_data %>% crossing(mtry = 2:30)
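To see what crossing() does here, this is a minimal sketch with a made-up table of three folds (cv_data_demo is a hypothetical stand-in, not the cv_data from part 1). Every fold gets paired with every candidate mtry value, so each combination can later be fitted as its own model:

```r
library(tibble)
library(tidyr)

# Hypothetical stand-in for cv_data: one row per cross-validation fold.
cv_data_demo <- tibble(id = c("Fold1", "Fold2", "Fold3"))

# crossing() pairs every fold with every candidate mtry value.
cv_tune_demo <- crossing(cv_data_demo, mtry = 2:30)

# One row per fold/mtry combination: 3 folds x 29 mtry values.
nrow(cv_tune_demo)
```

With the real cv_data the idea is the same: the number of rows is the number of folds times 29.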
Fitting the models, we get these mean RMSE values:
The smallest value is for mtry = 28; beyond that, the RMSE starts increasing.
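As a rough sketch of the whole tuning loop, the code below fits one model per fold and mtry value, computes the RMSE on the held-out fold, and averages it by mtry. Everything in it is an assumption for illustration: the data are simulated (not the soil NIR spectra), the ranger package is my choice of Random Forest implementation, and names such as toy, fold_rmse and tune_demo are hypothetical. The grid is also kept small (mtry from 2 to 6) because the toy data have only six predictors.

```r
library(dplyr)
library(purrr)
library(tidyr)
library(ranger)

set.seed(42)

# Simulated data, for illustration only: 6 predictors (V1..V6) and a response y.
n <- 120
toy <- as.data.frame(matrix(rnorm(n * 6), ncol = 6))
toy$y <- toy$V1 + 0.5 * toy$V2 + rnorm(n, sd = 0.3)

# Assign each row to one of 5 folds at random.
toy_fold <- sample(rep(1:5, length.out = n))

# Fit on all folds but one, then return the RMSE on the held-out fold.
fold_rmse <- function(fold, mtry) {
  train <- toy[toy_fold != fold, ]
  validate <- toy[toy_fold == fold, ]
  model <- ranger(y ~ ., data = train, mtry = mtry, num.trees = 100, seed = 42)
  predicted <- predict(model, data = validate)$predictions
  sqrt(mean((validate$y - predicted)^2))
}

# One row per fold/mtry combination, each with its validation RMSE.
tune_demo <- crossing(fold = 1:5, mtry = 2:6) %>%
  mutate(rmse = map2_dbl(fold, mtry, fold_rmse))

# Mean RMSE per mtry value, smallest first.
mean_rmse <- tune_demo %>%
  group_by(mtry) %>%
  summarise(mean_rmse = mean(rmse)) %>%
  arrange(mean_rmse)

best_mtry <- mean_rmse$mtry[1]
best_mtry
```

The first row of mean_rmse plays the same role as mtry = 28 in the results above: it is the value where the averaged validation error is lowest.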