9 Jan. 2019

Correcting the skewness with logs

It is a good idea to look at histograms to check whether the distributions of the predictors, variables or constituents are skewed in some way. In this case I use a predictor from the segmentationOriginal data in the "AppliedPredictiveModeling" library, where we can find many predictors used to check whether the cells are well or poorly segmented.
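The post does not show how segData was created. A minimal sketch, assuming it is the training subset of segmentationOriginal with the identifier columns dropped (the preparation used in the package's companion book); the author's actual steps may differ.

```r
# Assumes the AppliedPredictiveModeling package is installed.
library(AppliedPredictiveModeling)

# Load the cell segmentation data shipped with the package
data(segmentationOriginal)

# Assumption: keep only the rows flagged as training cases
segData <- subset(segmentationOriginal, Case == "Train")

# Assumption: drop the identifier/label columns so only predictors remain
segData <- segData[, !(names(segData) %in% c("Cell", "Case", "Class"))]
```

With segData built this way, segData$VarIntenCh3 is a plain numeric vector that can be passed to hist() and skewness().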
If you want to check the paper for this work you can see this link:
 
One of the predictors in this data set is VarIntenCh3, and we can check its histogram and skewness:
# skewness() is assumed here to come from the e1071 package
library(e1071)
hist(segData$VarIntenCh3)
skewness(segData$VarIntenCh3)
[1] 2.391624
As we can see, the histogram is skewed to the right (the skewness is clearly positive), so we can apply a transformation to the data to reduce the skewness. There are several possible transformations, and this time we try the logarithm.
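For reference, a skewness statistic like the one reported above can be computed by hand. The sketch below uses the plain moment-based definition (third central moment divided by the second central moment to the power 3/2). The function name moment_skewness and the two toy vectors are made up for illustration; e1071's skewness() offers several variants through its type argument, and its default applies a slightly different small-sample scaling, so its values may differ marginally from this one.

```r
# Moment-based sample skewness: m3 / m2^(3/2)
moment_skewness <- function(x) {
  x  <- x[!is.na(x)]      # ignore missing values
  d  <- x - mean(x)       # deviations from the mean
  m2 <- mean(d^2)         # second central moment
  m3 <- mean(d^3)         # third central moment
  m3 / m2^1.5
}

# Toy data (hypothetical): one symmetric vector, one with a long right tail
symmetric  <- c(-2, -1, 0, 1, 2)
right_tail <- c(1, 1, 2, 2, 3, 20)

moment_skewness(symmetric)   # zero for symmetric data
moment_skewness(right_tail)  # positive for a right tail
```

A positive value means the long tail is on the right, a negative value means it is on the left, and values near zero indicate a roughly symmetric distribution.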
 
VarIntenCh3_log <- log(segData$VarIntenCh3)
hist(VarIntenCh3_log)
skewness(VarIntenCh3_log)
[1] -0.4037864
 
As we can see, the histogram now looks closer to a normal distribution, although it is slightly skewed to the left.
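Since the log is only one of several possible transformations, it can help to compare it with a milder one such as the square root. The sketch below does this on simulated right-skewed data, not on the segmentation data: the sample x, the seed, and the helper moment_skewness are all assumptions made for illustration.

```r
# Simulated right-skewed sample (hypothetical data, not VarIntenCh3)
set.seed(1)
x <- rlnorm(1000, meanlog = 0, sdlog = 1)

# Moment-based sample skewness: m3 / m2^(3/2)
moment_skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Skewness before and after each transformation
skew_raw  <- moment_skewness(x)
skew_sqrt <- moment_skewness(sqrt(x))
skew_log  <- moment_skewness(log(x))

round(c(raw = skew_raw, sqrt = skew_sqrt, log = skew_log), 2)
```

On this simulated sample the square root removes part of the right skew and the log removes most of it. On real data the log can overshoot, which is what the small negative skewness above suggests, so it is worth comparing the candidates on the actual predictor.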
 


