23 mar 2015

Mahalanobis in the PC space (removing redundant) - 2

We have seen this kind of plot in previous posts, but this is a good occasion to look at it again, especially if you have read the previous post. The function Moutlier from the package chemometrics can show the outliers in the PC space using both the classical covariance matrix and the robust covariance matrix. A line marks the cutoff value, which comes from a quantile of the chi-square distribution, and the samples above it are the outliers. In this example I am considering just two principal components. When more PCs are used, a single distance is still computed over all of them, so a sample may be extreme on one of the PCs and yet be fine in the overall computation.
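To make the cutoff idea concrete, here is a minimal sketch in plain Python. It is not the R code behind the plots and it does not reproduce Moutlier: it only computes the classical (non-robust) Mahalanobis distance for two score columns and compares it with the square root of the 0.975 chi-square quantile for 2 degrees of freedom. The scores are invented for illustration, and the function names `mahalanobis_2d` and `chi2_cutoff_2df` are hypothetical.

```python
import math

def mahalanobis_2d(scores):
    """Classical Mahalanobis distance of each (pc_a, pc_b) score pair."""
    n = len(scores)
    mean_a = sum(a for a, _ in scores) / n
    mean_b = sum(b for _, b in scores) / n
    # Entries of the sample covariance matrix (n - 1 denominator).
    s_aa = sum((a - mean_a) ** 2 for a, _ in scores) / (n - 1)
    s_bb = sum((b - mean_b) ** 2 for _, b in scores) / (n - 1)
    s_ab = sum((a - mean_a) * (b - mean_b) for a, b in scores) / (n - 1)
    det = s_aa * s_bb - s_ab ** 2
    distances = []
    for a, b in scores:
        da, db = a - mean_a, b - mean_b
        # d^2 = [da db] S^-1 [da db]^T, with the 2x2 inverse written out.
        d2 = (s_bb * da * da - 2 * s_ab * da * db + s_aa * db * db) / det
        distances.append(math.sqrt(d2))
    return distances

def chi2_cutoff_2df(quantile=0.975):
    """Distance cutoff for two PCs.

    With 2 degrees of freedom the chi-square CDF is 1 - exp(-x / 2),
    so the quantile has a closed form; the cutoff is its square root.
    """
    return math.sqrt(-2.0 * math.log(1.0 - quantile))

# Invented scores: 12 samples on a unit circle plus one sample far out on
# the first axis (assumption: these are NOT the data from the post).
scores = [(math.cos(2 * math.pi * k / 12), math.sin(2 * math.pi * k / 12))
          for k in range(12)]
scores.append((6.0, 0.0))

distances = mahalanobis_2d(scores)
cutoff = chi2_cutoff_2df(0.975)
flagged = [i for i, d in enumerate(distances) if d > cutoff]
print(cutoff, flagged)
```

The closed form only holds for two components; with more PCs the quantile has to come from a chi-square routine with as many degrees of freedom as PCs used.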
A more conservative approach is to use a larger distance for the cutoff, or to define both a warning cutoff and an action cutoff.
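A two-level cutoff could be sketched as follows, again in plain Python rather than R. The quantiles 0.975 (warning) and 0.999 (action) are assumptions chosen for illustration, not values taken from the post, and `cutoff_2df` and `classify` are hypothetical names. The closed-form quantile is valid only for two PCs.

```python
import math

def cutoff_2df(quantile):
    # Square root of the chi-square quantile for 2 degrees of freedom.
    return math.sqrt(-2.0 * math.log(1.0 - quantile))

# Assumed levels: flag at 97.5 %, act at 99.9 %.
WARNING_CUTOFF = cutoff_2df(0.975)
ACTION_CUTOFF = cutoff_2df(0.999)

def classify(distance):
    """Label a Mahalanobis distance as 'ok', 'warning' or 'action'."""
    if distance > ACTION_CUTOFF:
        return "action"
    if distance > WARNING_CUTOFF:
        return "warning"
    return "ok"

# Three made-up distances, one for each label.
labels = [classify(d) for d in (1.2, 3.0, 4.5)]
print(labels)
```

Samples between the two lines would be reviewed but kept, while only those beyond the action line would be removed.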
In this plot we see the distances with respect to the first two PCs, and we see the same two outliers as in the previous post (four in the case of the robust covariance matrix).
 
But if we consider PC2 and PC3, there are no outliers, and all the samples are below the cutoff:
These are samples with certain physical properties that make them prone to being outliers in the first PC, but not in the rest of the PC score maps.
If anybody wants the file to follow the tutorial, just let me know by mail and I will send it to you. The script is on my GitHub page.
 
The chemometrics package is the R companion to the book "Introduction to Multivariate Statistical Analysis in Chemometrics", written by K. Varmuza and P. Filzmoser (2009).
 
