1 ene. 2012

NIPALS: Principal Components Analysis with "R" (Part: 002)

We started some posts based on the tutorials of:
"Multivariate Statistical Analysis using the R package chemometrics"
The first post was:
Now we continue with a second part.
The graphics help us to decide the number of PCs, but for the tutorial we decided 5 PCs.
So, let´s calculate the PC space:
> X.pca<-princomp(X)
The  function pcaDiagplot gave to us an interesting plot which help to us to detect outliers, let´s apply this function in R:
> res<-pcaDiagplot(X,X.pca,a=5)
We get this plot:
In this  plots we see two distances: orthogonal and score distances.
Orthogonal distance: Distance between the object (in the original space) and its orthogonal projection on the PCA subspace.
Score distance: Distance of and object projected on the PCA space to the center.

Some chemometric software has the NIPALS Algorithm included  in order to reduce all our original X matrix to a few principal components.
NIPALS Algorithm (nonlinear iterative partial least square algorithm) was developed by H. Wold (1966).
The idea is to substract the reconstruction matrix by the first PC1 to the original matrix (X) getting a residual matrix (E1). From this "E" matrix, we calculate the second principal component PC2.
We have a new reconstruction matrix (the sum of PC1 and PC2) and we substract it again from X, getting a smaller residual matrix (E2), and we continue again with more PCs untill the desired number of PCs or untill "E" becomes very small.
A graphics plot of the variance explained versus number of PCs will help us to decide the cuttoff.
In Principal Components Analysis with "R" (Part: 001), we decided looking at this plot 5 PCs. So lets apply in R this number of components to the NIPALS Algorithm.

> X_nipals<-nipals(X,a=5)
We get some warnings like:
WARNING! Iteration stop in h= 2  without convergence!
In the NIPALS Algorithm, there is one more argument called iteractions (stepwise calculation), that in the case of  "nipals(X,a=5)" lives the default value (it=10), and the tolerance limit too (tol=0,0001).
Let´s try with:
> X_nipals<-nipals(X,a=5,it=160)
No warnings in this case.
> X_nipals<-list(scores=X_nipals$T,loadings=X_nipals$P,sdev=apply(X_nipals$T,2,sd))
> res<-pcaDiagplot(X,X.pca=X_nipals,a=5)
We get this plot:

Graphics, for:
> res<-pcaDiagplot(X,X.pca,a=5)
are almost the same and I not notice any diference.

We will continue soon.

No hay comentarios:

Publicar un comentario