## 29 mar. 2016

### Tutorials with Resemble (Part 4.a)

SVD (Singular Value Decomposition) is the algorithm used in Resemble to calculate the principal components when `method="pca"` is selected.
We have seen this method several times in this blog, but this tutorial is
an opportunity to repeat the steps for its calculation:

`pcProj<-orthoProjection(Xr=X_train,X2=NULL,Yr=Y_train,method="pca",`
`                        pcSelection=list("opc",40))`

`names(pcProj)`
`      "scores"       "X.loadings"   "variance" `
`      "sc.sdv"       "n.components" "pcSelection"  `
`      "center"       "scale"        "method"       `
`      "opcEval"   `
`> pcProj$variance   #(for the first 3 principal components)`
`                 pc1        pc2         pc3  `
`sdv        2.0206860 0.30988115 0.115319110  #Standard deviation of each component `
`cumExplVar 0.9712854 0.99412767 0.997291052  #Cumulative explained variance `
`explVar    0.9712854 0.02284228 0.003163383  #Explained variance`

We can check how resemble calculates these values using the following script:

`############### SINGULAR VALUE DECOMPOSITION ##################`
`Xt_mean<-colMeans(X_train)`
`#We subtract the mean spectrum from every spectrum of the training set.`
`#This is called centering.`
`#We can use this function for that:`
`X_train_c<-scale(X_train,center = TRUE,scale =FALSE)`
`#Let's calculate "d", "u" and "v" with "svd"`
`Xt_svd<-svd(X_train_c)`
`#Now we have the three elements: "d", "u" and "v"`
`#In order to save memory R uses a different convention`
`#for the matrix dimensions: q<-min(n,m)`
`Xt_svd_U<-Xt_svd$u  #Matrix U  (dim= n.q)`
`Xt_svd_d<-Xt_svd$d  #diagonal elements of D: square roots of the eigenvalues`
`Xt_svd_d[1:20]`
`Xt_svd_d<-Xt_svd_d[1:20]`
`36.87405917  5.65480039  2.10437628  1.62599314  0.74187387  0.54914598 `
`0.28188014   0.27391151  0.20568644  0.16270108  0.15283486  0.13692129 `
`0.08754608   0.07632065  0.07013343  0.05292563  0.05069955  0.04231237   `
`0.03851919   0.02561542`
`Xt_svd_d2<-Xt_svd_d^2  #d^2 (eigenvalues, proportional to the variance of each component)`
`1.359696e+03 3.197677e+01 4.428400e+00 2.643854e+00 5.503768e-01 `
`3.015613e-01 7.945642e-02 7.502752e-02 4.230691e-02 2.647164e-02 `
`2.335850e-02 1.874744e-02 7.664316e-03 5.824841e-03 4.918698e-03 `
`2.801123e-03 2.570444e-03 1.790337e-03 1.483728e-03 6.561497e-04`
`explVar<-round(Xt_svd_d2/sum(Xt_svd_d2),digits=7)`
`0.9712877 0.0228423 0.0031634 0.0018886 0.0003932 0.0002154 0.0000568 `
`0.0000536 0.0000302 0.0000189 0.0000167 0.0000134 0.0000055 0.0000042 `
`0.0000035 0.0000020 0.0000018 0.0000013 0.0000011 0.0000005`
`Xt_svd_V<-Xt_svd$v              #Matrix V  (dim= m.q)`
`Xt_svd_D<-diag(Xt_svd$d)        #Diagonal matrix D (dim= q.q)`
`Xt_svd_T<-Xt_svd_U %*%Xt_svd_D  #Score Matrix (T)`
`Xt_svd_T<-Xt_svd_T[,1:20]       #dim 334*20`
`sdev<-apply(Xt_svd_T,2,sd) #standard deviation of each component`
`2.020685995 0.309881152 0.115319110 0.089103875 0.040654438 0.030093014 `
`0.015446937 0.015010258 0.011271547 0.008915964 0.008375299 0.007503241 `
`0.004797496 0.004182346 0.003843288 0.002900307 0.002778318 0.002318704 `
`0.002110838 0.001403716`
`Xt_svd_P<-Xt_svd_V               #Loading Matrix (P)`
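
The steps above can be repeated on a small simulated matrix, which makes it easy to check the relations between "d", the scores and the standard deviations. This is only a sketch in base R: the matrix `X` is a made-up stand-in for `X_train`, and `prcomp()` is used as an independent cross-check, not as the code that resemble runs internally. Note that here the explained variance is divided by the sum over all the components; in the script above only the first 20 values of d^2 enter the sum, which probably explains the small difference between 0.9712877 and the 0.9712854 reported by `orthoProjection`.

```r
# Simulated stand-in for X_train (assumption: 50 samples x 10 variables)
set.seed(1)
n <- 50
m <- 10
X <- matrix(rnorm(n * m), nrow = n, ncol = m) %*% diag(seq(5, 0.5, length.out = m))

# Center each column (subtract the mean "spectrum")
Xc <- scale(X, center = TRUE, scale = FALSE)

# Singular value decomposition: Xc = U D V'
s <- svd(Xc)

# Scores T = U D and loadings P = V
T_scores <- s$u %*% diag(s$d)
P <- s$v

# Standard deviation of each component, computed in two ways
sdv_from_scores <- apply(T_scores, 2, sd)
sdv_from_d <- s$d / sqrt(n - 1)

# Explained variance relative to ALL the components, and its cumulative sum
explVar <- s$d^2 / sum(s$d^2)
cumExplVar <- cumsum(explVar)

# Independent cross-check with prcomp()
pc <- prcomp(X, center = TRUE, scale. = FALSE)

print(all.equal(sdv_from_scores, sdv_from_d))  # sd of scores vs d/sqrt(n-1)
print(all.equal(sdv_from_d, pc$sdev))          # SVD vs prcomp
print(all.equal(Xc %*% P, T_scores))           # projecting on P gives the scores
```

The same check applies to the values printed above: 36.87405917 divided by the square root of 333 gives 2.0206860, the "sdv" of the first component for the 334 training samples.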

## 21 mar. 2016

You can also get the Reference Manual.
Once you have the ZIP file on your PC, install it from RStudio with Install Packages.
It requires that R be updated to 3.2.2 or higher.
I have just updated, so I will continue the tutorials with this version, which fixes some bugs from the previous one.

## 19 mar. 2016

### Importing NIRsoil spectra from Resemble into Win ISI

This is the second post where I import data from R packages (in this case the NIRsoil spectra from the Resemble package) into a project in Win ISI, which is the software I usually use in my job.

The first thing to do is to export the training and validation sets as ".txt" tables:
write.table(X_train,"c:/x_train.txt",sep="\t")
write.table(X_val,"c:/x_val.txt",sep="\t")
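
Before converting the file it can be useful to read it back into R and confirm that nothing was lost in the export. The following is a minimal sketch with a tiny made-up matrix (`X_demo` is an assumption, standing in for `X_train`) and a temporary file instead of `c:/`, so it runs on any system:

```r
# Small stand-in for X_train (assumption): 3 spectra x 4 wavelengths
X_demo <- matrix(c(0.31, 0.32, 0.35, 0.40,
                   0.28, 0.30, 0.33, 0.37,
                   0.45, 0.47, 0.50, 0.55),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("s1", "s2", "s3"),
                                 c("1100", "1102", "1104", "1106")))

# Write to a temporary file so the sketch does not depend on a c:/ drive
out_file <- tempfile(fileext = ".txt")
write.table(X_demo, out_file, sep = "\t")

# Read it back. The header line has one field fewer than the data rows,
# so read.table() uses the first column as row names.
X_back <- read.table(out_file, sep = "\t", header = TRUE, check.names = FALSE)

print(dim(X_back))
print(all.equal(as.matrix(X_back), X_demo))
```

`check.names = FALSE` keeps numeric column names such as wavelengths unchanged instead of prefixing them with "X".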

Now as explained in the post:
How to import a TXT spectra file into Win ISI

We import the table into WinISI with the option CONVERT, and from now on we can work with these data sets in both Win ISI and Resemble.

Training Set:

Validation Set:

## 6 mar. 2016

### Tutorials with Resemble (Part 3)

“ex1” is a list, and one of its elements is “pcAnalysis”, another list containing the scores of the training spectra matrix and the scores of the validation spectra matrix. The scores are standardized, so the variance of each principal component is one.

pcAnalysis<-ex1\$pcAnalysis
scores_Xt<-pcAnalysis\$scores_Xr
scores_Xv<-pcAnalysis\$scores_Xu
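
To see what "standardized" means here, we can divide each column of a score matrix by its standard deviation and check that the variance becomes one. This is only a sketch on simulated data with `prcomp()`; the matrix `X` is made up and this is not necessarily how resemble computes `scores_Xr` internally:

```r
# Simulated data (assumption): 40 samples x 6 variables
set.seed(2)
n <- 40
X <- matrix(rnorm(n * 6), nrow = n) %*% diag(c(4, 2, 1, 0.5, 0.2, 0.1))

# Ordinary PCA scores: each column has a different variance
pc <- prcomp(X, center = TRUE, scale. = FALSE)
raw_scores <- pc$x

# Standardize: divide each score column by its standard deviation
std_scores <- sweep(raw_scores, 2, pc$sdev, "/")

print(round(apply(raw_scores, 2, var), 4))  # decreasing variances
print(round(apply(std_scores, 2, var), 4))  # variance of every component after standardizing
```

With standardized scores all the components have the same spread, so distances to the centroid weigh every component equally.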

As we have done in other plots, once we have the score matrix we can plot the different planes to check the samples in the PC space, selecting the combinations we prefer. In this plot we represent PC1 vs. PC2:

plot(scores_Xt[,1],scores_Xt[,2],col="blue",
panel.first=grid(),xlim=c(-7,7),
ylim=c(-7,7))
points(scores_Xv[,1],scores_Xv[,2],col="red")

We can also draw ellipses of radius 1 to see the distance to the centroid more clearly.
Training samples are in blue and validation samples are in red.

This is an example of how to build an ellipse of radius 4:

library(plotrix)
draw.ellipse(0,0,4,4,border=1,
angle=0, lty=3)
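
As an alternative sketch that needs only base R (swapping plotrix for a small helper), we can build the circles from sines and cosines and also count how many samples fall outside a given radius. The scores are simulated and `circle_xy` is a hypothetical helper name, not a function from resemble or plotrix:

```r
# Hypothetical standardized scores (assumption): 100 samples, 2 components
set.seed(3)
scores <- matrix(rnorm(200), ncol = 2)

# Hypothetical helper: coordinates of a circle of radius r centred at the origin
circle_xy <- function(r, n_points = 200) {
  theta <- seq(0, 2 * pi, length.out = n_points)
  cbind(x = r * cos(theta), y = r * sin(theta))
}

# Null device so the sketch also runs without a display;
# comment out pdf(NULL) and dev.off() to see the plot on screen
pdf(NULL)
plot(scores[, 1], scores[, 2], col = "blue", asp = 1,
     xlim = c(-7, 7), ylim = c(-7, 7), xlab = "PC1", ylab = "PC2",
     panel.first = grid())
for (r in 1:4) lines(circle_xy(r), lty = 3)  # circles of radius 1 to 4
invisible(dev.off())

# Euclidean distance of each sample to the centroid in this plane
dist_centroid <- sqrt(rowSums(scores^2))
n_outside <- sum(dist_centroid > 3)
print(n_outside)
```

Setting `asp = 1` keeps both axes on the same scale, so the circles are not distorted into ellipses by the plot window.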