## 13 feb. 2015

### Selecting samples for lab analysis (Part 1)

Reference laboratory analysis is expensive, so it helps to have a function that selects spectra evenly spread across the spectral space, giving a roughly flat distribution.
The "prospectr" package has several functions that select spectra based on their distribution in the PC space. One of them, kenStone, implements the Kennard-Stone algorithm.
Distances can be measured as Mahalanobis or Euclidean distance.
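
To make the idea concrete, here is a minimal base R sketch of a Kennard-Stone style selection on PC scores. It is only an illustration and not the code used inside "prospectr": the function name ks_sketch and the simulated matrix X_sim are made up for this example. It starts with the two samples farthest apart and then keeps adding the sample that is farthest from its nearest already-selected neighbour.

```r
# Hypothetical helper for illustration only, not part of prospectr
ks_sketch <- function(scores, k) {
  d <- as.matrix(dist(scores))  # pairwise Euclidean distances between samples
  # start with the two samples that are farthest apart
  selected <- as.vector(which(d == max(d), arr.ind = TRUE)[1, ])
  while (length(selected) < k) {
    remaining <- setdiff(seq_len(nrow(scores)), selected)
    # distance from each remaining sample to its nearest selected sample
    min_d <- apply(d[remaining, selected, drop = FALSE], 1, min)
    # add the sample that is farthest from everything selected so far
    selected <- c(selected, remaining[which.max(min_d)])
  }
  selected
}

set.seed(1)
X_sim <- matrix(rnorm(200 * 10), nrow = 200)      # simulated data standing in for spectra
scores <- prcomp(X_sim, center = TRUE)$x[, 1:3]   # scores on the first 3 PCs
# Scaling the scores to unit variance makes the Euclidean distance
# equivalent to a Mahalanobis distance in the PC space
sel <- ks_sketch(scale(scores), k = 20)
```

Here sel holds the row numbers of the 20 selected samples. Dropping the scale() call gives a selection based on plain Euclidean distance instead.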

For example, we may have spectra for 1000 samples but can only afford to send 50 or 100 to the laboratory. We pass that number to the function (the k argument), and the output contains a "model" element with the selected samples.

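library(prospectr)
# X is assumed to be the matrix of spectra (one sample per row)
# Select 20 samples by Mahalanobis distance on the first 3 PCs
ken_mahal <- kenStone(X = X, k = 20, metric = "mahal", pc = 3)
# PC1 vs PC2: all samples, with the selected ones as filled red points
plot(ken_mahal$pc[, 1], ken_mahal$pc[, 2],
     xlab = "PC1", ylab = "PC2")
points(ken_mahal$pc[ken_mahal$model, 1],
       ken_mahal$pc[ken_mahal$model, 2], pch = 19, col = 2)
# PC1 vs PC3
plot(ken_mahal$pc[, 1], ken_mahal$pc[, 3],
     xlab = "PC1", ylab = "PC3")
points(ken_mahal$pc[ken_mahal$model, 1],
       ken_mahal$pc[ken_mahal$model, 3], pch = 19, col = 2)
# PC2 vs PC3
plot(ken_mahal$pc[, 2], ken_mahal$pc[, 3],
     xlab = "PC2", ylab = "PC3")
points(ken_mahal$pc[ken_mahal$model, 2],
       ken_mahal$pc[ken_mahal$model, 3], pch = 19, col = 2)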

These plots show the 20 selected samples (filled red points) and how well they are distributed in the PC space.