R & Chemometrics: Hierarchical Cluster Analysis (ChemoSpec)

19 Jul 2012

Hierarchical Cluster Analysis (ChemoSpec) - 02

These are the second derivative spectra of the raw spectra we saw in the post "Hierarchical Cluster Analysis (ChemoSpec) - 01". In that post we saw some clusters, but the distance between them was small, so it was clear that a math treatment should be applied to remove baseline shifts and to increase the differences between the clusters as much as possible.

Let's now look at the HCA in this case:

Now it looks much better: the olive oil samples fall in one cluster and the sunflower oil samples in another. We can also see two sub-clusters within the sunflower samples. Looking at the spectra, we can now see some of the reasons for that more clearly. That will be covered in the next post.
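For readers who want to try the clustering step themselves, here is a minimal sketch using only base R (`dist()` and `hclust()`) rather than ChemoSpec. The data are synthetic stand-ins for derivative spectra, not the olive and sunflower spectra from this post, and the distance and linkage choices (Euclidean, complete) are assumptions rather than the settings used for the dendrogram above.

```r
# Synthetic stand-in for derivative spectra: 2 groups of 5 samples,
# 50 data points each. These are NOT the oil spectra from the post.
set.seed(1)
x <- seq(0, 2 * pi, length.out = 50)
group_a <- t(replicate(5, sin(x) + rnorm(50, sd = 0.02)))
group_b <- t(replicate(5, sin(x) + 0.5 * cos(3 * x) + rnorm(50, sd = 0.02)))

# One sample per row, as in a typical spectra matrix
spectra <- rbind(group_a, group_b)
rownames(spectra) <- c(paste0("A", 1:5), paste0("B", 1:5))

# Pairwise distances between samples, then agglomerative clustering
d  <- dist(spectra, method = "euclidean")
hc <- hclust(d, method = "complete")

# Dendrogram; the height of each join is the distance between clusters
plot(hc, main = "HCA of synthetic spectra")

# Cut the tree into two groups and inspect the membership
groups <- cutree(hc, k = 2)
print(groups)
```

The larger the height at which two branches join, the better separated the clusters are, which is the improvement the derivative treatment is meant to bring.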

9 comments:

Anonymous, 21 July 2012, 3:16
Could you please explain the math treatment that was done, since the results are much, much better?

Excellent work.
José Ramón Cuesta, 23 July 2012, 18:03
I converted the raw spectra to second derivative using a chemometric software package called WinISI, but you can use others such as Unscrambler. These packages can then export the spectra to a TXT file so I can import them into R.
These packages calculate the derivative based on the segment-gap concept. The gap is always zero and I used a segment of 10. You can go to the label "derivadas", where I explain this concept in Spanish (I'll translate these posts in the near future); there are some drawings there which can help. Of course, it would be possible to develop a function in R to do the same.
You will probably get the same results using the SG (Savitzky-Golay) filters in R, which can be configured for the first, second, third derivative and so on.
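As a rough illustration of the segment-gap idea mentioned above, here is a minimal base R sketch. It assumes one common reading of the concept: average the signal over three segments of `segment` points, separated by `gap` points, and combine them as left - 2 * centre + right. The function name `gap_segment_d2` is hypothetical, and this is not claimed to reproduce the exact WinISI or Unscrambler calculation.

```r
# Hypothetical helper: second derivative by the segment-gap approach.
# Assumption: three segment averages whose centres are (segment + gap)
# points apart, combined as left - 2 * centre + right.
gap_segment_d2 <- function(x, segment = 10, gap = 0) {
  n <- length(x)
  # Moving average over `segment` points (NA where the window runs off the ends)
  sm <- as.numeric(stats::filter(x, rep(1 / segment, segment), sides = 2))
  shift <- segment + gap
  out <- rep(NA_real_, n)
  if (n < 2 * shift + 1) return(out)  # spectrum too short for three segments
  idx <- (shift + 1):(n - shift)
  out[idx] <- sm[idx - shift] - 2 * sm[idx] + sm[idx + shift]
  out
}

# Check on a parabola, whose second difference is constant
d2 <- gap_segment_d2((1:100)^2, segment = 10, gap = 0)

# Applied to a matrix with one spectrum per row (synthetic placeholder data);
# apply() returns results column-wise, so transpose back
spectra <- rbind((1:100)^2, 3 * (1:100))
d2_matrix <- t(apply(spectra, 1, gap_segment_d2, segment = 10, gap = 0))
```

Note that the result is not scaled to derivative units and the ends of each spectrum come back as `NA`, so those columns would need trimming before clustering.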
Bryan Hanson, 26 July 2012, 16:54
In R, you can use the function sgolayfilt() in package signal to get the derivatives. If you are using ChemoSpec, the spectral data is in SpectraObject$data, so you can make a copy of SpectraObject, replace its $data with the derivatives, and work from there. For the first derivative you would use something like SpectraObject$data <- sgolayfilt(SpectraObject$data, m = 1), but as $data is a matrix, you would have to use a plyr or apply method to "loop" over each row of the matrix.

Thanks JR!
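The row-wise approach described in this comment could look roughly like the sketch below. It assumes the signal package is installed and uses a placeholder matrix with one spectrum per row; the polynomial order `p = 3` and window length `n = 11` are illustrative choices, not values taken from the post.

```r
library(signal)  # provides sgolayfilt()

# Placeholder matrix standing in for SpectraObject$data:
# 4 synthetic "spectra" (rows) of 60 points each
set.seed(1)
spectra <- matrix(cumsum(rnorm(4 * 60)), nrow = 4, byrow = TRUE)

# Savitzky-Golay second derivative of each row.
# apply() over rows returns one column per row, so transpose back.
d2 <- t(apply(spectra, 1, sgolayfilt, p = 3, n = 11, m = 2))

# With ChemoSpec, following the comment above, one would copy the object
# and swap in the derivatives (shown as comments, not run here):
#   spec_d2 <- SpectraObject
#   spec_d2$data <- t(apply(SpectraObject$data, 1, sgolayfilt,
#                           p = 3, n = 11, m = 2))
```

Setting `m = 1` instead gives the first derivative. The window length `n` must be odd and larger than `p`.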
Anonymous, 9 August 2012, 10:14
Thank you very much for these answers. It is really interesting.
lobogris, 12 February 2014, 16:31
José Ramón,
I'm interested in this approach, as it looks like some work done on classifying time series of NDVI. The
improvement from using the derivative (BTW, why the 2nd and not the 1st? Was there an improvement when
using the 1st?) is very interesting. Are you aware of articles using this approach in your field?
Thanks
Agus