These are samples of a type of liquid fat, measured in transflectance, used to develop a calibration for moisture. In the first figure we can see 3 samples that clearly have a high moisture content and clearly separate from the rest. However, the lab values for these samples do not make sense, because they are lower than those of many of the other samples in the set.
If we take out these 3 samples, the correlation at 1450 nm in the correlation plot improves from 0.50 to 0.70.
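As a rough sketch of this check, the correlation between the absorbance at one wavelength and the lab value can be computed with and without the suspect samples. The numbers below are made up for illustration (they are not the data from this post), and the names `absorbance_1450`, `lab_moisture` and `suspect` are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up values for illustration only (assumption, not the real data set).
# The last three samples absorb strongly at 1450 nm but have low lab values.
absorbance_1450 = [0.10, 0.12, 0.14, 0.16, 0.18, 0.20, 0.40, 0.42, 0.44]
lab_moisture = [0.20, 0.24, 0.29, 0.31, 0.36, 0.40, 0.15, 0.18, 0.16]
suspect = {6, 7, 8}  # indices of the samples whose lab values look wrong

keep = [i for i in range(len(lab_moisture)) if i not in suspect]
r_all = pearson_r(absorbance_1450, lab_moisture)
r_clean = pearson_r([absorbance_1450[i] for i in keep],
                    [lab_moisture[i] for i in keep])

print(f"r with all samples:        {r_all:.2f}")
print(f"r without suspect samples: {r_clean:.2f}")
```

Comparing the two values shows how much a handful of inconsistent reference values can pull the correlation down.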
We found similar problems, on a smaller scale, for some of the other samples. Even so, I developed the equation with a simple math treatment. More complex treatments may or may not give better results; validation will decide.
Some outliers (5) have been removed for high residuals, probably for the reason described above.
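One possible way to flag such samples is to compare each residual with the overall root mean square residual, as in the sketch below. This is an assumption about how the screening could be done, not necessarily what the calibration software does; the function name `flag_high_residuals`, the cutoff of 2.5 and the values are all hypothetical.

```python
import math

def flag_high_residuals(reference, predicted, cutoff=2.5):
    """Return indices whose |residual| exceeds cutoff times the RMS residual.

    The cutoff is an arbitrary illustrative choice (assumption).
    """
    residuals = [p - r for r, p in zip(reference, predicted)]
    rms = math.sqrt(sum(e * e for e in residuals) / len(residuals))
    if rms == 0:
        return []
    return [i for i, e in enumerate(residuals) if abs(e) / rms > cutoff]

# Made-up lab values and predictions (assumption, not the real data set).
reference = [0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 0.60, 0.30]
predicted = [0.21, 0.24, 0.31, 0.34, 0.41, 0.44, 0.51, 0.54, 0.61, 0.60]

outliers = flag_high_residuals(reference, predicted)
print("Samples flagged for high residual:", outliers)
```

A flagged sample is only a candidate: the lab value should be checked before the sample is dropped.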
We split the data randomly into a calibration set and a validation set.
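A random split like this can be sketched in a few lines. The 25 % validation fraction, the fixed seed and the sample names are assumptions for illustration; the post does not say which proportion was used.

```python
import random

def split_calibration_validation(sample_ids, validation_fraction=0.25, seed=0):
    """Randomly split sample ids into (calibration, validation) lists."""
    ids = list(sample_ids)
    rng = random.Random(seed)  # fixed seed so the split can be reproduced
    rng.shuffle(ids)
    n_val = max(1, round(len(ids) * validation_fraction))
    return ids[n_val:], ids[:n_val]

# Hypothetical sample names (assumption).
samples = [f"fat_{i:02d}" for i in range(20)]
calibration, validation = split_calibration_validation(samples)

print(len(calibration), "calibration samples")
print(len(validation), "validation samples")
```

Fixing the seed keeps the same samples in each set when different math treatments are compared.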
Calibration Plot and statistics:
I used a simple Multiplicative Scatter Correction (MSC) math treatment without derivatives, so more combinations can still be tried.
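For readers who have not seen it, MSC can be sketched as follows: each spectrum is regressed against a reference spectrum (here the mean of the set, which is the usual choice), and the fitted offset and slope are then removed. This is a minimal plain-Python illustration with made-up spectra, not the implementation in the calibration software; the function name `msc` is hypothetical.

```python
def msc(spectra):
    """Multiplicative Scatter Correction against the mean spectrum.

    Each spectrum x is fitted as x = intercept + slope * mean_spectrum
    and corrected as (x - intercept) / slope.
    Returns (corrected_spectra, mean_spectrum).
    """
    n_wl = len(spectra[0])
    mean_spec = [sum(s[j] for s in spectra) / len(spectra) for j in range(n_wl)]
    m_bar = sum(mean_spec) / n_wl
    s_mm = sum((m - m_bar) ** 2 for m in mean_spec)
    corrected = []
    for s in spectra:
        s_bar = sum(s) / n_wl
        slope = sum((m - m_bar) * (x - s_bar)
                    for m, x in zip(mean_spec, s)) / s_mm
        intercept = s_bar - slope * m_bar
        corrected.append([(x - intercept) / slope for x in s])
    return corrected, mean_spec

# Made-up spectra: one base shape with different offsets and slopes,
# which mimics pure scatter effects (assumption, for illustration only).
base = [0.1, 0.3, 0.6, 0.4, 0.2]
spectra = [[a + b * v for v in base]
           for a, b in [(0.0, 1.0), (0.05, 1.2), (-0.02, 0.8)]]

corrected, mean_spec = msc(spectra)
for row in corrected:
    print([round(v, 3) for v in row])
```

Because the made-up spectra differ only by offset and slope, they all collapse onto the mean spectrum after correction; with real spectra the chemical differences remain.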
When lab values disagree with the spectra like this, ask the lab for the values, review the paperwork, and so on, to try to get the right values.
The time elapsed between the acquisition of the spectra and the lab reference analysis is important.
Do we keep the sample so that the reference analysis and the spectrum acquisition can be repeated?
Think about other possible causes...