Dear All,

I have some doubts about the logical consistency of the procedure I'm
following to analyse my data.

The dataset I'm dealing with comprises 360 sampling points; for each
of these points I have a measurement of nitrate concentration in
groundwater.

What I'm trying to do is to assess the groundwater vulnerability to
nitrate contamination.

The first step has been to interpolate the contamination data from the
point observations.

My idea was to approach the problem using regression kriging (given the
availability of numerous candidate covariates, such as land use,
rainfall, clay content in soils, etc.).

However, after a preliminary look at the data I found that it is
bimodal. This was not a big concern, as the residuals of the linear
regression are still normal, although strongly heteroskedastic (but
this won't affect the estimates of the coefficients, if I recall my
statistics courses correctly).
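
As a quick check of that recollection, here is a minimal simulation
sketch on synthetic data (not the nitrate dataset). The covariate x,
the true coefficients and the noise model are all assumptions made up
for illustration. It only looks at whether the ordinary least squares
slope is centred on the true value when the error variance grows with
x; it says nothing about the standard errors, which heteroskedasticity
does distort.

```python
import random

def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

rng = random.Random(42)
true_intercept, true_slope = 2.0, 3.0  # assumed values, illustration only

slopes = []
for _ in range(500):  # 500 synthetic replicate datasets
    x = [rng.random() for _ in range(200)]
    # Normal errors whose standard deviation grows with x
    # (strongly heteroskedastic, as described in the text).
    y = [true_intercept + true_slope * xi + rng.gauss(0.0, 0.5 + 2.0 * xi)
         for xi in x]
    slopes.append(ols_line(x, y)[1])

mean_slope = sum(slopes) / len(slopes)
print(f"true slope = {true_slope}, mean estimated slope = {mean_slope:.3f}")
```

Averaged over the replicates, the estimated slope should sit close to
the true one even though individual estimates scatter widely.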

Anyway, after a second look at the data I tried to separate the two
"populations" by treating them as a mixture of two normal
distributions. The outcome of this clustering shows that each of
these "populations" has a different spatial distribution: one tends
to occupy the northern part of my study area and the other the
southern part. This difference is not due to any one of the available
covariates, or to any factor I can think of...
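
For reference, the kind of separation described above can be sketched
as a two-component normal mixture fitted by expectation-maximisation.
This is only an illustration under stated assumptions: the
concentrations are synthetic, the function name fit_two_normals is
hypothetical, and the quartile-based initialisation is a convenience
choice rather than the procedure actually used on the data.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def fit_two_normals(values, n_iter=200):
    """EM for a 1-D mixture of two normals.

    Returns (weights, means, sds, resp), where resp[i] is the
    probability that values[i] belongs to the second component.
    """
    xs = sorted(values)
    n = len(xs)
    # Crude initialisation: lower and upper quartiles as starting means.
    mu = [xs[n // 4], xs[(3 * n) // 4]]
    mean_all = sum(xs) / n
    sd_all = math.sqrt(sum((x - mean_all) ** 2 for x in xs) / n)
    sd = [sd_all, sd_all]
    w = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of the second component.
        resp = []
        for x in values:
            p0 = w[0] * normal_pdf(x, mu[0], sd[0])
            p1 = w[1] * normal_pdf(x, mu[1], sd[1])
            total = p0 + p1
            resp.append(p1 / total if total > 0 else 0.5)
        # M-step: update weight, mean and sd of each component.
        for k in (0, 1):
            r = [ri if k == 1 else 1.0 - ri for ri in resp]
            nk = sum(r)
            mu[k] = sum(ri * x for ri, x in zip(r, values)) / nk
            var = sum(ri * (x - mu[k]) ** 2 for ri, x in zip(r, values)) / nk
            sd[k] = max(math.sqrt(var), 1e-6)
            w[k] = nk / n
    return w, mu, sd, resp

# Synthetic bimodal "concentrations" (made-up values, 360 points).
rng = random.Random(0)
values = ([rng.gauss(10.0, 2.0) for _ in range(180)]
          + [rng.gauss(40.0, 5.0) for _ in range(180)])

weights, means, sds, resp = fit_two_normals(values)
labels = [1 if r > 0.5 else 0 for r in resp]  # hard assignment per point
print("weights:", weights, "means:", means, "sds:", sds)
```

The hard labels are what would then be mapped to look for a
north/south pattern.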

So the next step I tried was to treat these areas as two different
strata, fitting a mixed model instead of the simple linear one. This
has indeed increased the accuracy of the prediction, because the two
groups seem to have different coefficients as well as different
intercepts.
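
To make the comparison concrete, here is a rough sketch on synthetic
data. Note that it is not a mixed model: it simply fits a separate
ordinary least squares line in each stratum and compares the residual
sum of squares with that of a single pooled line. The stratum names,
the single covariate x and the coefficient values are hypothetical.

```python
import random

def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def rss(x, y, intercept, slope):
    """Residual sum of squares of the line on (x, y)."""
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))

rng = random.Random(1)
# Hypothetical strata with different intercepts AND slopes.
truth = {"north": (5.0, 1.0), "south": (20.0, 4.0)}
data = {}
for name, (a, b) in truth.items():
    x = [rng.uniform(0.0, 10.0) for _ in range(180)]
    y = [a + b * xi + rng.gauss(0.0, 2.0) for xi in x]
    data[name] = (x, y)

# One line for all 360 points.
all_x = data["north"][0] + data["south"][0]
all_y = data["north"][1] + data["south"][1]
pooled = ols_line(all_x, all_y)
rss_pooled = rss(all_x, all_y, *pooled)

# One line per stratum.
fits = {name: ols_line(x, y) for name, (x, y) in data.items()}
rss_stratified = sum(rss(x, y, *fits[name]) for name, (x, y) in data.items())

print("pooled fit:", pooled, "RSS:", round(rss_pooled, 1))
print("per-stratum fits:", fits, "RSS:", round(rss_stratified, 1))
```

The stratified fit can never have a larger in-sample residual sum of
squares than the pooled one, so a drop in error alone does not show
that the strata are real; a fair comparison needs held-out points.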

However, what puzzles me is the logic of the procedure. Using the
outcome of a process as an input in the estimation of the same outcome
sounds like circular logic to me... is this procedure correct?

I know that similar problems are addressed by latent class analysis,
but I couldn't find anything about this kind of problem in spatial
statistics.

If anybody has dealt with similar problems (or knows of some
literature about them), I'd gladly hear any suggestions.

Thanks in advance

Cristiano Ballabio

