The statistics of cross-validation residuals
First we show that least-squares refinement using a weight matrix which is the inverse of the VCM of the observations minimises the VCMs of both the refined parameters and the free residuals.
Consider a function of the refined least-squares parameters which is linear to within a first-order Taylor approximation.
Now consider two column matrices, defined below, which are both unbiased estimates of this function.
where
is any weight matrix and
. From
equations (27) and (28) and from
the definitions of
and
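We wish to show that the VCM of the estimate calculated with the inverse of the VCM of the observations as weight matrix is smaller than that of the estimate calculated with an arbitrary weight matrix. Because both are unbiased estimators, their expected values are equal, and thus the VCM of the latter can be expressed as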
The last two terms of the above equation are the transpose of each other and are each zero matrices as shown by the following analysis which uses equations (25), (26), (27) and (29).
Hence from equation (30)
Since the VCMs are positive definite
Thus the VCM of the estimate calculated with the inverse of the VCM of the observations as weight matrix is less than the VCM of the estimate calculated with any other weight matrix. Taking the function to be the refined parameters themselves, this analysis shows that using this weight matrix minimises the variance of the refined parameters. Applied to the values calculated for the excluded observations, the same analysis shows that this weight matrix also minimises their VCM. From equation (18) the VCM of the residuals associated with the excluded observations is the sum of a constant matrix and the VCM of these calculated values. Hence the VCM of the free residuals and its trace are also minimised by choosing the inverse of the VCM of the observations as the weight matrix.
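The argument above can be sketched in generic notation. The symbols here are assumptions chosen for illustration, not necessarily those of equations (25) to (30): y is the column matrix of observations with VCM V, A is the design matrix of the linearised problem, C is the matrix relating the linear function to the parameters, W is the inverse of V, and U is an arbitrary weight matrix.

```latex
\begin{align*}
\hat{f}_1 &= B_1 y, \qquad B_1 = C\,(A^{T} W A)^{-1} A^{T} W, \qquad W = V^{-1},\\
\hat{f}_2 &= B_2 y, \qquad B_2 = C\,(A^{T} U A)^{-1} A^{T} U,\\
B_1 A &= B_2 A = C \qquad \text{(both estimates are unbiased)},\\
\operatorname{VCM}(\hat{f}_2) &= B_2 V B_2^{T}\\
  &= B_1 V B_1^{T} + (B_2 - B_1) V (B_2 - B_1)^{T}
     + B_1 V (B_2 - B_1)^{T} + (B_2 - B_1) V B_1^{T},\\
B_1 V (B_2 - B_1)^{T} &= C\,(A^{T} W A)^{-1} A^{T} W V (B_2 - B_1)^{T}
   = C\,(A^{T} W A)^{-1} \bigl[(B_2 - B_1) A\bigr]^{T} = 0,\\
\operatorname{VCM}(\hat{f}_2) &= \operatorname{VCM}(\hat{f}_1) + (B_2 - B_1) V (B_2 - B_1)^{T}.
\end{align*}
```

In this sketch the two cross terms are transposes of each other and vanish because W V is the unit matrix and (B_2 - B_1) A = C - C = 0. The remaining extra term is positive semidefinite, and it is zero only when B_2 = B_1, so no choice of U gives a smaller VCM than W.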
The trace of the VCM of the free residuals is the expected value of the unweighted sum of squared residuals, where the summation is taken over the p reflections in the test set. By using the normal approximation in equation (22) we can say that the sum of absolute differences, and hence the cross-validation residual based on it, is approximately minimised by choosing the inverse of the VCM of the observations as the least-squares weight matrix.
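The result can be checked numerically on a small linear model. The following is a minimal sketch under stated assumptions, not the authors' procedure: the design matrices, the diagonal VCMs, the choice of unit weights as the alternative weight matrix, and all function names are hypothetical. It also assumes that the test-set observations are independent of the refined parameters, so that the VCM of the free residuals is the VCM of the test observations plus the VCM of the calculated values.

```python
import numpy as np


def estimator_matrix(A, U):
    """Return B such that the weighted least-squares estimate is x_hat = B @ y."""
    return np.linalg.solve(A.T @ U @ A, A.T @ U)


def parameter_vcm(A, U, V):
    """VCM of the parameters refined with weight matrix U when the data have VCM V."""
    B = estimator_matrix(A, U)
    return B @ V @ B.T


def free_residual_vcm(A_free, V_free, vcm_params):
    """VCM of the test-set residuals: observation VCM plus VCM of calculated values."""
    return V_free + A_free @ vcm_params @ A_free.T


rng = np.random.default_rng(0)
n, m, p = 40, 3, 10                      # working observations, parameters, test observations
A = rng.normal(size=(n, m))              # hypothetical working-set design matrix
A_free = rng.normal(size=(p, m))         # hypothetical test-set design matrix
V = np.diag(rng.uniform(0.5, 4.0, n))    # VCM of the working observations (assumed diagonal)
V_free = np.diag(rng.uniform(0.5, 4.0, p))

W = np.linalg.inv(V)                     # inverse-VCM weights
U = np.eye(n)                            # an arbitrary alternative: unit weights

vcm_W = parameter_vcm(A, W, V)
vcm_U = parameter_vcm(A, U, V)

# The excess VCM should be positive semidefinite (no negative eigenvalues).
excess = vcm_U - vcm_W
min_eig = np.linalg.eigvalsh((excess + excess.T) / 2).min()

# Traces of the free-residual VCMs, i.e. expected unweighted sums of squared residuals.
trace_W = np.trace(free_residual_vcm(A_free, V_free, vcm_W))
trace_U = np.trace(free_residual_vcm(A_free, V_free, vcm_U))

print("smallest eigenvalue of excess VCM:", min_eig)
print("trace with inverse-VCM weights:", trace_W)
print("trace with unit weights:", trace_U)
```

Replacing `U` with any other symmetric positive-definite matrix should leave the smallest eigenvalue non-negative to within rounding error, and the trace obtained with `W` should never exceed the trace obtained with `U`.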