The statistics of cross-validation residuals - Theory
In a restrained refinement, as is typical in macromolecular
crystallography, the ratios derived in the previous two sections would
apply only if the free residual were calculated from a random
selection of residuals that included both structure amplitude observations
and restraints. Because R-factors are traditionally based only on structure
amplitudes, estimating these ratios for restrained
refinement requires further analysis.
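As a rough illustration of why amplitude-only residuals need separate treatment, the following Python sketch uses a unit-weight linear least-squares model. That model is an assumption made here for illustration and is not the target function of any refinement program. The function name expected_residuals, the random design matrices, the restraint scale factor and the sizes f = 200, r = 60 and m = 30 (m being the number of parameters) are all hypothetical. The sketch relies on the standard linear result that a fitted observation with leverage h has expected squared residual 1 - h, while an independent free observation at the same design point has expected squared residual 1 + h.

```python
import numpy as np


def expected_residuals(a_amp, a_rest):
    """Expected unit-weight residual sums for a linear least-squares fit.

    a_amp  : (f, m) design rows standing in for structure amplitudes
    a_rest : (r, m) design rows standing in for restraints
    Both blocks are fitted together; every observation is assumed to have
    independent, unit-variance random error (the correctly weighted case).
    """
    a_all = np.vstack([a_amp, a_rest])
    n = a_all.shape[0]
    f = a_amp.shape[0]

    # Leverage of each observation: diagonal of A (A^T A)^-1 A^T.
    cov = np.linalg.inv(a_all.T @ a_all)
    leverage = np.einsum("ij,jk,ik->i", a_all, cov, a_all)

    lev_all = leverage.sum()      # equals m for a full-rank design
    lev_amp = leverage[:f].sum()  # share of m carried by the amplitude rows

    return {
        "included_all": n - lev_all,  # all observations, fitted
        "free_all": n + lev_all,      # all observations, free copy
        "included_amp": f - lev_amp,  # amplitude rows only, fitted
        "free_amp": f + lev_amp,      # amplitude rows only, free copy
        "leverage_amp": lev_amp,
    }


# Hypothetical sizes and random design rows, for illustration only.
rng = np.random.default_rng(0)
f, r, m = 200, 60, 30
a_amp = rng.normal(size=(f, m))
a_rest = 2.0 * rng.normal(size=(r, m))  # arbitrary scale for the restraint rows

res = expected_residuals(a_amp, a_rest)
print("free/included, all observations:", res["free_all"] / res["included_all"])
print("free/included, amplitudes only :", res["free_amp"] / res["included_amp"])
```

Summed over all n rows, the two expectations in this sketch are n - m and n + m. Restricted to the amplitude rows, m is replaced by the summed amplitude leverage, which is smaller than m whenever the restraint rows carry any leverage, so the amplitude-only ratio is not in general the ratio obtained from all observations.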
The total number of observations is now n, of which f are
structure amplitudes; the remaining r = n - f are geometrical,
thermal or other restraints, which make a contribution
to
the minimised residual at convergence. From equation (1)
we have
where the summation is taken over the restraint observations. Using this equation and equation (2) the summation over the structure amplitude observations can be written
This result and equations (4) and (7) give the following
approximation for
Similarly an approximation to
can be derived
from equations (4), (7) and (14).
Hence an estimate of
at the convergence of a
correctly weighted restrained refinement with only random uncorrelated
errors is
The estimated ratio of the free residual to the included residual is given by
and hence