The statistics of cross-validation residuals
Figure 1. Plot of the […] ratio as a function of the […] ratio for 725
macromolecular structures in the Protein Data Bank, where […] is the
number of atoms included in the refinement and f the number of reflections
used. The
data points are colour-coded according to their resolution range, as shown
in the key on the bottom right of the graph. The high-resolution data
points tend to lie close to the vertical axis, while the other points
spread further to the right as their resolution decreases, as might be
expected. Also shown are four dotted curves corresponding to different
values of the variable a: a = 1 corresponds to 3 parameters per atom
(i.e. restrained refinement of atomic coordinates only, plus an overall
temperature factor); a = 2 corresponds to 4 parameters per atom
(restrained refinement of coordinates plus individual isotropic
temperature factors); a = 4 corresponds to unrestrained refinement with
4 parameters per atom; and a = 7 corresponds to 9 parameters per atom
(restrained anisotropic refinement).
Figure 2. Plot of the same data as in Fig. 1 but with the vertical
axis representing z, which is a linear function of […], where […] and
[…]. Here the curves from Fig. 1, which corresponded to different values
of a, are now
straight lines. Again, the data points are colour-coded according to
resolution range, as shown in the key on the right. Also shown are five
coloured least-squares lines, one fitted to the data points in each of the
five resolution ranges. The lines are identified by the appropriate
point-markers in square brackets outside the graph border. The grey
regions mark where data points were excluded from the least-squares
calculations, namely points outside the sector bounded by the lines
corresponding to a = 10 and a = 0.5.
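
The per-range fitting described in the caption of Fig. 2 can be sketched
roughly as follows. This is only an illustration under stated assumptions,
not the procedure actually used for the figure: the function names are
hypothetical, the two bounding lines are assumed to pass through the origin
with x >= 0, and their slopes (slope_lo, slope_hi) are left as inputs
because the relation between a and the slope is not given here.

```python
def fit_line(xs, zs):
    """Ordinary least-squares fit of z = m*x + c; returns (m, c)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_z = sum(zs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs))
    m = sxz / sxx
    return m, mean_z - m * mean_x


def fit_lines_by_bin(points, slope_lo, slope_hi):
    """Fit one line per resolution range.

    points   : iterable of (x, z, label), label naming the resolution range
    slope_lo : assumed slope of the lower bounding line (hypothetical input)
    slope_hi : assumed slope of the upper bounding line (hypothetical input)
    """
    grouped = {}
    for x, z, label in points:
        # Keep only points inside the sector between the two bounding lines
        # (assumed to pass through the origin, with x >= 0).
        if slope_lo * x <= z <= slope_hi * x:
            grouped.setdefault(label, []).append((x, z))

    fits = {}
    for label, pts in grouped.items():
        xs = [p[0] for p in pts]
        zs = [p[1] for p in pts]
        if len(set(xs)) >= 2:  # a line needs two distinct x values
            fits[label] = fit_line(xs, zs)
    return fits
```

Calling fit_lines_by_bin with a list of (x, z, label) triples returns a
dictionary mapping each resolution-range label to a (slope, intercept)
pair; ranges left with fewer than two distinct x values after the
exclusion are simply omitted.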