hcv {sm}R Documentation

Cross-validatory choice of smoothing parameter


This function uses the technique of cross-validation to select a smoothing parameter suitable for constructing a density estimate or nonparametric regression curve in one or two dimensions.


hcv(x, y = NA, hstart = NA, hend = NA, ...) 



a vector, or two-column matrix of data. If y is missing these are observations to be used in the construction of a density estimate. If y is present, these are the covariate values for a nonparametric regression.


a vector of response values for nonparametric regression.


the smallest value of the grid points to be used in an initial grid search for the value of the smoothing parameter.


the largest value of the grid points to be used in an initial grid search for the value of the smoothing parameter.


other optional parameters are passed to the sm.options function, through a mechanism which limits their effect only to this call of the function. Those specifically relevant for this function are the following: h.weights, ngrid, display, add; see the documentation of sm.options for their description.


See Sections 2.4 and 4.5 of the reference below.

The two-dimensional case uses a smoothing parameter derived from a single value, scaled by the standard deviation of each component.

This function does not employ a sophisticated algorithm and some adjustment of the search parameters may be required for different sets of data. An initial estimate of the value of h which minimises the cross-validatory criterion is located from a grid search using values which are equally spaced on a log scale between hstart and hend. A quadratic approximation is then used to refine this initial estimate.


the value of the smoothing parameter which minimises the cross-validation criterion over the selected grid.

Side Effects

If the minimising value is located at the end of the grid of search positions, or if some values of the cross-validatory criterion cannot be evaluated, then a warning message is printed. In these circumstances altering the values of hstart and hend may improve performance.


As from version 2.1 of the package, a similar effect can be obtained with the new function h.select, via h.select(x, method="cv"). Users are encouraged to adopt this route, since hcv might be not accessible directly in future releases of the package. When the sample size is large hcv uses the raw data while h.select(x, method="cv") uses binning. The latter is likely to produce a more stable choice for h.


Bowman, A.W. and Azzalini, A. (1997). Applied Smoothing Techniques for Data Analysis: the Kernel Approach with S-Plus Illustrations. Oxford University Press, Oxford.

See Also

h.select, hsj, hnorm


#  Density estimation

x <- rnorm(50)
h.cv <- hcv(x, display="lines", ngrid=32)
sm.density(x, h=hcv(x))

#  Nonparametric regression

x <- seq(0, 1, length = 50)
y <- rnorm(50, sin(2 * pi * x), 0.2)
h.cv <- hcv(x, y, display="lines", ngrid=32)
sm.regression(x, y, h=hcv(x, y))

[Package sm version 2.2-5.6 Index]