sm.discontinuity {sm} R Documentation

The detection of discontinuities in a regression curve or surface.

Description

This function uses a comparison of left and right handed nonparametric regression curves to assess the evidence for the presence of one or more discontinuities in a regression curve or surface. A hypothesis test is carried out, under the assumption that the errors in the data are approximately normally distributed. A graphical indication of the locations where the evidence for a discontinuity is strongest is also available.

Usage

```sm.discontinuity(x, y, h, hd, ...)
```

Arguments

 `x` a vector or two-column matrix of covariate values. `y` a vector of responses observed at the covariate locations. `h` a smoothing parameter to be used in the construction of the nonparametric regression estimates. A normal kernel function is used and `h` is its standard deviation(s). However, if this argument is omitted `h` will be selected by an approximate degrees of freedom criterion, controlled by the `df` parameter. See `sm.options` for details. `hd` a smoothing parameter to be used in smoothing the differences of the left and right sided nonparametric regression estimates. A normal kernel function is used and `hd` is its standard deviation(s). However, if this argument is omitted `hd` will be set to `h * sqrt(0.25)`, and `h` reset to `h * sqrt(0.75)`, when `x` is a vector When `x` is a matrix, `hd` will be set to `h * sqrt(0.5)` and `h` will be reset to the same value. `...` other optional parameters are passed to the `sm.options` function, through a mechanism which limits their effect only to this call of the function; those relevant for this function are `add`, `eval.points`, `ngrid`, `se`, `band`, `xlab`, `ylab`, `xlim`, `ylim`, `lty`, `col`; see the documentation of `sm.options` for their description.

Details

The reference below describes the statistical methods used in the function. There are minor differences in somecomputational details of the implementation.

Currently duplicated rows of `x` cause a difficulty in the two covariate case. Duplicated rows should be removed.

Value

a list containing the following items

 `p` the p-value for the test of the null hypothesis that no discontinuities are present. `sigma` the estimated standard deviation of the errors. `eval.points` the evaluation points of the nonparametric regression estimates. When `x` is a matrix, `eval.points` is also a matrix whose columns define the evaluation grid of each margin of the evaluation rectangle. `st.diff` a vector or matrix of standardised differences between the left and right sided estimators at the evaluation points. `diffmat` when `x` is a vector, this contains the locations and standardised differences where the latter are greater than 2.5. `angle` when `x` is a matrix, this contains the estimated angles at which the standardised differences were constructed. `h` the principal smoothing parameter. `hd` the smoothing parameter used for double-smoothing (see the reference below).

Side Effects

a plot on the current graphical device is produced, unless the option `display="none"` is set.

References

Bowman, A.W., Pope, A. and Ismail, B. (2006). Detecting discontinuities in nonparametric regression curves and surfaces. Statistics \& Computing, 16, 377–390.

`sm.regression`, `sm.options`

Examples

```par(mfrow = c(3, 2))

with(nile, {
sm.discontinuity(Year, Volume, hd = 0)
sm.discontinuity(Year, Volume)

ind <- (Year > 1898)
plot(Year, Volume)
h <- h.select(Year, Volume)
sm.regression(Year[!ind], Volume[!ind], h, add = TRUE)
sm.regression(Year[ ind], Volume[ ind], h, add = TRUE)

hvec <- 1:15
p <- numeric(0)
for (h in hvec) {
result <- sm.discontinuity(Year, Volume, h,
display = "none", verbose = 0)
p <- c(p, result\$p)
}
plot(hvec, p, type = "l", ylim = c(0, max(p)), xlab = "h")
lines(range(hvec), c(0.05, 0.05), lty = 2)
})

with(trawl, {
Position  <- cbind(Longitude, Latitude)
ind <- (Longitude < 143.8)
# Remove a repeated point which causes difficulty with sm.discontinuity
ind[54] <- FALSE
sm.regression(Position[ind,], Score1[ind], theta = 35, phi = 30)
sm.discontinuity(Position[ind,], Score1[ind], col = "blue")
})
par(mfrow = c(1, 1))

#  The following example takes longer to run.
#  Alternative values for nside are 32 and 64.
#  Alternative values of yjump are 1 and 0.5.
# nside  <- 16
# yjump  <- 2
# x1     <- seq(0, 1, length = nside)
# x2     <- seq(0, 1, length = nside)
# x      <- expand.grid(x1, x2)
# x      <- cbind(x1 = x[, 1], x2 = x[, 2])
# y      <- rnorm(nside * nside)
# ind    <- (sqrt((x[, 1] - 0.5)^2 + (x[, 2] - 0.5)^2) <= 0.25)
# y[ind] <- y[ind] + yjump
# image(x1, x2, matrix(y, ncol = nside))
# sm.discontinuity(x, y, df = 20, add = TRUE)
```

[Package sm version 2.2-5.6 Index]