single.index {mgcv} R Documentation

## Single index models with mgcv

### Description

Single index models contain smooth terms with arguments that are linear combinations of other covariates. e.g. s(Xa) where a has to be estimated. For identifiability, assume ||a||=1 with positive first element. One simple way to fit such models is to use `gam` to profile out the smooth model coefficients and smoothing parameters, leaving only the a to be estimated by a general purpose optimizer.

Example code is provided below, which can be easily adapted to include multiple single index terms, parametric terms and further smooths. Note the initialization strategy. First estimate a without penalization to get starting values and then do the full fit. Otherwise it is easy to get trapped in a local optimum in which the smooth is linear. An alternative is to initialize using fixed penalization (via the `sp` argument to `gam`).

### Author(s)

Simon N. Wood simon.wood@r-project.org

### Examples

```require(mgcv)

si <- function(theta,y,x,z,opt=TRUE,k=10,fx=FALSE) {
## Fit single index model using gam call, given theta (defines alpha).
## Return ML if opt==TRUE and fitted gam with theta added otherwise.
## Suitable for calling from 'optim' to find optimal theta/alpha.
alpha <- c(1,theta) ## constrained alpha defined using free theta
kk <- sqrt(sum(alpha^2))
alpha <- alpha/kk  ## so now ||alpha||=1
a <- x%*%alpha     ## argument of smooth
b <- gam(y~s(a,fx=fx,k=k)+s(z),family=poisson,method="ML") ## fit model
if (opt) return(b\$gcv.ubre) else {
b\$alpha <- alpha  ## add alpha
J <- outer(alpha,-theta/kk^2) ## compute Jacobian
for (j in 1:length(theta)) J[j+1,j] <- J[j+1,j] + 1/kk
b\$J <- J ## dalpha_i/dtheta_j
return(b)
}
} ## si

## simulate some data from a single index model...

set.seed(1)
f2 <- function(x) 0.2 * x^11 * (10 * (1 - x))^6 + 10 *
(10 * x)^3 * (1 - x)^10
n <- 200;m <- 3
x <- matrix(runif(n*m),n,m) ## the covariates for the single index part
z <- runif(n) ## another covariate
alpha <- c(1,-1,.5); alpha <- alpha/sqrt(sum(alpha^2))
eta <- as.numeric(f2((x%*%alpha+.41)/1.4)+1+z^2*2)/4
mu <- exp(eta)
y <- rpois(n,mu) ## Poi response

## now fit to the simulated data...

th0 <- c(-.8,.4) ## close to truth for speed
## get initial theta, using no penalization...
f0 <- nlm(si,th0,y=y,x=x,z=z,fx=TRUE,k=5)
## now get theta/alpha with smoothing parameter selection...
f1 <- nlm(si,f0\$estimate,y=y,x=x,z=z,hessian=TRUE,k=10)
theta.est <-f1\$estimate

## Alternative using 'optim'...

th0 <- rep(0,m-1)
## get initial theta, using no penalization...
f0 <- optim(th0,si,y=y,x=x,z=z,fx=TRUE,k=5)
## now get theta/alpha with smoothing parameter selection...
f1 <- optim(f0\$par,si,y=y,x=x,z=z,hessian=TRUE,k=10)
theta.est <-f1\$par

## extract and examine fitted model...

b <- si(theta.est,y,x,z,opt=FALSE) ## extract best fit model
plot(b,pages=1)
b
b\$alpha
## get sd for alpha...
Vt <- b\$J%*%solve(f1\$hessian,t(b\$J))
diag(Vt)^.5

```

[Package mgcv version 1.8-28 Index]