binning {sm} R Documentation

## Construct frequency table from raw data

### Description

Given a vector or a matrix `x`, this function constructs a frequency table over a set of intervals covering the range of `x`.

### Usage

```
binning(x, y, breaks, nbins)
```

### Arguments

`x, y` a vector or a matrix with either one or two columns. If `x` is a one-dimensional matrix, this is equivalent to a vector.

`breaks` either a vector or a matrix with two columns (depending on the dimension of `x`), giving the division points of the axis, or of the two axes in the matrix case. It must not include `Inf`, `-Inf` or `NA`s, and it must span the whole range of the `x` points. If `breaks` is not given, it is computed by dividing the range of `x` into `nbins` intervals for each of the axes.

`nbins` the number of intervals on the `x` axis (in the vector case), or a vector of two elements with the number of intervals on each axis of `x` (in the matrix case). If `nbins` is not given, a value is computed as `round(log(length(x))/log(2)+1)`, or using a similar expression in the matrix case.
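As a quick check of the default rule, the base R sketch below evaluates the expression quoted above for a few sample sizes and then builds equally spaced division points over the range of a vector. The helper name `default_nbins` is hypothetical, and the equal spacing via `seq` is an assumption made for illustration; the package may compute its default `breaks` somewhat differently.

```r
# Hypothetical helper: the default number of bins from the formula quoted above.
default_nbins <- function(n) round(log(n) / log(2) + 1)

# Evaluate the rule for a few sample sizes.
sizes <- c(100, 1000, 100000)
nb <- sapply(sizes, default_nbins)
print(nb)  # 8 11 18

# Assumed illustration: equally spaced division points spanning the range of x.
set.seed(1)
x <- rnorm(1000)
nbins <- default_nbins(length(x))
breaks <- seq(min(x), max(x), length.out = nbins + 1)
length(breaks)  # nbins + 1 division points delimit nbins intervals
```

The number of bins grows only logarithmically with the sample size, so even very large samples are reduced to a short table.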

### Details

This function is called automatically (under the default settings) by some of the functions of the `sm` package when the sample size is large, to allow handling of datasets of essentially unlimited size. Specifically, it is used by `sm.density`, `sm.regression`, `sm.ancova`, `sm.binomial` and `sm.poisson`.
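To see why binning makes large samples manageable, the base R sketch below compares a Gaussian kernel density estimate computed from the raw data with one computed from bin midpoints weighted by their frequencies. It is only an illustration under stated assumptions: the helper `kde`, the bandwidth `h`, the grid and the 100 equally spaced bins are all hypothetical choices, and this is not the code used inside `sm.density`.

```r
set.seed(1)
x <- rnorm(20000)

# Bin the data with base R tools (cut and table), 100 equally spaced bins.
breaks <- seq(min(x), max(x), length.out = 101)
f <- cut(x, breaks = breaks, include.lowest = TRUE)
freq <- as.vector(table(f))
mid <- (breaks[-1] + breaks[-length(breaks)]) / 2
keep <- freq > 0

# Hypothetical helper: weighted Gaussian kernel density estimate on a grid.
kde <- function(grid, pts, w, h) {
  sapply(grid, function(g) sum(w * dnorm((g - pts) / h)) / h)
}

h <- 0.3                               # assumed bandwidth
grid <- seq(-3, 3, length.out = 25)    # assumed evaluation grid

# Estimate from all 20000 points, each with weight 1/n.
d_raw <- kde(grid, x, rep(1 / length(x), length(x)), h)
# Estimate from at most 100 midpoints, weighted by relative frequency.
d_bin <- kde(grid, mid[keep], freq[keep] / sum(freq), h)

max(abs(d_raw - d_bin))  # largest discrepancy between the two estimates
```

The binned version sums over at most 100 terms per grid point regardless of the sample size, which is the saving that binning provides.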

### Value

In the vector case, a list is returned containing the following elements: a vector `x` of the midpoints of the bins, excluding those with zero frequency; the associated frequencies `x.freq`; the coordinates of the `midpoints`; the complete vector of observed frequencies `freq.table` (including the zero frequencies); and the vector `breaks` of division points.

In the matrix case, the returned value is a list with the following elements: a two-column matrix `x` with the coordinates of the midpoints of the two-dimensional bins, excluding those with zero frequency; its associated matrix `x.freq` of frequencies; the coordinates of the `midpoints`; the matrix `breaks` of division points; and the observed frequencies `freq.table` in full tabular form.
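The structure of the returned list in the vector case can be mimicked with the base R functions `cut` and `table`. The function `binning_sketch` below is a hypothetical helper written for illustration, not the package's implementation; it only reproduces the element names listed above.

```r
# Hypothetical helper (not part of sm): mimics the 1-d return structure
# described above using cut() and table().
binning_sketch <- function(x, breaks) {
  # breaks must be finite and span the whole range of x
  stopifnot(all(is.finite(breaks)),
            min(breaks) <= min(x), max(breaks) >= max(x))
  f <- cut(x, breaks = breaks, include.lowest = TRUE)
  freq <- as.vector(table(f))                        # all bins, zeros included
  mid <- (breaks[-1] + breaks[-length(breaks)]) / 2  # midpoints of all bins
  keep <- freq > 0                                   # non-empty bins only
  list(x = mid[keep], x.freq = freq[keep], midpoints = mid,
       breaks = breaks, freq.table = freq)
}

x <- c(-1.2, -0.3, 0.1, 0.4, 0.45, 1.9)
res <- binning_sketch(x, breaks = seq(-3, 2, by = 1))
res$freq.table  # 0 1 1 3 1 (the first bin is empty)
res$x           # -1.5 -0.5 0.5 1.5 (midpoint of the empty bin is dropped)
res$x.freq      # 1 1 3 1
```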

### References

Bowman, A.W. and Azzalini, A. (1997). Applied Smoothing Techniques for Data Analysis: the Kernel Approach with S-Plus Illustrations. Oxford University Press, Oxford.

### See Also

`sm`, `sm.density`, `sm.regression`, `sm.binomial`, `sm.poisson`, `cut`, `table`

### Examples

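```
# load the package that provides binning()
library(sm)

# example of 1-d use
x  <- rnorm(1000)
xb <- binning(x)                                  # default number of bins
# user-supplied division points; they must span the whole range of x
xb <- binning(x, breaks = seq(-4, 4, by = 0.5))

# example of 2-d use
x  <- rnorm(1000)
y  <- 2*x + 0.5*rnorm(1000)
x  <- cbind(x, y)                                 # two-column matrix
xb <- binning(x, nbins = 12)
```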

[Package sm version 2.2-5.6 Index]