The hist function uses the Sturges method by default to determine the number of breaks of the histogram. This choice matters: too many bins increase the variability of the bar heights, while too few bins group the data too much and hide its shape.
breaks argument
The breaks argument controls the number of bars, cells or bins of the histogram. By default breaks = "Sturges".
Sturges method (default)
The default method is the recommended choice in most cases.
# Sample data
set.seed(2)
x <- rnorm(2000)
# Histogram
hist(x,
     main = "Sturges")
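To make the default less of a black box, the following sketch computes the Sturges number of classes by hand, ceiling(log2(n) + 1), and compares it with nclass.Sturges from grDevices. The variable names sturges_manual and sturges_builtin are only illustrative. Note that hist treats this number as a suggestion and rounds the breakpoints to "pretty" values, so the final bin count may differ.

```r
# Sample data (same as above)
set.seed(2)
x <- rnorm(2000)

# Sturges' rule: ceiling(log2(n) + 1)
n <- length(x)
sturges_manual <- ceiling(log2(n) + 1)

# Built-in helper used when breaks = "Sturges"
sturges_builtin <- nclass.Sturges(x)

sturges_manual   # 12 for n = 2000
sturges_builtin  # same value

# Bins actually drawn by hist() after rounding to pretty breakpoints
h <- hist(x, plot = FALSE)
length(h$counts)
```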
Too many bins
If you specify the number of breaks manually, make sure the number is not too high: with too many bins, the histogram mostly shows sampling noise.
# Sample data
set.seed(2)
x <- rnorm(2000)
# Histogram
hist(x, breaks = 80,
     main = "Too many bins")
Not enough bins
The number of bins can also be too small in some cases.
# Sample data
set.seed(2)
x <- rnorm(2000)
# Histogram
hist(x, breaks = 5,
     main = "Not enough bins")
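When breaks is a single number, hist uses it only as a suggestion, because the breakpoints are set to "pretty" values. A quick way to see what you actually get is to compare the requested and the resulting number of bins with plot = FALSE. This is a minimal sketch; the names requested and actual are illustrative.

```r
# Sample data
set.seed(2)
x <- rnorm(2000)

# Requested number of bins: too few, a moderate value, too many
requested <- c(5, 12, 80)

# Actual number of bins computed by hist() (plot = FALSE skips drawing)
actual <- sapply(requested, function(k) {
  length(hist(x, breaks = k, plot = FALSE)$counts)
})

# Side-by-side comparison
data.frame(requested, actual)
```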
Scott method
In addition to the Sturges method, the breaks argument also supports the Scott method.
# Sample data
set.seed(2)
x <- rnorm(2000)
# Histogram
hist(x, breaks = "Scott",
     main = "Scott")
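For reference, Scott's rule chooses a bin width of 3.5 * sd(x) * n^(-1/3) and then takes as many bins as are needed to cover the range of the data. The sketch below reproduces that calculation by hand and compares it with nclass.scott from grDevices; h_scott, bins_manual and bins_builtin are illustrative names.

```r
# Sample data
set.seed(2)
x <- rnorm(2000)

# Scott's rule: bin width based on the standard deviation
n <- length(x)
h_scott <- 3.5 * sd(x) * n^(-1/3)

# Number of bins needed to cover the range of x
bins_manual <- ceiling(diff(range(x)) / h_scott)

# Built-in helper used when breaks = "Scott"
bins_builtin <- nclass.scott(x)

c(manual = bins_manual, builtin = bins_builtin)
```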
Freedman-Diaconis (FD) method
The Freedman-Diaconis algorithm can be selected by passing "Freedman-Diaconis" or "FD" to the breaks argument.
# Sample data
set.seed(2)
x <- rnorm(2000)
# Histogram
hist(x, breaks = "Freedman-Diaconis",
     main = "Freedman-Diaconis")
hist(x, breaks = "FD", # Equivalent
     main = "Freedman-Diaconis")
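The Freedman-Diaconis rule replaces the standard deviation with the interquartile range, which makes it less sensitive to outliers: the bin width is 2 * IQR(x) * n^(-1/3). The following sketch computes it by hand and compares it with nclass.FD from grDevices. The built-in helper may apply extra safeguards (for example when the IQR is zero), so small differences from the manual value are possible. The names h_fd, bins_manual and bins_builtin are illustrative.

```r
# Sample data
set.seed(2)
x <- rnorm(2000)

# Freedman-Diaconis rule: bin width based on the interquartile range
n <- length(x)
h_fd <- 2 * IQR(x) * n^(-1/3)

# Number of bins needed to cover the range of x
bins_manual <- ceiling(diff(range(x)) / h_fd)

# Built-in helper used when breaks = "FD"
bins_builtin <- nclass.FD(x)

c(manual = bins_manual, builtin = bins_builtin)
```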
You can also pass a vector giving the breakpoints between bins, or a function that computes either the number of bins or the vector of breakpoints.
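As a minimal sketch of both options, the example below first passes an explicit vector of breakpoints and then a function. The 0.5-wide grid and the square-root rule (sqrt_rule) are arbitrary choices made for illustration, not recommendations. A vector of breakpoints must cover the whole range of the data and is used as is, whereas a number returned by a function is still treated as a suggestion.

```r
# Sample data
set.seed(2)
x <- rnorm(2000)

# 1. Vector of breakpoints: must span the whole range of x
breakpoints <- seq(floor(min(x)), ceiling(max(x)), by = 0.5)
h_vec <- hist(x, breaks = breakpoints,
              main = "Custom breakpoints")

# 2. Function returning a number of bins
#    (hypothetical square-root rule, for illustration only)
sqrt_rule <- function(v) ceiling(sqrt(length(v)))
h_fun <- hist(x, breaks = sqrt_rule,
              main = "Square-root rule")

# Breakpoints actually used in each case
h_vec$breaks
h_fun$breaks
```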
An alternative to the Sturges method, or to choosing the breaks argument by hand, is the plug-in method for calculating the optimal bin width (Wand, 1995). This method is implemented in the KernSmooth package, and you can use it as follows.
# Sample data
set.seed(2)
x <- rnorm(2000)
# install.packages("KernSmooth")
library(KernSmooth)
# Optimal bin width (direct plug-in estimate)
bin_width <- dpih(x)
# Breakpoints spaced bin_width apart, padded to cover the range of x
breakpoints <- seq(min(x) - bin_width,
                   max(x) + bin_width,
                   by = bin_width)
# Histogram
hist(x, breaks = breakpoints,
     main = "Plug-in method")