The Central Limit Theorem

Concept

Terminology

The Central Limit Theorem (CLT): The probability distribution of the sum or mean of a large random sample drawn with replacement will be roughly normal, regardless of the distribution of the population from which the sample is drawn.
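
One way to see this claim in action is a small simulation. The sketch below is only an illustration: the skewed `population`, the sample size of 400, the 2,000 repetitions, and the helper names `simulate_sample_means` and `fraction_within` are all assumptions chosen for the example, not part of the original material. It draws many samples with replacement and checks how closely the sample means follow the familiar normal-curve proportions (about 68% within 1 SD and about 95% within 2 SDs).

```python
import random
import statistics


def simulate_sample_means(population, sample_size, repetitions, seed=0):
    """Draw `repetitions` samples with replacement and return each sample's mean."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(repetitions)
    ]


def fraction_within(values, center, spread, k):
    """Fraction of `values` lying within k * spread of center."""
    return sum(abs(v - center) <= k * spread for v in values) / len(values)


# Hypothetical, strongly right-skewed population (clearly not normal).
population = [1] * 70 + [5] * 20 + [50] * 10

means = simulate_sample_means(population, sample_size=400, repetitions=2000)

center = statistics.fmean(means)   # center of the simulated sample means
spread = statistics.pstdev(means)  # SD of the simulated sample means

within_1_sd = fraction_within(means, center, spread, 1)
within_2_sd = fraction_within(means, center, spread, 2)

print(f"population mean:        {statistics.fmean(population):.2f}")
print(f"mean of sample means:   {center:.2f}")
print(f"fraction within 1 SD:   {within_1_sd:.3f}")
print(f"fraction within 2 SDs:  {within_2_sd:.3f}")
```

If the sample means are roughly normal, the two printed fractions should land near 0.68 and 0.95 even though the population itself is heavily skewed.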

Characteristics of the Distribution of the Sample Means

  • Shape: The CLT says that, for a large random sample, the distribution of the sample mean is roughly normal, no matter what the population looks like.
  • Center: This distribution is centered at the population mean.
    \begin{align*}
    \text{Mean of Distribution of Possible Sample Means} &= \text{Population Mean} \\
    &\approx \text{Sample Mean}
    \end{align*}
  • Spread:

    • The distribution's standard deviation will be described by the square root law:
    \begin{align*}
    \text{SD of Distribution of Possible Sample Means} &= \frac{\text{Population SD}}{\sqrt{\text{sample size}}} \\
    &\approx \frac{\text{Sample SD}}{\sqrt{\text{sample size}}}
    \end{align*}
    • A 95% CLT-based confidence interval for the population mean is given by
    \[
    \left[\text{sample mean} - 2\cdot \frac{\text{sample SD}}{\sqrt{\text{sample size}}},\ \text{sample mean} + 2\cdot \frac{\text{sample SD}}{\sqrt{\text{sample size}}} \right]
    \]
Note

We often use the sample mean and SD instead of the population mean and SD, since we have this information for a sample, but not the population.
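
As a rough sketch of how the interval formula could be computed from a single sample, the example below uses only the sample mean and sample SD, as the note describes. The function name `clt_confidence_interval`, the simulated `population`, and the sample size are hypothetical choices for illustration. It also assumes the "sample SD" is computed with the divide-by-n formula (`statistics.pstdev`); for large samples the divide-by-(n - 1) version gives nearly the same interval.

```python
import math
import random
import statistics


def clt_confidence_interval(sample):
    """Approximate 95% CLT-based interval: sample mean +/- 2 * sample SD / sqrt(n)."""
    n = len(sample)
    sample_mean = statistics.fmean(sample)
    # Assumption: SD uses the divide-by-n formula; swap in statistics.stdev if preferred.
    sample_sd = statistics.pstdev(sample)
    margin = 2 * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Hypothetical population that we pretend we cannot observe directly.
rng = random.Random(42)
population = [rng.expovariate(1 / 10) for _ in range(10_000)]

# One large random sample drawn with replacement.
sample = rng.choices(population, k=400)

low, high = clt_confidence_interval(sample)
print(f"sample mean: {statistics.fmean(sample):.2f}")
print(f"approximate 95% interval: [{low:.2f}, {high:.2f}]")
```

Because of the square root law, quadrupling the sample size should roughly halve the width of the interval, provided the sample SD stays about the same.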

The diagram below provides an overview of the Central Limit Theorem, although it references a different dataset. For additional helpful visual guides, please visit the Diagrams site.