Sampling Distribution of the Sample Mean
Scope Label
Core 9758. This branch develops the distribution of the sample mean $\bar{X}$, including exact normal cases, the central limit theorem, and standardising with the standard error.
Use it with the hub Sampling and Estimation and the normal-distribution topic Special Continuous Random Variables.
From One Sample Mean to a Sampling Distribution
One observed sample gives one value of $\bar{x}$.
Before the sample is taken, however, many possible random samples could occur. Each sample may give a different sample mean. Therefore the sample mean has its own distribution.
The notation is:
- $X$ is one randomly chosen value from the population;
- $\bar{X}$ is the random variable representing the mean of a random sample;
- $\bar{x}$ is one observed value of $\bar{X}$ after sampling.
This distinction is essential. A question about $X$ is about one item. A question about $\bar{X}$ is about an average of $n$ items.
Caption: Repeated random samples produce different values of $\bar{x}$; these possible values form the sampling distribution of $\bar{X}$.
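The idea that each sample gives its own sample mean can be seen in a small simulation. This is only a sketch: the population (a fair six-sided die), the sample size of 10 and the helper name `sample_mean` are assumptions chosen for illustration, not part of the syllabus material.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Return the mean of one random sample of size n.

    Hypothetical population: a fair die with values 1 to 6.
    """
    return statistics.fmean(random.randint(1, 6) for _ in range(n))

# Five separate random samples of size 10 give five observed values of x-bar.
observed_means = [sample_mean(10) for _ in range(5)]
print(observed_means)
```

Each printed number is one observed $\bar{x}$; the collection of all values that could arise in this way is what the sampling distribution of $\bar{X}$ describes.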
Mean and Variance of $\bar{X}$
The standard H2 formulas below assume a proper random sample whose observations can be treated as independent and identically distributed. This is automatic for a sample from an infinite population, and it is also the usual model for sampling from a finite population with replacement.
For finite sampling without replacement, the exact variance can need a finite-population adjustment. H2 questions using the formula below normally signal the standard random-sample model, so do not add an extra correction unless the question explicitly requires it.
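For a random sample of size $n$ from a population with mean $\mu$ and variance $\sigma^2$,
$$E(\bar{X}) = \mu$$
and
$$\operatorname{Var}(\bar{X}) = \frac{\sigma^2}{n}.$$
Hence
$$\text{standard deviation of } \bar{X} = \frac{\sigma}{\sqrt{n}}.$$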
This standard deviation is called the standard error of the sample mean.
Interpret the two results separately:
- $E(\bar{X}) = \mu$ says the sample mean is centred correctly;
- $\operatorname{Var}(\bar{X}) = \dfrac{\sigma^2}{n}$ says sample means become less variable as $n$ increases.
Caption: As sample size increases, the distribution of the sample mean stays centred at $\mu$ but becomes less spread out.
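Both results can be checked empirically. The sketch below assumes a uniform population on the interval from 0 to 12 (mean 6, variance 12); the population, the sample sizes 4 and 16, and the number of repetitions are illustrative choices only. A non-normal population is used on purpose, since the mean and variance formulas do not need normality.

```python
import random
import statistics

random.seed(1)

MU = 6.0          # mean of Uniform(0, 12)
SIGMA_SQ = 12.0   # variance of Uniform(0, 12) is 12^2 / 12 = 12
REPEATS = 20000   # number of simulated samples per sample size

def simulate_means(n):
    """Simulate REPEATS sample means, each from a random sample of size n."""
    return [
        statistics.fmean(random.uniform(0, 12) for _ in range(n))
        for _ in range(REPEATS)
    ]

results = {}
for n in (4, 16):
    means = simulate_means(n)
    results[n] = (
        statistics.fmean(means),      # should be close to MU
        statistics.pvariance(means),  # should be close to SIGMA_SQ / n
        SIGMA_SQ / n,                 # theoretical variance of the sample mean
    )
    print(n, results[n])
```

The simulated centre stays near 6 for both sample sizes, while the simulated variance drops from about 3 to about 0.75 when $n$ goes from 4 to 16.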
Distribution of $\bar{X}$: The Exact Normal Case
If the population itself is normal, then the sample mean is exactly normal.
If
$$X \sim N(\mu, \sigma^2),$$
then for a random sample of size $n$,
$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right).$$
This is exact for any sample size $n$.
For example, if
and , then
The variance of is , so the standard deviation of is .
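The exact normal case can also be tried numerically. In this minimal sketch the values $\mu = 100$, $\sigma = 15$ and $n = 9$ are illustrative assumptions, not figures from these notes. It uses `statistics.NormalDist` from the Python standard library, which takes the standard deviation (not the variance) as its second argument.

```python
from math import sqrt
from statistics import NormalDist

# Illustrative values only (assumed, not taken from the notes).
mu, sigma, n = 100.0, 15.0, 9

# X-bar ~ N(mu, sigma^2 / n), so its standard deviation is sigma / sqrt(n).
xbar_dist = NormalDist(mu, sigma / sqrt(n))

var_xbar = xbar_dist.variance   # sigma^2 / n = 225 / 9 = 25
sd_xbar = xbar_dist.stdev       # sigma / sqrt(n) = 15 / 3 = 5
p_above_105 = 1 - xbar_dist.cdf(105)   # P(X-bar > 105)

print(var_xbar, sd_xbar, round(p_above_105, 4))
```

With these assumed values, 105 is exactly one standard error above the mean, so the probability is the upper tail beyond $z = 1$.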
Distribution of $\bar{X}$: The Central Limit Theorem
If the population is not necessarily normal, the sample mean may still be approximately normal when the sample size is large.
If the population has mean $\mu$ and variance $\sigma^2$, then for large $n$,
$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \text{ approximately.}$$
This is the central limit theorem.
In H2 work, the usual guide is
The important distinction is:
- $X$ itself may be skewed, discrete, or non-normal;
- $\bar{X}$ may still be approximately normal when $n$ is large.
Caption: The central limit theorem connects many population shapes to an approximately normal distribution for the sample mean when $n$ is large.
Choosing the Distribution of $\bar{X}$
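A simulation makes the theorem concrete. The sketch below assumes a strongly skewed population, the exponential distribution with rate 1 (mean 1, standard deviation 1), and a sample size of 50; both are illustrative choices. It compares the simulated proportion of sample means at or below one standard error above $\mu$ with the value the normal approximation predicts.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

random.seed(2)

N = 50            # assumed sample size
REPEATS = 10000   # number of simulated samples
MU = 1.0          # mean of the Exp(1) population
SIGMA = 1.0       # standard deviation of the Exp(1) population

# Each entry is the mean of one random sample of size N from Exp(1).
means = [
    fmean(random.expovariate(1.0) for _ in range(N))
    for _ in range(REPEATS)
]

std_error = SIGMA / sqrt(N)
cutoff = MU + std_error   # one standard error above mu

simulated = sum(m <= cutoff for m in means) / REPEATS
clt_value = NormalDist(MU, std_error).cdf(cutoff)   # equals Phi(1)

print(round(simulated, 3), round(clt_value, 3))
```

The population is far from normal, yet the two printed proportions should agree closely, which is the behaviour the central limit theorem describes for $\bar{X}$.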
Use this decision process:
- Is the population normal?
- If yes, use the exact result $\bar{X} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$.
- If no, is $n$ large enough?
- If yes, use the approximation $\bar{X} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$ approximately.
- If no, the usual normal model for $\bar{X}$ is not justified from the given information.
Caption: To choose the distribution of $\bar{X}$, first ask whether the population is normal; if not, check whether the sample size is large enough for the central limit theorem.
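The decision process above can be sketched as a small function. The function name `xbar_model` and its return strings are hypothetical, and the large-sample cut-off is left as a parameter so that the guide from your own notes is supplied by the caller rather than fixed here.

```python
def xbar_model(population_is_normal, n, large_n_threshold):
    """Return which model for the sample mean is justified.

    large_n_threshold is the "large n" guide supplied by the caller;
    this sketch does not assume any particular value for it.
    """
    if n < 1:
        raise ValueError("sample size must be at least 1")
    if population_is_normal:
        return "exact normal"
    if n >= large_n_threshold:
        return "approximately normal by CLT"
    return "normal model not justified"

# Usage with a placeholder threshold (an assumption for this demo only).
THRESHOLD = 30
print(xbar_model(True, 5, THRESHOLD))
print(xbar_model(False, 80, THRESHOLD))
print(xbar_model(False, 8, THRESHOLD))
```

The order of the checks matters: a normal population gives the exact result whatever the sample size, so the sample-size check is reached only for non-normal or unknown populations.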
Standardising
Once the distribution of $\bar{X}$ is known or approximated, probability statements can be standardised.
For one observation,
$$Z = \frac{X - \mu}{\sigma}.$$
For a sample mean,
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}.$$
The denominator is the standard error, not the population standard deviation. This is the most common procedural error in this branch.
Caption: Standardising one observation uses $\sigma$; standardising a sample mean uses $\dfrac{\sigma}{\sqrt{n}}$.
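The difference between the two denominators is easy to see side by side. In this sketch the helper names `z_single` and `z_mean` are hypothetical, and the values $\mu = 20$, $\sigma = 4$, $n = 16$ are assumed for illustration with a normal population.

```python
from math import sqrt
from statistics import NormalDist

def z_single(x, mu, sigma):
    """Standardise one observation: divide by the population sd."""
    return (x - mu) / sigma

def z_mean(xbar, mu, sigma, n):
    """Standardise a sample mean: divide by the standard error sigma / sqrt(n)."""
    return (xbar - mu) / (sigma / sqrt(n))

# Illustrative values only (assumed, not taken from the notes).
mu, sigma, n = 20.0, 4.0, 16
standard_normal = NormalDist()   # N(0, 1)

z_one = z_single(21, mu, sigma)    # (21 - 20) / 4 = 0.25
z_avg = z_mean(21, mu, sigma, n)   # (21 - 20) / (4 / 4) = 1.0

p_one = 1 - standard_normal.cdf(z_one)   # P(X > 21)
p_avg = 1 - standard_normal.cdf(z_avg)   # P(X-bar > 21)

print(z_one, z_avg, round(p_one, 4), round(p_avg, 4))
```

The same cut-off of 21 is only a quarter of a standard deviation from the mean for a single observation, but a full standard error away for the mean of 16 observations, so the tail probability for $\bar{X}$ is much smaller.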
Worked Example 1: Exact Normal Distribution of $\bar{X}$
The diameter of a metal rod is normally distributed with mean cm and standard deviation cm. A random sample of rods is selected.
Find
Let $X$ be the diameter of one rod. Then
Since the population is normal,
The standard deviation of $\bar{X}$ is
Therefore
Hence
Worked Example 2: Central Limit Theorem
Suppose
A random sample of size is taken. Find approximately
First find the population mean and variance of $X$:
and
Since $n$ is large, by the central limit theorem,
The standard deviation of $\bar{X}$ is
Therefore
Since
we get
No continuity correction is needed here because the central limit theorem is being applied to the sample mean $\bar{X}$, not directly to a discrete count.
Worked Example 3: Symmetric Interval for $\bar{X}$
Suppose has mean and standard deviation . A random sample of size is taken, and the central limit theorem is applicable.
Find
The sample mean has approximate distribution
The standard error is
Therefore
So
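A symmetric interval about the mean reduces to a single standard normal calculation, since $P(|\bar{X} - \mu| < d) = 2\Phi\left(\dfrac{d}{\sigma/\sqrt{n}}\right) - 1$ when $\bar{X}$ is normal or approximately normal. The sketch below uses the hypothetical helper `prob_within` and assumed values $\sigma = 12$, $n = 64$, $d = 3$; these are not the figures of the worked example.

```python
from math import sqrt
from statistics import NormalDist

def prob_within(d, sigma, n):
    """Return P(|X-bar - mu| < d) under a normal model for X-bar.

    The answer does not depend on mu, only on d and the standard error.
    """
    std_error = sigma / sqrt(n)
    z = d / std_error
    return 2 * NormalDist().cdf(z) - 1

# Illustrative values only: standard error = 12 / 8 = 1.5, so z = 3 / 1.5 = 2.
p = prob_within(3, 12, 64)
print(round(p, 4))
```

Because the interval is symmetric, the two tails are equal, which is why one value of $\Phi$ is enough.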
Link to Hypothesis Testing
Hypothesis testing uses this branch directly. A test for a population mean assumes a value of $\mu$ under $H_0$, uses the distribution of $\bar{X}$ under that assumption, and checks whether the observed $\bar{x}$ is unusually extreme.
So the sampling distribution is not just another probability calculation. It is the foundation of inference about a population mean.
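As a preview of that link, the same standardisation gives a test statistic. This is a minimal sketch under stated assumptions: the population standard deviation is taken as known, the alternative is two-tailed, and the values $\mu_0 = 50$, $\sigma = 8$, $n = 64$, $\bar{x} = 52.5$ are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Illustrative values only (assumed, not taken from the notes).
mu_0 = 50.0    # value of mu assumed under H0
sigma = 8.0    # population standard deviation, assumed known
n = 64         # sample size
x_bar = 52.5   # observed sample mean

std_error = sigma / sqrt(n)          # 8 / 8 = 1
z_obs = (x_bar - mu_0) / std_error   # 2.5 standard errors above mu_0

# Two-tailed p-value: probability of a sample mean at least this extreme under H0.
p_value = 2 * (1 - NormalDist().cdf(abs(z_obs)))
print(z_obs, round(p_value, 4))
```

A small p-value means the observed $\bar{x}$ would be unusual if $H_0$ were true; the formal procedure belongs to the hypothesis testing topic.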
Common Pitfalls
- Treating $X$ and $\bar{X}$ as the same object.
- Using $\sigma$ instead of $\dfrac{\sigma}{\sqrt{n}}$ when standardising $\bar{X}$.
- Forgetting that the exact normal result requires the population to be normal.
- Applying the central limit theorem when $n$ is small and no normal population is given.
- Thinking the central limit theorem makes $X$ normal. It applies to $\bar{X}$.
- Adding a continuity correction when the question is about the sample mean rather than a discrete count.
Revision Checklist
- Can you explain why $\bar{X}$ is random before sampling?
- Can you state and interpret $E(\bar{X}) = \mu$?
- Can you state and interpret $\operatorname{Var}(\bar{X}) = \dfrac{\sigma^2}{n}$?
- Can you identify the standard error $\dfrac{\sigma}{\sqrt{n}}$?
- Can you decide whether the distribution of $\bar{X}$ is exact normal or approximate normal?
- Can you standardise probability statements involving $\bar{X}$ correctly?
- Can you explain how this branch prepares for hypothesis testing?