This chapter continues our study of estimating population parameters from random samples. Previously, we studied estimators that assign a number to each possible random sample, and the uncertainty of such estimators, measured by their RMSE. (The RMSE is the square root of the expected value of the squared difference between the estimator and the parameter—a measure of the typical size of the error.) Instead of assigning a single number to each sample and reporting the size of a typical error, the methods in this chapter assign an interval to each sample and report the confidence level that the interval contains the parameter. Confidence is a technical term related to probability. Just as the RMSE of an estimator measures the long-run average size of the error in repeated sampling, but the error for any particular sample could be smaller or larger than the RMSE, the confidence level is the long-run fraction of intervals that contain the parameter in repeated sampling, but the interval for any particular sample might or might not contain the parameter.
The statement "the interval [92%, 94%] contains the population percentage at confidence level 90%" does not mean that the probability that the population percentage is between 92% and 94% is 90%. (The event that the interval [92%, 94%] contains the population percentage is not random: Either the population percentage is between 92% and 94%, or it is not.) Rather, the statement means that if we were to take samples of size n repeatedly and compute a confidence interval at 90% confidence level for the population percentage from each sample, the long-run fraction of intervals that contain the population percentage would converge to 90%.
The length of the confidence interval and the confidence level measure how accurately we are able to estimate the parameter from a sample. If a short interval has high confidence, the data allow us to estimate the parameter accurately. Higher confidence generally requires a longer interval, ceteris paribus, and shorter intervals generally have lower confidence levels. Conventional values for the confidence level of confidence intervals include 68%, 90%, 95%, and 99%, but sometimes other values are used. It is crucial to know the confidence level associated with a confidence interval: The interval by itself is meaningless.
In this section, we develop conservative confidence intervals for the population percentage based on the sample percentage, using Chebychev's Inequality and an upper bound on the SD of lists that contain only the numbers 0 and 1. Conservative means that the chance that the procedure produces an interval that contains the population percentage is at least as large as claimed. (Later in this chapter we will consider approximate confidence intervals.)
Consider a 0-1 box of N tickets. The population percentage p is the fraction of tickets labeled "1":
p = 100% × (# tickets in the population labeled "1")/N.
The population percentage is also the population mean of the numbers on all the tickets in the box, ave(box). The sample percentage φ of a simple random sample (random sample without replacement) of size n from the population of N tickets is
φ = 100% × (# tickets in the sample labeled "1")/n.
The sample percentage is the sample mean of the labels on the tickets in the sample. The expected value of the sample percentage φ is the population percentage p, and the SE of the sample percentage φ is
SE(φ) = f × ( p×(1−p) )^{½}/n^{½}
≤ f ×50%/n^{½},
where f is the finite population correction
f = (N −n)^{½}/(N − 1)^{½}.
Thus f ×50%/n^{½} is an upper bound on the SE of the sample percentage.
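As a quick numerical check, the true SE and the bound can be computed directly. The following is a sketch in Python, not part of the text; the population percentage, sample size, and population size are arbitrary choices:

```python
import math

def se_sample_percentage(p, n, N):
    """SE of the sample percentage for a simple random sample of size n
    from a 0-1 box of N tickets with population percentage p, together
    with the upper bound f * 50% / sqrt(n)."""
    f = math.sqrt((N - n) / (N - 1))          # finite population correction
    se = f * 100 * math.sqrt((p / 100) * (1 - p / 100)) / math.sqrt(n)
    bound = f * 50.0 / math.sqrt(n)
    return se, bound

# hypothetical numbers: p = 20%, n = 100, N = 50,000
se, bound = se_sample_percentage(p=20.0, n=100, N=50_000)
```

For p = 20% the true SE (about 4.0 percentage points) is indeed below the bound (about 5.0 percentage points); the two agree only when p = 50%.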
The figure shows what happens if we center an interval at the sample percentage, and extend the interval down and up from the sample percentage by twice the upper bound on the SE of the sample percentage. When the interval includes the population percentage, we say the interval covers the truth. The interval is random, because it is centered at the sample percentage, which is random. The chance that the random interval will contain the true population percentage is called the coverage probability of the interval. Take a few samples by clicking Take Sample to get the feel of the tool; then increase Samples to Take to 1000 and click Take Sample again. The actual percentage of intervals that cover will vary, but almost always it will be larger than 75%, sometimes nearly 100%. The empirical percentage of intervals that cover is an estimate of the coverage probability of the procedure. Vary the sample size and put a few different lists of zeros and ones into the Population box at the right of the figure, and try a few different sample sizes for each population. You should find that the fraction of intervals that cover the true population percentage stays above 75% (almost without fail), no matter what the population of zeros and ones is.
Why do these random intervals cover the true population percentage so often? We can show that they should, using Chebychev's inequality. Because
SE(φ) ≤ f × 50%/n^{½},
the event
| φ − p | ≤ k × SE(φ)
is a subset of the event
| φ − p | ≤ k × f × 50%/n^{½}.
It follows that
P( | φ − p | ≤ k × SE(φ) ) ≤ P( | φ − p | ≤ k × f × 50%/n^{½} ).
Chebychev's inequality guarantees that the chance the sample percentage φ differs from its expected value p by more than k times its standard error is at most 1/k^{2}, so
1 − 1/k^{2} ≤ P ( |φ − p| ≤ k×SE(φ) )
≤ P( |φ − p| ≤ k × f × 50%/n^{½} ).
That is,
P( |φ − p| ≤ k × f × 50%/n^{½} ) ≥ 1 − 1/k^{2}.
Therefore, in the long run in repeated sampling, the fraction of trials in which the sample percentage φ is within ±2×f×50%/n^{½} of the population percentage p converges to a number that is 75% or larger. Whenever φ is within ±2×f×50%/n^{½} of the population percentage p, an interval centered at φ extending down and up by ±2×f×50%/n^{½} will contain p. That is, the interval
φ ± 2× f × 50%/n^{½},
which is shorthand for
[ φ − 2 × f × 50%/n^{½}, φ + 2 × f × 50%/n^{½} ],
contains p at least 75% of the time, in the long run. Similarly, the fraction of trials in which φ is within ±3×f×50%/n^{½} of p converges to a number that is 88.89% or larger, so the long-run fraction of intervals φ±3×f×50%/n^{½} that contain p will be 88.89% or larger. The fraction of trials in which φ is within ±4×f×50%/n^{½} of p converges to a number that is 93.75% or larger, so the long-run fraction of intervals φ±4×f×50%/n^{½} that contain p will be 93.75% or larger, etc.
In general, if we go down and up from the sample percentage by k×f×50%/n^{½}, then in the long run in repeated trials, the resulting intervals will include the true population percentage at least 1 − 1/k^{2} of the time.
Change the Intervals: ± value in the figure to 3 and to 4 to confirm empirically that this is true.
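For readers without the interactive figure, the same experiment can be sketched in Python. The population, sample size, and number of trials below are arbitrary choices, not values from the text:

```python
import random

def coverage_fraction(population, n, k, trials=10_000, seed=0):
    """Fraction of simple random samples of size n whose interval
    phi +/- k * f * 50% / sqrt(n) covers the population percentage."""
    rng = random.Random(seed)
    N = len(population)
    p = 100.0 * sum(population) / N                # population percentage
    f = ((N - n) / (N - 1)) ** 0.5                 # finite population correction
    half_width = k * f * 50.0 / n ** 0.5           # k times the bound on SE(phi)
    covered = sum(
        abs(100.0 * sum(rng.sample(population, n)) / n - p) <= half_width
        for _ in range(trials)
    )
    return covered / trials

pop = [1] * 300 + [0] * 700                        # hypothetical 0-1 box, N = 1000
frac = coverage_fraction(pop, n=25, k=2)           # almost always above 0.75
```

For k = 2, Chebychev's inequality guarantees coverage of at least 75%; in simulations the empirical fraction is typically much higher, because the bound is conservative.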
The interval φ±k×f×(50%/n^{½}) is random: Its center depends on φ, which in turn depends on which units (here, tickets) happen to be in the random sample. The probability is in the random sampling procedure, not in the parameter. The parameter is the same, no matter what sample we happen to get—the parameter is a property of the population, not the sample. It is the interval that varies with the random sample. Before the data are collected, the coverage probability is the chance that sampling will result in an interval that contains the parameter.
Taking the sample determines the interval, leaving nothing to chance: The interval the procedure produced either does or does not contain the population percentage. (One could say that after collecting the data, the chance that the interval covers the parameter is either 0 or 100%.) Typically, we never learn whether the interval covers the parameter, but our ignorance is not a probability (at least, not according to the frequency theory of probability used in this book).
The interval the procedure gives for any particular set of data is called a confidence interval. The confidence level of a confidence interval is equal to the coverage probability of the procedure before the data are collected.
Confidence is a word statisticians reserve for this idea. If, before collecting the data, the procedure we are using has a P% chance of producing an interval that covers the true population percentage, then, after collecting the data, the interval the procedure produced is called a P% confidence interval.
Coverage Probability and Confidence Level
Consider a population parameter, and a procedure that produces random intervals. Suppose that the probability that the procedure produces an interval that contains the parameter is P%.
In repeated sampling, about P% of confidence intervals with confidence level P% will contain (cover) the parameter. About (100−P)% of the intervals will not cover the parameter. For any particular sample, unless the population parameter is known, we will not know whether the confidence interval covers the parameter.
Earlier, we summarized the uncertainty of an estimate of a parameter by the mean squared error or root mean squared error of the estimator, which are measures of the average error of the estimator in repeated sampling. A confidence interval is a different way of expressing the uncertainty in an estimate: a range of values that contains the parameter with specified confidence level.
The interpretation of confidence level for a particular interval is analogous to the interpretation of RMSE for a particular value of the estimate: The RMSE is the square-root of the long-run average squared error of the estimator in repeated sampling, but for any particular sample, the error could be larger or smaller than the RMSE—and we will not know which unless we know the true value of the parameter. The confidence level measures the long-run fraction of intervals that contain the parameter in repeated sampling, but for any particular sample, the confidence interval either will or will not contain the parameter—and we will not know which unless we know the true value of the parameter.
We can use the approach developed in this section to construct confidence intervals for the population percentage p with other nominal confidence levels, by extending the interval up and down from the sample percentage φ by larger or smaller amounts. The longer the intervals, the larger the nominal confidence level—the larger the chance that an interval will contain p. The shorter the intervals, the smaller the chance that an interval will contain p. In particular, if we choose k so that
1 − 1/k^{2} = P%,
then the interval
[ φ − k × f× 50%/n^{½}, φ + k × f× 50%/n^{½} ]
is a (nominal) P% confidence interval for the population percentage p.
Conversely, to get a nominal P% conservative confidence interval for the population percentage using a simple random sample, we should take an interval that extends down and up from the sample percentage by k × f × 50%/n^{½}, with
k = (1 − P/100)^{-½}.
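Putting the pieces together, a conservative confidence interval can be computed as follows. This is a sketch in Python; the function name and the example numbers are made up for illustration:

```python
import math

def conservative_ci(phi, n, N, confidence):
    """Conservative confidence interval for the population percentage,
    from Chebychev's inequality and the 50% bound on the SD of a 0-1 box.

    phi        -- sample percentage, e.g. 60.0 for 60%
    n, N       -- sample size and population size (simple random sample)
    confidence -- nominal confidence level as a percentage, e.g. 75
    """
    k = (1 - confidence / 100) ** -0.5             # k = (1 - P/100)^(-1/2)
    f = math.sqrt((N - n) / (N - 1))               # finite population correction
    half_width = k * f * 50.0 / math.sqrt(n)
    # A percentage cannot be below 0% or above 100%, so truncating the
    # endpoints does not change the confidence level.
    return max(0.0, phi - half_width), min(100.0, phi + half_width)

lo, hi = conservative_ci(phi=60.0, n=100, N=10_000, confidence=75)
```

With confidence = 75 the formula gives k = 2, reproducing the Chebychev interval above; confidence = 99 gives k = 10.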
The actual coverage probability of the interval
[ φ − k × f × 50%/n^{½}, φ + k × f × 50%/n^{½} ]
is greater than (1 − 1/k^{2}), for two reasons. First, the standard error of the sample percentage φ is less than f×(50%/n^{½}) unless the population percentage p is 50%. Second, the distribution of the sample percentage is that of a hypergeometric random variable divided by the sample size, n, and such a distribution cannot attain the bound in Chebychev's inequality: Even for the true SE of the sample percentage,
SE(φ) = f × ( p × (1−p) )^{½}/n^{½},
the chance that the sample percentage is within kSE(φ) of the population percentage p is greater than 1−1/k^{2}:
P( | φ − p | < k × SE(φ) ) > 1 − 1/k^{2}.
As a result, confidence intervals for the population percentage based on Chebychev's inequality and the upper bound of 50% for the SD of a list of zeros and ones are conservative: the actual confidence level is greater than the nominal confidence level, (1 − 1/k^{2}). The next section develops a procedure that is not conservative, but that is approximate: The confidence level could be larger or smaller than the nominal level. (The nominal confidence level is close to the actual confidence level when the sample size n is large.)
A population percentage cannot be less than 0%. If the lower endpoint of a confidence interval for a population percentage is negative, it is completely legitimate to replace the lower endpoint by zero: It does not decrease the confidence level. Similarly, a population percentage cannot be greater than 100%. If the upper endpoint of a confidence interval for a population percentage is greater than 100%, it is legitimate to replace the upper endpoint by 100%. The confidence level remains the same. Similarly, if we are constructing a confidence interval for a quantity that cannot be negative (height, weight, or age, for instance), removing negative values from a confidence interval cannot reduce the coverage probability or confidence level.
Confidence Intervals for Restricted Parameters
If some values of a parameter are known to be impossible, excluding those values from a confidence interval does not reduce the confidence level of the confidence interval.
Conversely, including impossible values of a parameter in a confidence interval does not increase the confidence level.
For example, if a confidence interval for a parameter that must be positive has a lower endpoint that is negative, the lower endpoint can be replaced with zero. The confidence level remains the same.
In particular, if the lower endpoint of a confidence interval for a population percentage is negative, the lower endpoint can be replaced with zero. If the upper endpoint of a confidence interval for a population percentage is greater than 100%, the upper endpoint can be replaced with 100%.
Whenever you use a confidence interval, it is crucial to report the confidence level. Otherwise, it is impossible to interpret the result. The choice of the confidence level is essentially arbitrary, but the choice should be made before collecting the data. Common values of the confidence level are 68%, 90%, 95%, and 99%. There is a tradeoff between precision (the length of the confidence interval) and confidence level: Ceteris paribus, higher confidence levels require longer confidence intervals.
The following exercise checks your ability to compute a conservative confidence interval for the population percentage.
Recall that percentages are just means of special lists of numbers, lists that contain only zeros and ones. We can find confidence intervals for the means of more general lists of numbers, too.
In the previous section we exploited the fact that the SD of a 0-1 box is at most 1/2 to construct conservative confidence intervals for the population mean of a 0-1 box—that is, the population percentage. The approach can be used not only for 0-1 boxes, but whenever we can find a bound on the SD of the box, so that we can apply Chebychev's inequality. For any box of numbered tickets whatsoever, the sample mean of a simple random sample or random sample with replacement is an unbiased estimator of the population mean of the numbers on the tickets, and the SE of the sample mean is proportional to the SD of the box.
For instance, suppose we know that the numbers on the tickets in the box are all between a and b, with a ≤ b. Then SD(box) is at most (b − a)/2. In the special case that a = 0 and b = 1, this implies that the SD of a 0-1 box is at most 50%, as we have seen already.
That in turn implies that if all the numbers in a box are between a and b, the SE of the sample mean of a simple random sample of n draws from the box is at most f×(b − a)/(2n^{½}), where f is the finite population correction. And the SE of the sample mean of n draws with replacement from the box is at most (b − a)/(2n^{½}).
Sampling from a Bounded Box
Suppose all the numbers in a box are between a and b, with a ≤ b. Then SD(box) ≤ (b − a)/2, so the SE of the sample mean of a simple random sample of size n is at most f×(b − a)/(2n^{½}), and the SE of the sample mean of n draws with replacement is at most (b − a)/(2n^{½}).
With a bound on the SE, we can use Chebychev's inequality the same way we did for the population percentage to get a confidence interval for the population mean of the numbers on the tickets in a box:
Conservative Confidence Intervals for the Population Mean of a Bounded List
Suppose all the numbers in a box are between a and b, where a ≤ b.
For a simple random sample of size n, the chance that the random interval
[(sample mean) − k×f×(b − a)/(2n^{½}), (sample mean) + k×f×(b − a)/(2n^{½})]
includes the mean of the numbers in the box is at least 1−1/k^{2}, where f is the finite population correction (N−n)^{½}/(N−1)^{½}, N is the population size, and n is the sample size.
For random sampling with replacement, the chance that the random interval
[(sample mean)−k×(b − a)/(2n^{½}), (sample mean)+k×(b − a)/(2n^{½})]
includes the mean of the numbers in the box is at least 1−1/k^{2}.
In both cases, if the lower endpoint of the interval is less than a, it can be replaced by a, and if the upper endpoint of the interval is greater than b, it can be replaced by b.
These are conservative procedures for constructing confidence intervals: the probability that the intervals they produce cover the true population mean is greater than the probability they claim, 1−1/k^{2} (the nominal coverage probability).
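The boxed procedure can be sketched in Python. The ratings data below are hypothetical; passing N treats the sample as a simple random sample from a population of that size, while omitting it treats the draws as with replacement:

```python
import math

def conservative_mean_ci(sample, a, b, k, N=None):
    """Conservative confidence interval for the population mean of a box
    whose tickets all lie between a and b; coverage at least 1 - 1/k**2.
    If N is given, the sample is treated as a simple random sample from a
    population of size N; otherwise as draws with replacement (f = 1)."""
    n = len(sample)
    mean = sum(sample) / n
    f = math.sqrt((N - n) / (N - 1)) if N is not None else 1.0
    half_width = k * f * (b - a) / (2 * math.sqrt(n))
    # Values outside [a, b] are impossible, so truncating the endpoints
    # does not reduce the confidence level.
    return max(a, mean - half_width), min(b, mean + half_width)

# hypothetical data: 25 ratings known to lie between 0 and 10
ratings = [7, 8, 6, 9, 7, 8, 5, 7, 6, 8, 7, 9, 6, 7, 8,
           7, 6, 8, 7, 9, 5, 7, 8, 6, 7]
lo, hi = conservative_mean_ci(ratings, a=0, b=10, k=2)   # coverage >= 75%
```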
Confidence intervals for the population percentage based on Chebychev's inequality and the upper bound of 50% for the SD of lists of zeros and ones are conservative: Their true confidence level is greater than their nominal confidence level, (1 − 1/k^{2}). We could use shorter intervals and still have confidence level (1 − 1/k^{2}), or we could claim a confidence level higher than (1 − 1/k^{2}).
How much shorter could the interval be, or how large a confidence level could we claim? It is possible to figure these things out precisely, but we shall follow a standard approximate approach instead, one that we can extend to other situations. We shall use the central limit theorem to develop a procedure that produces shorter confidence intervals for a given nominal confidence level. The new procedure will be approximate instead of conservative: the coverage probability will be close to the nominal coverage probability when the sample size is large, but could be smaller or larger depending on the population percentage, and could be quite different from the nominal coverage probability for small samples from pathological populations.
We shall assume throughout the rest of this chapter that either the sample size is small compared to the size of the population, or the sample is drawn with replacement.
With this assumption, we can neglect the finite population correction and act as if the tickets in the sample were drawn independently. When the tickets are drawn independently, the central limit theorem tells us that as the sample size grows, the normal curve is a better and better approximation to the probability histogram of the sample percentage (and to the probability histogram of the sample mean). The normal approximation to the probability that the sample percentage is in the interval
[p − 1.15×(p×(1−p))^{½}/n^{½}, p + 1.15×(p×(1−p))^{½}/n^{½}]
is equal to the area under the normal curve for the corresponding range of values in standard units, [−1.15, 1.15]. The area under the normal curve between −1.15 and 1.15 is about 75%:
This is much larger than the bound of (1 − 1/(1.15)^{2}) = 24.4% that Chebychev's inequality gives. When the sample percentage φ is within
±1.15× ( p×(1−p) )^{½}/n^{½}
of p, p is within
±1.15× ( p×(1−p) )^{½}/n^{½}
of the sample percentage φ, so the probability that the interval
I = [ φ − 1.15 × ( p×(1−p) )^{½}/n^{½}, φ + 1.15 × ( p×(1−p) )^{½}/n^{½} ]
contains the population percentage p is about 75%: The coverage probability of I is approximately 75%.
Unfortunately, we cannot construct I from the sample alone: the sample determines the center of I, but to find the length of I we need to know p×(1−p), which is tantamount to knowing p. If we knew p, we would not be estimating it.
If the sample size n is large, the sample standard deviation s
s = ( (n/(n−1)) × φ × (1 − φ) )^{½},
is likely to be close to the SD of the population; when that happens,
s/n^{½}
is close to SE(φ), the standard error of the sample percentage. Therefore, if the sample size is large, but either the sample is small compared to the population or the sample is taken with replacement, the probability that the random interval
[ φ − 1.15 × s/n^{½}, φ + 1.15 × s/n^{½} ]
contains the population percentage p is about 75%. This interval has not only a random center (the sample percentage φ), but also a random length (the length depends on the observed value of s, and s is random, because it depends on the random sample).
The figure lets you try the procedure yourself. Each time you click the Take Sample button, a sample is drawn with replacement from the numbers in the box on the right (initially set to a random list of zeros and ones). The sample size initially is set to 30. The controls at the bottom of the figure allow you to change the size of each sample, the number of samples that are taken each time you click the button, and the width of the interval, as a multiple of the estimated SE or the conservative bound on the SE. (The estimated SE is s/n^{½} because we are sampling with replacement; the bound is 0.5/n^{½}.) A label in the bottom right corner reports the fraction of intervals that cover the population percentage. Intervals that cover are green; those that do not cover are red. A small black dot marks the middle of each interval (the sample percentage). A blue vertical line marks the true population percentage p.
Take a few samples to get the feel of the tool; then increase the Samples to take to 1000, and click the Take Sample button again. The actual percentage of intervals that cover will vary, but should be reasonably close to 75%. Increase Sample size to 200 and try again; the percentage of intervals that cover should be closer to 75%. Try putting a few different lists of zeros and ones into the Population box at the right of the figure, and try a few different sample sizes for each population. When the sample size is large, the fraction of intervals that cover the true population percentage will be very close to 75%.
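A non-interactive sketch of the same experiment in Python (the population percentage, sample size, and number of trials are arbitrary choices, not values from the text):

```python
import math
import random

def approx_ci(sample, z):
    """Approximate confidence interval for the population percentage from
    a 0-1 sample drawn with replacement: phi +/- z * s / sqrt(n), with the
    SD of the box estimated by the sample SD s."""
    n = len(sample)
    phi = 100.0 * sum(sample) / n
    frac = phi / 100.0
    s = 100.0 * math.sqrt((n / (n - 1)) * frac * (1 - frac))  # in percentage points
    half_width = z * s / math.sqrt(n)
    return phi - half_width, phi + half_width

# empirical coverage for z = 1.15 (should be close to 75% for large n)
rng = random.Random(1)
p, n, trials = 30.0, 200, 4000
hits = 0
for _ in range(trials):
    sample = [1 if rng.random() < p / 100 else 0 for _ in range(n)]
    lo, hi = approx_ci(sample, z=1.15)
    hits += lo <= p <= hi
print(hits / trials)   # typically close to 0.75
```

Unlike the conservative procedure, the empirical coverage here hovers around the nominal 75% rather than exceeding it by a wide margin.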
The following exercises check your ability to compute conservative and approximate confidence intervals for the population percentage, and your ability to determine which method is more appropriate.
(Reminder: Examples and exercises may vary when the page is reloaded; the video shows only one version.)
Suppose that we seek a confidence interval for the mean of a population (box) of numbers, based on a random sample from the population. The sample mean is an unbiased estimator of the population mean (E(sample mean) = ave(box)), so it is reasonable to center a confidence interval at the sample mean. How wide should we make an interval centered at the sample mean, for the interval to have a specified probability of covering the population mean?
If we knew the SD of the population or had an upper bound on the SD of the population, we could use Chebychev's inequality to construct a conservative confidence interval for the population mean, as we did earlier in the chapter: the standard error of the sample mean is
SE(sample mean) = SD(box)/n^{½},
where n is the sample size. So, for example, the coverage probability of the random interval
[(sample mean) − 2×SD(box)/n^{½}, (sample mean) + 2×SD(box)/n^{½}]
is at least 75%.
Typically, however, the SD of the population is not known, so we cannot construct this interval. Moreover, typically we cannot use the conservative approach based on Chebychev's Inequality, because there is no upper bound on the SD of a general list of numbers analogous to the upper bound of 50% for the SD of lists that contain only zeros and ones. (As we have seen, if all the numbers are bounded between a and b, with a≤b, then SD(box)≤(b − a)/2—but typically we do not know such lower and upper bounds a and b.)
However, the approximate approach to constructing confidence intervals, based on the normal curve, works if the sample size is sufficiently large. The central limit theorem tells us that the probability histogram of the average of n draws with replacement from a box follows the normal curve increasingly well as the number of draws n increases. We also know that the sample standard deviation s is increasingly likely to be an accurate estimate of the SD of the population as n increases. As a result, the probability that the sample mean is within ±z×s/n^{½} of the population mean is approximately the same as the area under the normal curve between −z and z. For any fixed population (box), the approximation improves as the sample size n increases, for random sampling with replacement. The example illustrates calculating an approximate confidence interval for the population mean. The example is dynamic: It will tend to change when you reload the page.
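The calculation can be sketched in Python (the data are fabricated; statistics.stdev uses the n − 1 denominator, matching the sample SD s):

```python
import math
import statistics

def approx_mean_ci(sample, z):
    """Approximate confidence interval for the population mean:
    sample mean +/- z * s / sqrt(n).  Relies on the central limit theorem,
    so n should be large and the draws (effectively) independent."""
    n = len(sample)
    mean = statistics.fmean(sample)
    s = statistics.stdev(sample)          # sample SD, n - 1 in the denominator
    half_width = z * s / math.sqrt(n)
    return mean - half_width, mean + half_width

# fabricated sample of 50 measurements
data = [4.1, 5.3, 3.8, 6.0, 4.9] * 10
lo, hi = approx_mean_ci(data, z=1.96)     # approximate 95% confidence interval
```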
The following exercise checks your ability to calculate approximate confidence intervals for the population mean. The exercise is dynamic: The question will tend to change when you reload the page.
We have seen two methods for constructing confidence intervals for a population percentage: a conservative method based on Chebychev's Inequality and a bound on SD(box), and an approximate method based on the normal approximation. Conservative means that the coverage probability is at least as high as claimed—but could be substantially higher for some populations. Approximate means that the coverage probability is roughly as high as claimed—but could be substantially lower (or substantially higher) for some populations. This section develops a third method, which is exact. Exact means that the probability that the random interval covers the true population percentage is just what it is claimed to be (depending on the value of α it can be a bit higher, simply because the binomial distribution is a discrete distribution).
These intervals are rather different from the confidence intervals presented earlier in this chapter, which were of the form (estimate ± uncertainty). Instead, each of the endpoints is computed from the data, separately. The resulting interval usually is not symmetric around the sample percentage φ.
We assume that a sample of size n is drawn at random with replacement from a 0-1 box. We want to find a confidence interval for p, the percentage of tickets labeled "1" in the box. Let X be the number of tickets in the sample that are labeled "1." If the true percentage of tickets labeled "1" in the 0-1 box is p, then X has a Binomial probability distribution with parameters n and p. We will construct a confidence interval for p by looking at the values of p that are plausible, given the observed value of X. The approach is very closely related to hypothesis testing, discussed in a later chapter.
Suppose the observed value of X is x. If p were very very small (close to zero), it would be unlikely to see x or more ones in the sample—unless x = 0. So seeing x ones in the sample is evidence that p is not too small. Conversely, if p were very very large (close to one), it would be unlikely to see x or fewer ones in the sample—unless x = n. So observing that X = x limits the plausible range of values of p.
Suppose we want a confidence interval for p with confidence level 1−α. Let p^{-} be the smallest value of q for which
α/2 ≤ P(X ≥ x if p = q) = _{n}C_{x}q^{x}(1−q)^{n−x} + _{n}C_{x+1}q^{x+1}(1−q)^{n−x−1} + … + _{n}C_{n}q^{n}(1−q)^{0}.
Similarly, let p^{+} be the largest value of q for which
α/2 ≤ P(X ≤ x if p = q) = _{n}C_{x}q^{x}(1−q)^{n−x} + _{n}C_{x−1}q^{x−1}(1−q)^{n−x+1} + … + _{n}C_{0}q^{0}(1−q)^{n}.
Then the interval [p^{-}, p^{+}] is a 1−α confidence interval for p. Intervals constructed this way can be much shorter than the conservative intervals based on Chebychev's Inequality and the upper bound on SD(box), but they are still guaranteed to attain at least their nominal confidence level. Confidence intervals based on the normal approximation are generally not much shorter, but their actual confidence level can be substantially lower than their nominal confidence level.
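The endpoints p^{-} and p^{+} can be found numerically by bisection, because each binomial tail probability is monotone in q. The following is a sketch using only the Python standard library (the function names are made up; the endpoints are returned as fractions rather than percentages):

```python
from math import comb

def tail_ge(n, x, q):
    """P(X >= x) for X ~ Binomial(n, q)."""
    return sum(comb(n, j) * q**j * (1 - q)**(n - j) for j in range(x, n + 1))

def exact_ci(x, n, alpha, tol=1e-8):
    """Exact confidence interval [p_minus, p_plus] for p (as fractions in
    [0, 1]) at confidence level 1 - alpha, found by bisection."""
    def threshold(test):
        # test(q) is False for small q and True for large q; find the flip point
        lo, hi = 0.0, 1.0
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if test(mid):
                hi = mid
            else:
                lo = mid
        return (lo + hi) / 2
    # p_minus: smallest q with P(X >= x | q) >= alpha/2 (0 if x == 0)
    p_minus = 0.0 if x == 0 else threshold(lambda q: tail_ge(n, x, q) >= alpha / 2)
    # p_plus: largest q with P(X <= x | q) >= alpha/2 (1 if x == n)
    p_plus = 1.0 if x == n else threshold(
        lambda q: 1 - tail_ge(n, x + 1, q) < alpha / 2)
    return p_minus, p_plus

lo, hi = exact_ci(x=3, n=20, alpha=0.05)
```

For example, observing 3 ones in 20 draws gives an interval that contains the sample fraction 3/20 but is far from symmetric around it.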
We can also use a random sample with replacement to find a confidence interval for a percentile of a population. We shall work out the details for the median; other percentiles can be treated similarly. Unlike the conservative and approximate confidence intervals—and like the exact confidence intervals for the population percentage we just saw—these intervals are not of the form (estimate ± uncertainty). Instead, the endpoints of the intervals are two of the data. This approach also leads to exact confidence intervals: The nominal coverage probability is equal to the actual coverage probability.
To begin, suppose we have a random sample of size 10
{X_{1}, X_{2}, … , X_{10}}
taken with replacement from a population with median m. Sort the data into increasing order: let X_{(1)} be the smallest datum, X_{(2)} be the second-smallest, etc., and let X_{(10)} be the largest datum. (The sorted data are called the order statistics.) Let A_{1} be the event that the fourth-smallest datum, X_{(4)}, is less than or equal to the median, and let A_{2} be the event that the seventh-smallest datum, X_{(7)}, is greater than or equal to the median. The event A_{1} occurs unless 7 or more data are greater than the population median, so A_{1}^{c} is the event that 7 or more data are greater than the population median. Similarly, the event A_{2} occurs unless 7 or more data are less than the population median, so A_{2}^{c} is the event that 7 or more data are less than the population median. Let A=A_{1}A_{2} be the event that the fourth and seventh order statistics bracket the median. We shall find a lower bound on the probability of A.
Note that if seven or more data are less than the median, then it is not the case that seven or more data are greater than the median, so A_{1}^{c} and A_{2}^{c} are disjoint. Hence,
P(A^{c}) = P((A_{1}A_{2})^{c})
= P(A_{1}^{c} ∪A_{2}^{c})
= P(A_{1}^{c}) + P(A_{2}^{c}),
and thus
P(A) = 1−P(A^{c}) = 1 − P(A_{1}^{c}) − P(A_{2}^{c}).
We are done if we can find upper bounds for P(A_{1}^{c}) and P(A_{2}^{c}).
Recall that the median is the smallest number that at least 50% of the population are less than or equal to. It follows that the probability that a number drawn at random from the population is strictly less than the median is at most 50% (and possibly less), and that the probability that a number drawn at random from the population is strictly greater than the median is at most 50% (and possibly less). The data are drawn from the population independently, so the number of data that are less than the population median has a Binomial probability distribution with n trials and p ≤ 50%, as does the number of data that are greater than the population median.
Let Y be a random variable with a Binomial distribution with parameters n=10 and p = 50%. Thus P(A_{1}^{c}) ≤ P(Y ≥ 7), and P(A_{2}^{c}) ≤ P(Y ≥ 7). However, P(Y ≥ 7) = P(Y ≤ 3), so
P(A) ≥ 1 − P(Y≤ 3 or Y ≥ 7) = P(4 ≤ Y ≤ 6).
Thus the probability that the interval [X_{(4)}, X_{(7)}] contains the population median is at least as large as the probability of observing 4, 5, or 6 successes in 10 independent trials with probability 50% of success in each trial—the highlighted area in the figure:
The interval from the fourth-smallest datum to the seventh-smallest datum is therefore a confidence interval for the population median, with confidence level at least P(4 ≤ Y ≤ 6) = 65.625%.
The same idea can be used to find confidence intervals for other percentiles: The probability distribution of the number of data that are less than the 100×qth percentile is Binomial with number of trials equal to the number of data, n, and probability of success at most q, and the probability distribution of the number of data that are greater than the 100×qth percentile is Binomial with number of trials equal to the number of data, n, and probability of success at most 1−q.
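The recipe for the median can be sketched in Python, using the text's ranks r = 4 and s = 7 with n = 10 (the data below are fabricated). The guaranteed confidence level is 1 − P(Y ≥ n − r + 1) − P(Y ≥ s) for Y ~ Binomial(n, 50%):

```python
from math import comb

def binom_cdf_half(n, k):
    """P(Y <= k) for Y ~ Binomial(n, 50%)."""
    return sum(comb(n, j) for j in range(k + 1)) / 2**n

def median_ci(sample, r, s):
    """Confidence interval [X_(r), X_(s)] for the population median, for a
    sample drawn with replacement, with its guaranteed confidence level
    1 - P(Y >= n - r + 1) - P(Y >= s), where Y ~ Binomial(n, 50%)."""
    n = len(sample)
    xs = sorted(sample)                       # the order statistics
    conf = 1 - (1 - binom_cdf_half(n, n - r)) - (1 - binom_cdf_half(n, s - 1))
    return xs[r - 1], xs[s - 1], conf

# fabricated data, n = 10; ranks r = 4, s = 7 as in the text
data = [12, 7, 9, 15, 11, 8, 14, 10, 13, 9]
lo, hi, conf = median_ci(data, r=4, s=7)
```

For the text's example this gives confidence level 65.625%, the probability of 4, 5, or 6 successes in 10 tosses of a fair coin.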
The following exercise checks whether you can find a confidence interval for a population median.
Suppose we have a procedure for calculating an interval from every possible sample of size n from a population of size N (a box of N numbered tickets). Let t be a parameter of the population. Suppose that if the procedure is applied to a random sample of size n, the chance that the resulting interval will contain t is P%. Then the interval that results from applying the procedure to any particular random sample of size n is a P% confidence interval for t. Once the random sample has been drawn, the resulting interval either covers (contains) or does not cover t—the probability that the interval covers t is either 0 or 100%. The probability that the interval will cover t before the sample is drawn is called the confidence level of the interval after the sample is drawn. Confidence intervals provide an alternative to reporting a single "best estimate" of a parameter and a summary measure of the uncertainty of the estimate. It is possible to construct conservative confidence intervals for the population percentage from simple random samples or random samples with replacement from 0-1 boxes: For a simple random sample of size n, the chance that the random interval
[φ−k×f/(2n^{½}), φ+k×f/(2n^{½})]
covers the population percentage p is at least 1−1/k^{2}, where φ is the sample percentage, f is the finite population correction (N−n)^{½}/(N−1)^{½}, N is the population size, and n is the sample size. For random sampling with replacement, the chance that the random interval
[φ−k/(2n^{½}), φ+k/(2n^{½})]
includes the population percentage p is at least 1−1/k^{2}. These are conservative procedures for constructing confidence intervals, because the probability that the intervals they produce cover the true population percentage p (the actual coverage probability) is greater than the probability they claim, 1−1/k^{2} (the nominal coverage probability). These procedures can be extremely pessimistic, especially when the sample size n is large and when the true population percentage p is far from 50%—the intervals then are much wider than they need to be for the actual coverage probability to be 1−1/k^{2}.
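As a concrete sketch of the with-replacement procedure (working with proportions in [0, 1] rather than percentages, and with hypothetical sample data), the conservative interval might be computed as follows:

```python
def chebyshev_ci(sample, k):
    """Conservative confidence interval for the population proportion p,
    from a random sample with replacement of 0s and 1s.  The coverage
    probability is at least 1 - 1/k**2, because the SE of the sample
    proportion, sqrt(p*(1-p)/n), is at most 1/(2*sqrt(n))."""
    n = len(sample)
    phi = sum(sample) / n            # sample proportion
    half = k / (2 * n**0.5)          # conservative half-width
    return max(phi - half, 0.0), min(phi + half, 1.0)

# k = 2 gives confidence level at least 1 - 1/2**2 = 75%.
sample = [1]*35 + [0]*65             # hypothetical sample: n = 100, phi = 0.35
lo, hi = chebyshev_ci(sample, k=2)
print(f"[{lo:.2f}, {hi:.2f}]")       # [0.25, 0.45]
```

Note how wide the interval is for only 75% nominal coverage: that is the price of making no assumption about p beyond the Chebyshev bound.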
Suppose that the random sample is drawn with replacement. When the sample size n is large, the central limit theorem ensures that the probability histogram of the sample percentage can be approximated accurately by the normal curve. The expected value of the sample percentage φ is p and the SE of the sample percentage is SD(box)/n^{½}, where SD(box) is the population SD, (p×(1−p))^{½}, the SD of the list of numbers on the tickets in the box. When n is large, the SD of the sample, s^{*}, tends to be an accurate estimate of SD(box), and the chance that the random interval
[φ−z×s^{*}/n^{½}, φ+z×s^{*}/n^{½}]
contains p is approximately equal to the area under the normal curve between ±z. Taking z=1.96, for example, gives approximate 95% confidence intervals. The coverage probability of this procedure typically is not exactly the area under the normal curve between ±z, but as the sample size grows, the coverage probability approaches that area.
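The claim about coverage can be checked by simulation. The sketch below (with hypothetical choices of p, n, and the number of replications) draws many samples with replacement from a 0-1 box and tallies how often the approximate 95% interval covers p:

```python
import random

# Estimate the actual coverage probability of the approximate 95% interval
# phi ± 1.96 × s*/sqrt(n), for random sampling with replacement from a
# 0-1 box.  The values of p, n, and reps are illustrative.
random.seed(1)
p, n, z, reps = 0.3, 400, 1.96, 2000
covered = 0
for _ in range(reps):
    sample = [1 if random.random() < p else 0 for _ in range(n)]
    phi = sum(sample) / n
    s_star = (phi * (1 - phi))**0.5     # SD of a list of 0s and 1s
    half = z * s_star / n**0.5
    covered += (phi - half <= p <= phi + half)
print(covered / reps)                   # close to 0.95
```

The simulated coverage is close to, but typically not exactly, 95%, as the text notes.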
Approximate confidence intervals for the population mean can be constructed similarly, but then it is more common to use
s=s^{*}×n^{½}/(n−1)^{½}
to estimate SD(box) than to use s^{*}. Let M denote the sample mean. For random sampling with replacement, if the sample size n is large, the chance that the random interval
[M−z×s/n^{½}, M+z×s/n^{½}]
covers the population mean is approximately equal to the area under the normal curve between ±z. Again, the coverage probability is not exactly the area under the normal curve between ±z, but it approaches that area as the sample size grows.
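A sketch of this procedure in Python, applied to a hypothetical sample:

```python
def mean_ci(sample, z=1.96):
    """Approximate confidence interval for the population mean, from a
    large random sample drawn with replacement.  SD(box) is estimated by
    s = s* × n^(1/2)/(n-1)^(1/2)."""
    n = len(sample)
    M = sum(sample) / n                                  # sample mean
    s_star = (sum((x - M)**2 for x in sample) / n)**0.5  # SD of the sample
    s = s_star * (n / (n - 1))**0.5                      # estimate of SD(box)
    half = z * s / n**0.5
    return M - half, M + half

# Hypothetical data: a sample of 100 values.
lo, hi = mean_ci(list(range(100)))
print(f"[{lo:.2f}, {hi:.2f}]")
```

The interval is centered at the sample mean M, with half-width z times the estimated SE of the sample mean.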
Confidence intervals can be constructed for population parameters other than percentages and means. For example, one can construct confidence intervals for percentiles of a population using the fact that for random sampling with replacement, the number of data that are less than the 100×qth percentile has a binomial distribution with parameters n and p=q, and the number of data that are greater than the 100×qth percentile has a binomial distribution with parameters n and p=1−q.
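For instance, the chance that the interval from the smallest to the fifth-smallest of 10 data covers the 25th percentile can be bounded below using the Binomial distribution, just as for the median. A sketch (the helper names are illustrative, not from the text):

```python
from math import comb

def binom_prob(n, p, k_lo, k_hi):
    """P(k_lo <= X <= k_hi) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(k_lo, k_hi + 1))

def order_stat_coverage(n, q, r, s):
    """Lower bound on the chance that [X_(r), X_(s)] covers the 100×qth
    percentile: the interval covers the percentile whenever the number
    of data below it is at least r and at most s-1."""
    return binom_prob(n, q, r, s - 1)

# Interval [X_(1), X_(5)] of n = 10 data, for the 25th percentile (q = 0.25).
cov = order_stat_coverage(10, 0.25, 1, 5)
print(round(cov, 4))  # about 0.8656
```

So for 10 data, the interval from the smallest to the fifth-smallest datum is a confidence interval for the 25th percentile at confidence level at least about 86.6%.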