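We have focussed so far on 95% confidence intervals, as 95% is the most commonly used confidence level. The general form of an approximate \(C\%\) confidence interval for a population proportion is \[ \hat{p} - z\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} \quad\text{to}\quad \hat{p} + z\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}, \]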
where the value of \(z\) is appropriate for the confidence level. For a 95% confidence interval, we use \(z=1.96\), while for a 90% confidence interval, for example, we use \(z=1.64\). In general, for a \(C\%\) confidence interval, we need to find the value of \(z\) that satisfies \[ \Pr(-z < Z < z) = \dfrac{C}{100}, \quad\text{where } Z \stackrel{\mathrm{d}}{=} \mathrm{N}(0,1). \] Figure 21 shows the required value of \(z\) as a function of the confidence level.

Figure 21: The relationship between the confidence level and the value of \(z\) in the formula for an approximate confidence interval.

The following figure is a repeat of figure 13. It shows confidence intervals based on the same estimated proportion, but with different confidence levels. The larger confidence levels lead to wider confidence intervals.

Figure 22: Confidence intervals from the same data, but with different confidence levels.

The distance from the sample estimate \(\hat{p}\) to the endpoints of the confidence interval is \[ E = z \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}. \] The quantity \(E\) is referred to as the margin of error. The margin of error is half the width of the confidence interval. Sometimes confidence intervals are reported as \(\hat{p} \pm E\); this means the bounds of the interval are not directly stated, but must be calculated.

We have seen in figure 22 that the margin of error is larger when the confidence level is larger. This is because the value of \(z\) from the standard Normal distribution will be larger when the confidence level is larger.

Example: Mobile-phone use among children

Continuing with the mobile-phone example, consider the 12–14 age group. Calculate an approximate 90% confidence interval for the true proportion of mobile-phone owners in this group.

Solution

From the table in the initial mobile-phone example, we have \(\hat{p} = \dfrac{918}{1250} = 0.734\).
For a 90% confidence interval, we use \(z=1.64\), and so the margin of error is \[ 1.64 \sqrt{\dfrac{0.734(1-0.734)}{1250}} = 0.0205. \] Hence, the 90% confidence interval is \(0.734 \pm 0.0205\), or \((0.714, 0.755)\). In percentage terms, the confidence interval is \(71.4\%\) to \(75.5\%\).

Exercise 6

Consider Casey's sample of Venus bars from exercise 5. He obtained a random sample of 180 wrappers, and found that 20 were winners. Rather than a 95% confidence interval for the true proportion of winning wrappers, consider a 99% confidence interval.
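As a numerical check on the mobile-phone example above, the following is a minimal Python sketch that finds \(z\) from the condition \(\Pr(-z < Z < z) = C/100\) and then forms \(\hat{p} \pm E\). The function names `z_for_confidence` and `proportion_ci` are illustrative choices, not part of the module; the sketch assumes Python 3.8 or later for `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def z_for_confidence(conf_percent):
    """Return z such that Pr(-z < Z < z) = conf_percent / 100 for Z ~ N(0, 1)."""
    # A central area of C/100 leaves (1 - C/100)/2 in each tail,
    # so z is the standard Normal quantile at 0.5 + C/200.
    return NormalDist().inv_cdf(0.5 + conf_percent / 200)


def proportion_ci(x, n, conf_percent=95):
    """Approximate confidence interval for a proportion: p_hat +/- E."""
    p_hat = x / n
    z = z_for_confidence(conf_percent)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)  # margin of error E
    return p_hat - margin, p_hat + margin


# z for the two confidence levels quoted in the text
for level in (90, 95):
    print(level, round(z_for_confidence(level), 3))

# Mobile-phone example: 918 owners out of 1250, 90% confidence
lower, upper = proportion_ci(918, 1250, conf_percent=90)
print(round(lower, 3), round(upper, 3))
```

The sketch uses unrounded values of \(z\) and \(\hat{p}\) (about 1.645 rather than 1.64 for the 90% level), so intermediate figures can differ slightly from the hand calculation, although the interval agrees with \((0.714, 0.755)\) to three decimal places.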
Maximum margin of error

In the module Binomial distribution, we noted that if \(X \stackrel{\mathrm{d}}{=} \mathrm{Bi}(n,p)\), then the variance is largest (for a given value of \(n\)) when \(p = \dfrac{1}{2}\), in which case \(\mathrm{var}(X) = n \times \dfrac{1}{2} \times \dfrac{1}{2} = \dfrac{n}{4}\). This has the direct consequence that, when estimating \(p\), the variance of \(\hat{P}\) is largest when \(p=0.5\), and is equal to \(\dfrac{0.25}{n}\).

This is perhaps slightly unfortunate for political polling in particular, since such surveys are quite often estimating a characteristic (such as political preference) which is present in about half of the population. However, there is some good news for the pollsters: while they may be in the realm of least precise inferences, they know how bad it can get. For a random sample of size \(n\), the standard deviation of \(\hat{P}\) cannot be bigger than \(\dfrac{0.5}{\sqrt{n}}\), and hence the margin of error for a 95% confidence interval is at most \(1.96\,\dfrac{0.5}{\sqrt{n}} = \dfrac{0.98}{\sqrt{n}}\).

To make the reporting of such polls succinct, this fact is sometimes exploited. The report simply uses the maximum margin of error for the given sample size \(n\), knowing that this is conservative: the precision will be as claimed if the estimated proportion \(\hat{p}\) is 0.5 (the percentage is 50%), and better than claimed otherwise.

Exercise 7

A Nielsen Poll published on 17 February 2013 reported that, in a two-party vote, 56% of voters prefer the Coalition (44% prefer ALP). The report indicates that the approximate margin of error is at most \(2.6\%\).
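The bound \(\dfrac{0.98}{\sqrt{n}}\) discussed above can be tabulated for a few sample sizes. The following is a minimal Python sketch; the function names and the sample sizes are illustrative assumptions and are not taken from any poll mentioned in the text.

```python
from math import sqrt
from statistics import NormalDist


def z_value(conf_percent=95):
    """z such that the central area under N(0, 1) is conf_percent / 100."""
    return NormalDist().inv_cdf(0.5 + conf_percent / 200)


def margin_of_error(p_hat, n, conf_percent=95):
    """Margin of error E = z * sqrt(p_hat * (1 - p_hat) / n)."""
    return z_value(conf_percent) * sqrt(p_hat * (1 - p_hat) / n)


def max_margin_of_error(n, conf_percent=95):
    """Largest possible margin of error for sample size n (attained at p_hat = 0.5)."""
    return z_value(conf_percent) * 0.5 / sqrt(n)


# Illustrative sample sizes only
for n in (100, 400, 2500):
    print(n, round(max_margin_of_error(n), 4))

# Any estimate other than 0.5 gives a smaller margin than the reported maximum
print(margin_of_error(0.3, 400) < max_margin_of_error(400))
```

Because the maximum depends on \(n\) only through \(\sqrt{n}\), quadrupling the sample size halves the maximum margin of error.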
When to use the Normal approximation

A guideline for when to use the Normal approximation for a confidence interval for \(p\) was given in the previous section: both \(x\) and \(n-x\) should be greater than 10. These conditions are generally met for the illustrative data in figure 23, based on \(n=100\). In contrast, the confidence intervals shown in figure 24 are based on data when \(n=10\); in no case are both \(x\) and \(n-x\) greater than 10. In four of the ten cases presented, the confidence intervals are nonsense: either the lower or the upper bound is outside the range 0 to 1. Of course, proportions must be in the range 0 to 1. This shows that these intervals are wrong, and it indicates that we should not trust the approximation when \(n\) is this small.

Figure 23: Approximate 95% confidence intervals for various estimates \(\hat{p}\), with \(n=100\).

Figure 24: Approximate 95% confidence intervals for various estimates \(\hat{p}\), with \(n=10\).

You may wonder whether it is possible to find a confidence interval for \(p\) when \(n\) is small, by avoiding the Normal approximation. The answer is that there is a method that uses only the binomial distribution, and does not approximate. It is beyond the scope of the curriculum.

Exercise 8

This exercise asks you to estimate \(\pi\), using a statistical approach. Suppose you were aware that the area of a circle is \(A = kr^2\), where \(r\) is the radius of the circle and \(k\) is a constant. Suppose you also knew that the equation for a circle centred at the origin is \(x^2 + y^2 = r^2\). So, you knew a lot about circles… but not the value of \(\pi\).

What are the conditions for a confidence interval for proportions?

There are three conditions we need to satisfy before we make a one-sample z-interval to estimate a population proportion. We need to satisfy the random, normal, and independence conditions for these confidence intervals to be valid.
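The breakdown of the Normal approximation for small samples, described above, can be seen by computing the approximate 95% interval for every possible count \(x\) and noting which intervals stray outside the range 0 to 1. This is a minimal Python sketch; the function names are illustrative, and the counts it examines are not the data behind figures 23 and 24.

```python
from math import sqrt

Z_95 = 1.96  # value of z used in the text for 95% confidence


def approx_ci(x, n, z=Z_95):
    """Approximate confidence interval p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = x / n
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def nonsense_counts(n, z=Z_95):
    """Return the counts x (0 < x < n) whose approximate interval leaves [0, 1]."""
    bad = []
    for x in range(1, n):
        lower, upper = approx_ci(x, n, z)
        if lower < 0 or upper > 1:
            bad.append(x)
    return bad


print(nonsense_counts(10))   # counts giving impossible bounds when n = 10
print(nonsense_counts(100))  # counts giving impossible bounds when n = 100
```

In both cases, every count flagged by the sketch has either \(x\) or \(n-x\) no greater than 10, which is consistent with the guideline.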
Which three elements are necessary for calculating a confidence interval?

A confidence interval has three elements. First, there is the interval itself, something like (123, 456). Second is the confidence level, something like 95%. Third, there is the parameter being estimated, something like the population mean, μ, or the population proportion, p.
What are the conditions to construct a 95% confidence interval for the proportion?

The 95% confidence interval for a proportion is \[ \hat{p} \pm 1.96 \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}. \] This formula is appropriate whenever there are at least 5 subjects with the outcome and at least 5 without the outcome. You should always use Z scores (not t-scores) to compute the confidence interval for a proportion.
What assumptions are required for a confidence interval for a single proportion?

Here are the six assumptions you should check when constructing a confidence interval:
Assumption #1: Random Sampling
Assumption #2: Independence
Assumption #3: Large Sample
Assumption #4: The 10% Condition
Assumption #5: The Success / Failure Condition
Assumption #6: Homogeneity of Variances