I am taking a course on statistics and I see this:
I don't get why there need to be 10 successes and 10 failures in the sample. What's the intuition behind why this needs to be a condition for the sampling distribution of sample proportions to be normal? Is it because, if the probability of success were, say, 95% and n were 100, some samples would likely contain zero failures? If that's the case, the proportion in that sample would be 100%? Why wouldn't the sampling distribution be normally distributed around 95%?
1 Answer
I am not sure whether the following explanation is the original motivation for requiring that $np\geq 10$ and $n(1-p)\geq 10$.
For a normal distribution, about 99.7% of the data lie within 3 standard deviations of the mean.
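The 99.7% figure can be checked numerically. The following is a minimal sketch using only the standard library; the function name `normal_mass_within` is made up for this illustration.

```python
import math

def normal_mass_within(k: float) -> float:
    """Return P(|Z| < k) for a standard normal Z, computed via the error function."""
    return math.erf(k / math.sqrt(2))

# Probability mass within 3 standard deviations of the mean.
print(round(normal_mass_within(3), 4))  # prints 0.9973
```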
The proportion in a random sample must be between 0 and 1. Suppose that the population proportion is neither $0$ nor $1$.
A normal distribution with mean $\mu=\hat{p}$ and standard deviation $\sigma=\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ (using the sample proportion $\hat{p}$ as an estimate of the population proportion) is a good approximation to the distribution of sample proportions for samples of size $n$ if the interval (which contains about 99.7% of the normal distribution's probability)$$
\left[\hat{p}-3\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}, \hat{p}+3\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right]
$$is contained in $(0, 1)$.
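To connect this with the example in the question (95% success, $n=100$), here is a small sketch that computes the 3-standard-deviation interval. The helper name `three_sigma_interval` is hypothetical, and the second call with $n=200$ is an extra case chosen for comparison, not something from the question.

```python
import math

def three_sigma_interval(p_hat: float, n: int) -> tuple[float, float]:
    """Return (lower, upper) = p_hat -/+ 3 * sqrt(p_hat * (1 - p_hat) / n)."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - 3 * se, p_hat + 3 * se

# Question's example: n * (1 - p_hat) = 5, fewer than 10 expected failures.
lo_100, hi_100 = three_sigma_interval(0.95, 100)
print(f"n=100: [{lo_100:.4f}, {hi_100:.4f}]")  # upper end is above 1

# Comparison case: n * (1 - p_hat) = 10, the condition is just met.
lo_200, hi_200 = three_sigma_interval(0.95, 200)
print(f"n=200: [{lo_200:.4f}, {hi_200:.4f}]")  # interval stays inside (0, 1)
```

With $n=100$ the upper end of the interval is above 1, so the normal curve would put noticeable probability on impossible proportions; with $n=200$ the interval fits inside $(0, 1)$.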
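The above requirement is equivalent to the following pair of inequalities (move the square-root term to the other side, square both sides, and divide by $\hat{p}$ or $1-\hat{p}$):
\begin{equation} n\hat{p}>9(1-\hat{p}),\tag{1} \end{equation}\begin{equation} n(1-\hat{p})>9\hat{p}. \tag{2}\end{equation}Since $\hat{p}$ can be close to 0 or 1, the right-hand sides can be close to 9, but they are always less than 9. Hence, if we assume that $n\hat{p}\geq 10$ and $n(1-\hat{p})\geq 10$, then both inequalities (1) and (2) are satisfied.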
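As a sanity check on the last step, the sketch below scans a grid of values and confirms that whenever $n\hat{p}\geq 10$ and $n(1-\hat{p})\geq 10$, the 3-standard-deviation interval lies inside $(0, 1)$. The grid ($n$ from 1 to 500, $\hat{p}$ in steps of 0.001) and the name `interval_inside_unit` are arbitrary choices for this illustration; a grid scan is not a proof.

```python
import math

def interval_inside_unit(p_hat: float, n: int) -> bool:
    """True if p_hat -/+ 3 * sqrt(p_hat * (1 - p_hat) / n) lies strictly inside (0, 1)."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return 0 < p_hat - 3 * se and p_hat + 3 * se < 1

violations = []
for n in range(1, 501):
    for k in range(1, 1000):  # p_hat = k / 1000
        # Success-failure condition, written in integers to avoid rounding:
        # n * p_hat >= 10 and n * (1 - p_hat) >= 10.
        if n * k >= 10_000 and n * (1000 - k) >= 10_000:
            if not interval_inside_unit(k / 1000, n):
                violations.append((n, k / 1000))

print(len(violations))  # prints 0
```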