Proportion Test
The proportion test compares a sample's proportion to a known population proportion (one-sample test), or compares a sample's proportion to the proportion of another sample (two-sample test).
We use the one-sample test to check whether the known population proportion is statistically consistent with the sample proportion, given the sample size.
The null hypothesis assumes that the known proportion is correct. The statistical decision is based on the difference between the known proportion and the sample proportion.
You may choose between the binomial test, which is more accurate, especially for small sample sizes, and the normal approximation.
We recommend using only the binomial test. If the tool cannot calculate the binomial distribution, it automatically falls back to the normal approximation. Whether this happens depends on the sample size and on how close x is to np. For a sample size smaller than 1000, every combination is calculated from the binomial distribution (when the binomial test is chosen).
Example: It is known that the proportion of newborn males in the human population is 0.5122. The residents of Brobdingnag claim that in their country the proportion is smaller. Here H0 assumes p = 0.5122, and the claim corresponds to a left-tailed test.
Assumptions
- Binomial distribution - the probability of the event is identical for every observation.
- The population's proportion, p0, is known.
Required sample data
Calculated based on a random sample from the entire population.
- p̂ - Sample proportion, or x, the number of successes.
If you enter a value between 0 and 1, the tool assumes you entered p̂; if you enter a number greater than 1, the tool assumes you entered the number of successes x.
- n - Sample size
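The input rule above can be sketched as a small helper. This is only an illustration, not the tool's actual code: the function name `parse_successes` is hypothetical, and reading a value of exactly 1 as a proportion is an assumption, since the text does not say how that case is handled.

```python
def parse_successes(value: float, n: int) -> tuple[float, float]:
    """Return (p_hat, x) from a single input value and the sample size n."""
    if value < 0 or value > n:
        raise ValueError("value must be a proportion in [0, 1] or a count in [0, n]")
    if value <= 1:
        # Assumption: exactly 1 is read as the proportion p_hat = 1.
        return value, value * n
    # A number greater than 1 is read as the number of successes x.
    return value / n, value


print(parse_successes(0.25, 8))  # input read as a proportion
print(parse_successes(3, 8))     # input read as a count
```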
Test statistic
Normal approximation
The distribution of x is binomial.
The binomial mean is μ = np, and the binomial standard deviation is: $$\sigma_x=\sqrt{np(1-p)}$$
Under H0, the sample proportion p̂ is distributed with a mean of p0 and the following standard deviation: $$\sigma_p=\sqrt{\frac{p_0(1-p_0)}{n}}$$
This gives the normal statistic: $$z=\frac{(\hat{p}-p_0)+c}{\sqrt{\frac{p_0(1-p_0)}{n}}}$$
Since the normal distribution is continuous and the binomial distribution is discrete, we may use the continuity correction c to improve the result slightly. The correction moves p̂ toward p0:
$$ \hat{p}>p_0:\qquad\quad c=-\frac{1}{2n}\\ \hat{p}\lt p_0:\qquad\quad c=+\frac{1}{2n}\\ |\hat{p}-p_0|\lt \frac{1}{2n}:\;c=0 $$
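A minimal sketch of the normal-approximation statistic, assuming the continuity correction moves p̂ toward p0 and is dropped when the difference is smaller than 1/(2n). The function name `one_sample_z` and the sample of 480 males among 1000 newborns are hypothetical, chosen only to illustrate the Brobdingnag example; they are not data from the text.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(x: int, n: int, p0: float, continuity: bool = True) -> float:
    """z statistic for H0: p = p0, with an optional continuity correction."""
    p_hat = x / n
    diff = p_hat - p0
    c = 0.0
    if continuity and abs(diff) >= 1 / (2 * n):
        # Shrink the difference toward zero by 1/(2n).
        c = -1 / (2 * n) if diff > 0 else 1 / (2 * n)
    return (diff + c) / sqrt(p0 * (1 - p0) / n)


# Hypothetical sample: 480 males among 1000 newborns, p0 = 0.5122.
z = one_sample_z(480, 1000, 0.5122)
p_left = NormalDist().cdf(z)  # left-tailed p-value
print(z, p_left)
```

With these made-up numbers z is close to -2.0, so the left-tailed p-value is a little above 0.02.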
Exact test - binomial distribution
When using the binomial distribution, the test statistic is the number of successes: X.
Since the distribution is discrete, there is a big difference between lower/greater and lower or equal/greater or equal (unlike a continuous distribution). It is also more complicated to calculate the two-tailed p-value, as the distribution is not symmetrical and you can't get exactly the same value in the opposite tail.
The following uses the example of n = 8 and p = 0.25.
x | p(X=x) | p(X≤x) | p(X≥x) |
---|---|---|---|
0 | 0.100112915 | 0.100112915 | 1 |
1 | 0.266967773 | 0.367080688 | 0.899887085 |
2 | 0.311462402 | 0.678543091 | 0.632919312 |
3 | 0.207641602 | 0.886184692 | 0.321456909 |
4 | 0.086517334 | 0.972702026 | 0.113815308 |
5 | 0.023071289 | 0.995773315 | 0.027297974 |
6 | 0.003845215 | 0.99961853 | 0.004226685 |
7 | 0.000366211 | 0.999984741 | 0.00038147 |
8 | 1.52588E-05 | 1 | 1.52588E-05 |
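The table above can be reproduced with a few lines of Python. This is a sketch using only the standard library; the variable names are illustrative.

```python
from math import comb

n, p = 8, 0.25

# Probability mass function p(X = k) for k = 0..n.
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

print("x | p(X=x) | p(X<=x) | p(X>=x)")
for x in range(n + 1):
    lower = sum(pmf[: x + 1])  # p(X <= x)
    upper = sum(pmf[x:])       # p(X >= x)
    print(f"{x} | {pmf[x]:.9f} | {lower:.9f} | {upper:.9f}")
```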
When x = np, x equals the mean exactly, so x is located in the middle of the distribution and the p-value equals 1.
Left-tailed
$$\text{p-value}=p(X\le x)=\sum_{i=0}^{x}\binom{n}{i}p^iq^{n-i}$$ where q = 1 - p.
Example: x=1.
Since x < np (1 < 8*0.25), x is located on the left side of the distribution.
p-value = p(X≤1) = 0.367081.
Right-tailed
$$\text{p-value}=p(X\ge x)=\sum_{i=x}^{n}\binom{n}{i}p^iq^{n-i}$$ where q = 1 - p.
Example: x=4.
Since x > np (4 > 8*0.25), x is located on the right side of the distribution.
p-value = p(X≥4) = 1 - p(X≤(4-1)) = 0.113815.
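The two one-tailed sums can be checked directly. The sketch below uses hypothetical helper names (`binom_pmf`, `p_left`, `p_right`) and the n = 8, p = 0.25 example.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """p(X = k) for a binomial(n, p) variable."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def p_left(x: int, n: int, p: float) -> float:
    """Left-tailed p-value: p(X <= x), the boundary value x included."""
    return sum(binom_pmf(i, n, p) for i in range(0, x + 1))


def p_right(x: int, n: int, p: float) -> float:
    """Right-tailed p-value: p(X >= x), the boundary value x included."""
    return sum(binom_pmf(i, n, p) for i in range(x, n + 1))


print(round(p_left(1, 8, 0.25), 6))   # 0.367081
print(round(p_right(4, 8, 0.25), 6))  # 0.113815
```

Because the distribution is discrete, including or excluding x in the sum changes the result, which is why both ranges contain x itself.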
Two-tailed
Find the tail on one side based on x.
Then find x' on the opposite tail: the value with the largest probability that is less than or equal to the probability of x.
For example, if x is on the left tail: $$\text{p-value}=p(X\le x) + p(X\ge x')\\ \text{p-value}=\sum_{i=0}^{x}\binom{n}{i}p^iq^{n-i} + \sum_{i=x'}^{n}\binom{n}{i}p^iq^{n-i}$$
Example: x=1.
On the left side: p(X=1)=0.266968.
On the right side: p(X=2)=0.311462 is greater than p(X=1), while p(X=3)=0.207642 is not, so x'=3.
p-value = p(X≤1) + p(X≥3) = 0.367081 + 0.321457 = 0.688538.
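A sketch of the two-tailed procedure described above, under stated assumptions: the function names are hypothetical, the small relative tolerance guarding against floating-point ties is an addition of this sketch, and the tool's own implementation may differ.

```python
from math import ceil, comb, floor


def binom_pmf(k: int, n: int, p: float) -> float:
    """p(X = k) for a binomial(n, p) variable."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_tailed_p(x: int, n: int, p: float) -> float:
    mean = n * p
    if x == mean:
        return 1.0  # x sits exactly on the mean
    # Assumption: tiny relative tolerance so equal probabilities compare as equal.
    limit = binom_pmf(x, n, p) * (1 + 1e-7)
    if x < mean:
        near = sum(binom_pmf(i, n, p) for i in range(0, x + 1))
        # x' = first value right of the mean with probability <= p(X = x).
        x_prime = next(
            (k for k in range(ceil(mean), n + 1) if binom_pmf(k, n, p) <= limit), None
        )
        far = 0.0 if x_prime is None else sum(
            binom_pmf(i, n, p) for i in range(x_prime, n + 1)
        )
    else:
        near = sum(binom_pmf(i, n, p) for i in range(x, n + 1))
        # x' = first value left of the mean with probability <= p(X = x).
        x_prime = next(
            (k for k in range(floor(mean), -1, -1) if binom_pmf(k, n, p) <= limit), None
        )
        far = 0.0 if x_prime is None else sum(
            binom_pmf(i, n, p) for i in range(0, x_prime + 1)
        )
    return min(1.0, near + far)


print(round(two_tailed_p(1, 8, 0.25), 6))  # 0.688538
```

When no value on the opposite tail has a probability as small as p(X=x), the opposite tail contributes nothing and the result equals the one-tailed p-value.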
Effect size
The tool calculates the h effect size.
$$\varphi(p)=2\arcsin(\sqrt{p})\\ h=\varphi(\hat{p})-\varphi(p_0)$$
Cohen's interpretation for the h effect size:
Small effect - 0.2.
Medium effect - 0.5.
Large effect - 0.8.
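The h effect size is a one-line computation. The sketch below uses a hypothetical helper `cohen_h` and, purely as an illustration, the earlier example of x = 1 success in n = 8 trials (p̂ = 0.125) against p0 = 0.25.

```python
from math import asin, sqrt


def cohen_h(p_a: float, p_b: float) -> float:
    """Cohen's h: difference of the arcsine-transformed proportions."""
    return 2 * asin(sqrt(p_a)) - 2 * asin(sqrt(p_b))


h = cohen_h(0.125, 0.25)  # p_hat = 1/8 versus p0 = 0.25
print(h)
```

Here |h| is roughly 0.32, which falls between Cohen's small and medium thresholds. The sign only indicates the direction of the difference.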
We use the two-sample test to check whether the proportion of group 1 is the same as the proportion of group 2.
The tool's null hypothesis assumes that the difference between the groups is zero, so the tool uses only the pooled variance.
Example: compare the proportion of good oranges between two fields, based on a sample from each field. H0 assumes the proportions are identical.
Assumptions
- Binomial distribution - within each group, the probability of the event is identical for every observation.
Required sample data
Calculated based on a random sample from the entire population
- p̂1, p̂2 - Sample proportions, or x1, x2, the numbers of successes.
If you enter a value between 0 and 1, the tool assumes you entered p̂; if you enter a number greater than 1, the tool assumes you entered the number of successes x.
- n1, n2 - Sample sizes
Test statistic
Normal approximation
The distributions of X1 and X2 are binomial.
Under H0, the difference between p̂1 and p̂2 is assumed to be distributed with a mean of 0, which gives the normal statistic: $$z=\frac{(\hat{p}_1-\hat{p}_2)-0}{\sigma}$$
There are two ways to calculate the standard deviation σ, depending on the null assumption.
Pooled variance
When H0 assumes p1 - p2 = 0, the standard deviation of the binomial distribution depends on p, so this assumption also implies that the standard deviation is identical for the two samples. Hence we calculate the pooled variance, based on the two samples together. $$ \hat{p}=\frac{x_1+x_2}{n_1+n_2}=\frac{n_1\hat{p}_1+n_2\hat{p}_2}{n_1+n_2}\\ Var_{pooled}=Var_1+Var_2\\ Var_{pooled}=\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2} $$
Since we assume that p1 = p2, both are estimated by the pooled proportion p̂: $$ Var_{pooled}=\frac{\hat{p}(1-\hat{p})}{n_1}+\frac{\hat{p}(1-\hat{p})}{n_2}\\ \sigma_{pooled}=\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}\\ z_{pooled}=\frac{(\hat{p}_1-\hat{p}_2)+c}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}} $$
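A minimal sketch of the pooled statistic. The function name `two_sample_z_pooled` and the orange counts (90 good out of 100 in one field, 80 out of 100 in the other) are hypothetical and serve only to illustrate the formula; the continuity correction c is passed in explicitly and defaults to 0.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_pooled(x1: int, n1: int, x2: int, n2: int, c: float = 0.0) -> float:
    """z statistic for H0: p1 - p2 = 0, using the pooled proportion."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)  # pooled proportion
    sigma = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat + c) / sigma


# Hypothetical samples: 90/100 good oranges in field 1, 80/100 in field 2.
z = two_sample_z_pooled(90, 100, 80, 100)
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
print(z, p_two_tailed)
```

With these made-up counts the pooled proportion is 0.85 and z is about 1.98. The sketch does not guard against a pooled proportion of exactly 0 or 1, where σ is zero.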
Unpooled variance
The tool doesn't calculate the unpooled variance.
When H0 assumes p1 - p2 = d, the standard deviation of the binomial distribution depends on p, so this assumption also implies that the standard deviation is not identical for the two samples. We therefore need to calculate the combined variance of two independent random variables: $$ Var_{unpooled}=Var_1+Var_2\\ \sigma_{unpooled}=\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1}+\frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}\\ z_{unpooled}=\frac{(\hat{p}_1-\hat{p}_2)+c}{\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1}+\frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}} $$
Since the normal distribution is continuous and the binomial distribution is discrete, we may use the continuity correction c to improve the result slightly.
$$ \text{Right-tailed, or two-tailed with }\hat{p}_1>\hat{p}_2:\quad c=-\left(\frac{1}{2n_1}+\frac{1}{2n_2}\right)\\ \text{Left-tailed, or two-tailed with }\hat{p}_1\lt\hat{p}_2:\quad c=+\left(\frac{1}{2n_1}+\frac{1}{2n_2}\right) $$
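Although the tool does not compute it, the unpooled statistic can be sketched as follows. Several things here are assumptions of the sketch rather than statements from the text: the function name, the choice to apply the correction so that it shrinks the absolute difference toward zero (the two-tailed convention), the choice to skip it when the difference is smaller than the correction itself, and the sample counts.

```python
from math import sqrt


def two_sample_z_unpooled(
    x1: int, n1: int, x2: int, n2: int, continuity: bool = True
) -> float:
    """z statistic using the unpooled standard deviation."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    diff = p1_hat - p2_hat
    cc = 1 / (2 * n1) + 1 / (2 * n2)  # size of the continuity correction
    c = 0.0
    if continuity and abs(diff) > cc:
        # Assumption: shrink the absolute difference toward zero.
        c = -cc if diff > 0 else cc
    sigma = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    return (diff + c) / sigma


# Hypothetical samples: 90/100 versus 80/100.
z = two_sample_z_unpooled(90, 100, 80, 100)
z_raw = two_sample_z_unpooled(90, 100, 80, 100, continuity=False)
print(z, z_raw)
```

For these made-up counts σ is 0.05, so the corrected statistic is about 1.8 and the uncorrected one about 2.0.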
Effect size
The tool calculates the h effect size.
$$\varphi(p)=2\arcsin(\sqrt{p})\\ h=\varphi(\hat{p}_1)-\varphi(\hat{p}_2)$$
Cohen's interpretation for the h effect size:
Small effect - 0.2.
Medium effect - 0.5.
Large effect - 0.8.