# Proportion Test


The proportion test compares the sample's proportion to the population's proportion or compares the sample's proportion to the proportion of another sample.

## One sample proportion test

We use this test to check whether a known population proportion is statistically plausible, based on the sample proportion and the sample size.
The null hypothesis assumes that the known proportion is correct. The statistical decision is based on the difference between the known proportion and the sample proportion.

You may choose between the binomial test, which is more accurate, especially for small sample sizes, and the normal approximation.
We recommend using only the binomial test. If the tool is unable to calculate the binomial distribution, it automatically falls back to the normal approximation; whether this happens depends on the sample size and on how close x is to np. For a sample size smaller than 1000, every combination is calculated with the binomial distribution (when the binomial test is chosen).

Example: It is known that the proportion of newborn males in the human race is 0.5122. The residents of Brobdingnag claim that in their country the proportion is smaller.
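Written as hypotheses, with p denoting the true proportion of newborn males in Brobdingnag (a symbol introduced here only for illustration), this example would correspond to a left-tailed test:

$$H_0:\;p=0.5122\qquad H_1:\;p\lt 0.5122$$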

### Assumptions

• Binomial distribution - the probability for an event is identical
• The population's proportion, p0, is known.

### Required sample data

Calculated based on a random sample from the entire population.

• p̂ or x - Sample proportion, or the number of successes x.
If you enter a value between 0 and 1, the tool assumes you entered p̂; if you enter a number bigger than 1, the tool assumes you entered the number of successes x.
• n - Sample size

### Test statistic

Normal approximation
The distribution of x is binomial.
The binomial mean is μ = np, and the binomial standard deviation is:
$$\sigma_x=\sqrt{np(1-p)}$$
Under H0, the sample proportion p̂ is distributed with a mean of p0 and the following standard deviation:
$$\sigma_{\hat{p}}=\sqrt{\frac{p_0(1-p_0)}{n}}$$
This gives the following normal statistic:
$$z=\frac{(\hat{p}-p_0)+c}{\sqrt{\frac{p_0(1-p_0)}{n}}}$$
Since the normal distribution is continuous and the binomial distribution is discrete, we may use the continuity correction c to slightly improve the result. The correction moves p̂ toward p0:
$$\hat{p}\gt p_0:\qquad\quad c=-\frac{1}{2n}\\ \hat{p}\lt p_0:\qquad\quad c=+\frac{1}{2n}\\ |\hat{p}-p_0|\lt \frac{1}{2n}:\;c=0$$

Exact test - binomial distribution
When using the binomial distribution the test statistic is the number of successes: X.

Since the distribution is discrete, there is a big difference between "less than"/"greater than" and "less than or equal"/"greater than or equal" (unlike a continuous distribution). It is also more complicated to calculate the two-tailed p-value, since the distribution is not symmetrical and you can't get exactly the same value in the opposite tail.
The following uses the example of n = 8 and p = 0.25.

| x | p(X=x) | p(X≤x) | p(X≥x) |
|---|---|---|---|
| 0 | 0.100112915 | 0.100112915 | 1 |
| 1 | 0.266967773 | 0.367080688 | 0.899887085 |
| 2 | 0.311462402 | 0.678543091 | 0.632919312 |
| 3 | 0.207641602 | 0.886184692 | 0.321456909 |
| 4 | 0.086517334 | 0.972702026 | 0.113815308 |
| 5 | 0.023071289 | 0.995773315 | 0.027297974 |
| 6 | 0.003845215 | 0.99961853 | 0.004226685 |
| 7 | 0.000366211 | 0.999984741 | 0.00038147 |
| 8 | 1.52588E-05 | 1 | 1.52588E-05 |

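As a cross-check, the table and the three kinds of exact p-value used in this section can be reproduced with the Python standard library. This is only a minimal sketch, not the tool's own code: the function names are illustrative, and the two-tailed function assumes the rule "sum the probabilities of every outcome that is no more likely than the observed x", which for a binomial distribution gives the same result as adding the two tails.

```python
from math import comb

def binom_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def left_tailed(x: int, n: int, p: float) -> float:
    """P(X <= x): sum of the probabilities from 0 up to x."""
    return sum(binom_pmf(i, n, p) for i in range(x + 1))

def right_tailed(x: int, n: int, p: float) -> float:
    """P(X >= x): sum of the probabilities from x up to n."""
    return sum(binom_pmf(i, n, p) for i in range(x, n + 1))

def two_tailed(x: int, n: int, p: float) -> float:
    """Sum of P(X = i) over every i whose probability is <= P(X = x).

    The tiny relative tolerance guards against floating-point ties
    (an assumption of this sketch, not something the article specifies).
    """
    px = binom_pmf(x, n, p)
    total = sum(
        q for q in (binom_pmf(i, n, p) for i in range(n + 1))
        if q <= px * (1 + 1e-9)
    )
    return min(1.0, total)

# Reproduce the table for n = 8, p = 0.25.
n, p = 8, 0.25
for x in range(n + 1):
    print(x, binom_pmf(x, n, p), left_tailed(x, n, p), right_tailed(x, n, p))
```

For n = 8 and p = 0.25, `left_tailed(1, n, p)`, `right_tailed(4, n, p)` and `two_tailed(1, n, p)` correspond to the left-tailed, right-tailed and two-tailed examples in this section.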
When x = np, x equals the mean exactly, so x is located in the middle of the distribution and the two-tailed p-value equals 1.

Left-tailed
$$p\text{-value}=p(X\le x)=\sum_{i=0}^{x}\binom{n}{i}p^iq^{n-i}$$
Example: x=1.
Since x < np, x is located on the left side of the distribution (1 < 8 × 0.25).
p-value = p(X ≤ 1) = 0.367081.

Right-tailed
$$p\text{-value}=p(X\ge x)=\sum_{i=x}^{n}\binom{n}{i}p^iq^{n-i}$$
Example: x=4.
Since x > np, x is located on the right side of the distribution (4 > 8 × 0.25).
p-value = p(X ≥ 4) = 1 - p(X ≤ 3) = 0.113815.

Two-tailed
Find the tail on one side based on x.
Find x' on the opposite tail: the value with the largest probability that is still less than or equal to the probability of x.
For example, if x is on the left tail:
$$p\text{-value}=p(X\le x) + p(X\ge x')\\ p\text{-value}=\sum_{i=0}^{x}\binom{n}{i}p^iq^{n-i} + \sum_{i=x'}^{n}\binom{n}{i}p^iq^{n-i}$$
Example: x=1.
On the left side: p(X=1) = 0.266968.
On the right side: p(X=3) = 0.207642 ≤ 0.266968, while p(X=2) = 0.311462 > 0.266968, so x' = 3.
p-value = p(X ≤ 1) + p(X ≥ 3) = 0.367081 + 0.321457 = 0.688538.

### Effect size

The tool calculates the h effect size.
$$\varphi(p)=2\arcsin(\sqrt{p})\\ h=\varphi(\hat{p})-\varphi(p_0)$$
Cohen's interpretation for the h effect size:
Small effect - 0.2.
Medium effect - 0.5.
Large effect - 0.8.
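The h effect size is easy to compute by hand or in code. The sketch below uses only the Python standard library; the function names are illustrative, and treating Cohen's three values as lower bounds of bands (with anything under 0.2 labeled "below small") is an assumption of this sketch rather than something the tool states.

```python
from math import asin, sqrt

def phi(p: float) -> float:
    """Arcsine transformation: phi(p) = 2 * arcsin(sqrt(p))."""
    return 2 * asin(sqrt(p))

def cohen_h(p_a: float, p_b: float) -> float:
    """h = phi(p_a) - phi(p_b), e.g. the sample proportion vs. p0."""
    return phi(p_a) - phi(p_b)

def interpret_h(h: float) -> str:
    """Map |h| to Cohen's labels (banding is an assumption, see above)."""
    size = abs(h)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"

# Arbitrary illustrative inputs: sample proportion 0.5 against p0 = 0.25.
h = cohen_h(0.5, 0.25)
print(h, interpret_h(h))
```

With these inputs, φ(0.5) = π/2 and φ(0.25) = π/3, so h = π/6 ≈ 0.52, which falls in the medium band.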

## Two Sample proportion test

We use this test to check whether the proportion of group 1 is the same as the proportion of group 2.
The tool's null hypothesis assumes that the difference between the groups' proportions is zero (using only the pooled variance).
Example: comparing the proportion of good oranges between two fields, based on a sample from each field. H0 assumes the proportions are identical.

### Assumptions

• Binomial distribution - the probability for an event within each group is identical

### Required sample data

Calculated based on a random sample from the entire population

• p̂1, p̂2 or x1, x2 - Sample proportions, or the numbers of successes.
If you enter a value between 0 and 1, the tool assumes you entered p̂; if you enter a number bigger than 1, the tool assumes you entered the number of successes x.
• n1, n2 - Sample sizes

### Test statistic

Normal approximation
The distributions of X1 and X2 are binomial.
Under H0, the difference between p̂1 and p̂2 is assumed to be distributed with a mean of 0, which gives the following normal statistic:
$$z=\frac{(\hat{p}_1-\hat{p}_2)-0}{\sigma}$$
There are two ways to calculate the standard deviation σ, depending on the null assumption.

Pooled variance
When H0 assumes p1 - p2 = 0, the standard deviation of the binomial distribution depends on p, so this assumption also implies that the standard deviation is identical for the two samples. Hence we calculate the pooled variance, based on the two samples together.
$$\hat{p}=\frac{x_1+x_2}{n_1+n_2}=\frac{n_1\hat{p}_1+n_2\hat{p}_2}{n_1+n_2}\\ Var_{pooled}=Var_1+Var_2\\ Var_{pooled}=\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}$$
Since we assume that p1 = p2, both are estimated by the pooled proportion p̂:
$$Var_{pooled}=\frac{\hat{p}(1-\hat{p})}{n_1}+\frac{\hat{p}(1-\hat{p})}{n_2}\\ \sigma_{pooled}=\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}\\ z_{pooled}=\frac{(\hat{p}_1-\hat{p}_2)+c}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}$$

Unpooled variance
The tool doesn't calculate the unpooled variance.
When H0 assumes p1 - p2 = d with d ≠ 0, the standard deviation of the binomial distribution depends on p, so this assumption also implies that the standard deviation is not identical for the two samples. We therefore need to calculate the combined variance of two independent random variables:
$$Var_{unpooled}=Var_1+Var_2\\ \sigma_{unpooled}=\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1}+\frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}\\ z_{unpooled}=\frac{(\hat{p}_1-\hat{p}_2-d)+c}{\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1}+\frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}}$$
Since the normal distribution is continuous and the binomial distribution is discrete, we may use the continuity correction c to slightly improve the result. The correction moves the observed difference toward zero:
$$\text{Right-tailed, or two-tailed with }\hat{p}_1\gt\hat{p}_2:\; c=-\left(\frac{1}{2n_1}+\frac{1}{2n_2}\right)\\ \text{Left-tailed, or two-tailed with }\hat{p}_1\lt\hat{p}_2:\; c=+\left(\frac{1}{2n_1}+\frac{1}{2n_2}\right)$$
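The pooled statistic can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the tool's implementation: the function name and the `tail` parameter are hypothetical, the continuity correction is taken to move the difference toward zero (negative for a right-tailed test, positive for a left-tailed test, by the sign of the difference for a two-tailed test), and, mirroring the one-sample rule, no correction is applied in the two-tailed case when the difference is smaller than the correction itself.

```python
from math import erf, sqrt

def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def two_sample_z_test(x1, n1, x2, n2, tail="two", continuity=True):
    """Pooled two-sample proportion z-test; returns (z, p_value).

    tail is "left", "right" or "two" (hypothetical parameter name).
    """
    if tail not in ("left", "right", "two"):
        raise ValueError("tail must be 'left', 'right' or 'two'")
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    sigma = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if sigma == 0:
        raise ValueError("pooled proportion is 0 or 1; z is undefined")

    diff = p1 - p2
    c = 0.0
    if continuity:
        size = 1 / (2 * n1) + 1 / (2 * n2)
        if tail == "right":
            c = -size
        elif tail == "left":
            c = size
        elif abs(diff) > size:
            # Two-tailed: shrink the difference toward zero (assumption).
            c = -size if diff > 0 else size

    z = (diff + c) / sigma
    if tail == "left":
        p_value = normal_cdf(z)
    elif tail == "right":
        p_value = 1 - normal_cdf(z)
    else:
        p_value = 2 * (1 - normal_cdf(abs(z)))
    return z, p_value

# Arbitrary illustrative counts: 60 of 100 good oranges vs. 40 of 100.
print(two_sample_z_test(60, 100, 40, 100, tail="two"))
```

Passing `continuity=False` gives the uncorrected statistic, which makes it easy to see how much the correction changes z for a given pair of sample sizes.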

### Effect size

The tool calculates the h effect size.
$$\varphi(p)=2\arcsin(\sqrt{p})\\ h=\varphi(\hat{p}_1)-\varphi(\hat{p}_2)$$
Cohen's interpretation for the h effect size:
Small effect - 0.2.
Medium effect - 0.5.
Large effect - 0.8.