The Mann-Whitney U test, also called the Wilcoxon rank-sum test, is a non-parametric test. It checks continuous or ordinal data for a significant difference between two independent groups. The test merges the data from the two groups, sorts the merged data by value, and ranks it. Unlike the t-test, which compares the groups' averages, the rank test compares the **entire distributions**.

When the two groups' distributions have a similar shape, the test will also compare the **median** of each group.

For a symmetrical distribution, the **median** equals the **average**.

The test is usually a fallback for a Two-sample T-test or Two-sample Z-test when the data doesn't meet the normality assumption or contains many outliers.

A t-test compares the means of the two groups, while a Mann-Whitney U test compares the ranks. If the two groups have a similar distribution curve, the test will also compare the **medians** of the two groups.

When the data is normally distributed, a Two-sample T-test is slightly more powerful than a Mann-Whitney U test: the Mann-Whitney U test has about 95% efficiency in comparison to a Two-sample T-test. If the population is similar to a normal distribution and reasonably symmetric, it is better to use a Two-sample T-test, which compares the means of the two groups.

- **Not normal** - the data is not normally distributed and the sample size is less than 30.
- **Ordinal data** - the data is ordinal but not interval scaled. You know the order but not the differences between the values, for example: unhappy, neutral, happy.
- **Outliers** - the test is more robust to outliers than the T-test.

A higher rank sum, R_{i}, for group i (a lower U_{i}) means that the probability of getting a higher value from this group is higher.

The following example shows a case where **R_{B} > R_{C}** while **μ_{B} < μ_{C}**.

The rank of group

There is a high probability that a random value from group

The median of group

Group | Values | Average | Median | Rank over A | Rank over B | Rank over C |
---|---|---|---|---|---|---|
A | 1, 2, 3, 4, 20, 21, 22, 23, 24 | 13.33 | 20 | | 69.5 | 80 |
B | 14, 15, 17, 19, 20, 31, 32, 33, 35 | 24.00 | 20 | 101.5 | | 108 |
C | 13, 13, 13, 13, 13, 13, 13, 200, 300 | 65.67 | 13 | 91 | 63 | |
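The contrast between groups B and C can be checked with a few lines of Python. This is only an illustrative sketch: the helper name `prob_greater` is hypothetical, and counting a tie as half a pair is an assumption. It compares the averages and medians with the share of (B, C) pairs in which the B value is the larger one.

```python
from statistics import mean, median

# Groups B and C from the table above.
B = [14, 15, 17, 19, 20, 31, 32, 33, 35]
C = [13, 13, 13, 13, 13, 13, 13, 200, 300]

def prob_greater(x, y):
    """Share of (a, b) pairs with a > b; a tie counts as half a pair (assumption)."""
    wins = sum((a > b) + 0.5 * (a == b) for a in x for b in y)
    return wins / (len(x) * len(y))

print(round(mean(B), 2), round(mean(C), 2))  # averages of B and C
print(median(B), median(C))                  # medians of B and C
print(round(prob_greater(B, C), 3))          # share of pairs where B beats C
```

Every B value exceeds the seven 13s in C, so 63 of the 81 pairs favor B (about 0.78), even though the two extreme values give C the larger average.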

- **Independent observations**.
- **Ordinal / Continuous** - the compared data consist of ordinal data or continuous data.
- **Similar Shape** - if the test is required to compare the medians, both groups should have a similar shape. Otherwise, the test can compare only the ranks.

- Merge the data from the two groups to one group.
- Sort the data from low value to high value.
- Rank the merged list: the lowest value gets rank 1, the second lowest gets rank 2, etc.

When there is a tie group (several observations with an identical value), each observation in the group gets the average of the ranks of the entire group.

__Calculate the ranks__

R_{1i} - the rank of the i-th member in group 1.

R_{2i} - the rank of the i-th member in group 2.

n_{1} - the number of observations in group 1.

n_{2} - the number of observations in group 2.

$$R_1=\sum_{i=1}^{n_1}{R_{1 i}}$$ $$R_2=\sum_{i=1}^{n_2}{R_{2 i}}$$

__Calculate Ui__

$$U_1=n_1n_2+\frac{n_1(n_1+1)}{2} - R_1\\ U_2=n_1n_2+\frac{n_2(n_2+1)}{2} - R_2\\ (U_1+U_2=n_1n_2)$$

Since the distribution is symmetrical, U is usually the minimum of U_{1} and U_{2}. $$U=min(U_1,U_2)$$ This works for the two-tailed test, but for the one-tailed test it always assumes the following H_{1}: the sample with the larger values is bigger than the sample with the smaller values.

In this tool, the statistic is U_{2}. This way, we can calculate the left-tailed or the right-tailed test like any other test.
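The ranking and the U formulas above can be sketched in plain Python. The function names `midranks` and `u_statistics` are hypothetical, and the data is taken from the worked example later in this article; this is an illustration of the formulas, not the tool's own code.

```python
def midranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is identical (a tie group).
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + 1 + end + 1) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    return ranks

def u_statistics(group1, group2):
    """Rank sums and U values, using the U_1 and U_2 formulas above."""
    n1, n2 = len(group1), len(group2)
    ranks = midranks(list(group1) + list(group2))
    r1 = sum(ranks[:n1])
    r2 = sum(ranks[n1:])
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return r1, r2, u1, u2

# Data of the worked example further down in this article.
a = [4, 7, 8, 9, 13, 13, 17, 11]
b = [23, 6, 3, 24, 17, 14, 24, 29, 13, 33]
r1, r2, u1, u2 = u_statistics(a, b)
print(r1, r2, u1, u2)  # 54.5 116.5 61.5 18.5
```

The two U values always add up to n_{1}n_{2}, which is a quick sanity check on the ranking.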

The exact calculation builds the distribution, p(U<u), under the null assumption of an equal probability to get a higher value from Group_{1} or from Group_{2}. It calculates U for all the combinations:

$$p(U = u) = \frac{\text{Combinations with } U=u}{\text{All combinations}}$$ The method requires high calculation power: the calculation duration grows exponentially with the total sample size (n_{1} + n_{2}).

The calculation is accurate only when the data does not have ties. The tool uses pre-calculated data to save computation time.
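For very small samples, the "all combinations" idea can be tried directly. The sketch below is a brute-force illustration with a hypothetical function name; the group sizes 3 and 4 are an arbitrary small choice, and the tool itself relies on pre-calculated data rather than on code like this.

```python
from collections import Counter
from itertools import combinations
from math import comb

def exact_u_distribution(n1, n2):
    """Count, for every u, the rank assignments whose U equals u.

    Under the null assumption every choice of n1 ranks (out of n1 + n2)
    for group 1 is equally likely. The data is assumed to have no ties.
    """
    counts = Counter()
    for ranks1 in combinations(range(1, n1 + n2 + 1), n1):
        # U computed from the rank sum of group 1.
        u = n1 * n2 + n1 * (n1 + 1) // 2 - sum(ranks1)
        counts[u] += 1
    return counts

counts = exact_u_distribution(3, 4)
total = comb(3 + 4, 3)  # number of equally likely combinations
for u in sorted(counts):
    print(u, counts[u], round(counts[u] / total, 4))
```

The loop visits every combination, which is why this approach becomes impractical quickly as n_{1} + n_{2} grows.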

To get more accurate results from the normal approximation, the tool uses the ties correction and, optionally, the continuity correction. A tie is a group of observations with the same value. $$ Z = \frac { U_2 - \mu_u + C_{continuity}} {\sigma_u}$$ $$ \mu_u= \frac {n_1 n_2} {2}$$ $$ \sigma_u^2= \frac {n_1 n_2(n_1 + n_2 + 1)} {12} (1 - C_{ties})$$

$$n = n_1 + n_2\\ C_{ties} = \sum_{i=1}^{t}{\frac{f_i^3-f_i}{n^3-n}}$$

t - the number of tie groups.

f_i - the number of values in tie group i.
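As a rough sketch, the ties correction and the corrected standard deviation can be computed as follows. The function names are hypothetical, and the merged list is the data of the worked example later in this article.

```python
from collections import Counter
from math import sqrt

def ties_correction(values):
    """C_ties = sum over the tie groups of (f^3 - f) / (n^3 - n)."""
    n = len(values)
    group_sizes = Counter(values).values()
    # A value that appears once contributes f^3 - f = 0, so no filtering is needed.
    return sum(f**3 - f for f in group_sizes) / (n**3 - n)

def sigma_u(n1, n2, c_ties):
    """Standard deviation of U under the null assumption, corrected for ties."""
    return sqrt(n1 * n2 * (n1 + n2 + 1) / 12 * (1 - c_ties))

# Merged data of the worked example (8 + 10 observations).
merged = [4, 7, 8, 9, 13, 13, 17, 11, 23, 6, 3, 24, 17, 14, 24, 29, 13, 33]
c = ties_correction(merged)
print(round(c, 5), round(sigma_u(8, 10, c), 2))  # 0.00619 11.22
```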

When using a continuous distribution to approximate discrete data, it is better to use the continuity correction.

P(X < a) => P(X < a - 0.5)

P(X > a) => P(X > a + 0.5)

As a result:

- **Right tail**, or two tails with positive Z (U_{2} > μ): C_{continuity} = **-**0.5.
- **Left tail**, or two tails with negative Z (U_{2} < μ): C_{continuity} = 0.5.

When we don't correct the data, C_{continuity} = 0.
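A minimal sketch of the Z calculation with both corrections is shown below. The function and parameter names (`continuity_term`, `z_statistic`, `tail`, `use_continuity`) are hypothetical, the continuity rule follows the description above, and the input numbers come from the worked example later in this article.

```python
from math import erfc, sqrt

def normal_cdf(z):
    """Standard normal P(Z <= z), computed with the complementary error function."""
    return 0.5 * erfc(-z / sqrt(2))

def continuity_term(u2, mu, tail):
    """C_continuity following the rule above; tail is 'left', 'right' or 'two'."""
    if tail == "right" or (tail == "two" and u2 > mu):
        return -0.5
    if tail == "left" or (tail == "two" and u2 < mu):
        return 0.5
    return 0.0  # two-tailed with U_2 exactly equal to the mean

def z_statistic(u2, n1, n2, c_ties=0.0, tail="two", use_continuity=True):
    """Z for U_2 with the ties correction and an optional continuity correction."""
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12 * (1 - c_ties))
    c = continuity_term(u2, mu, tail) if use_continuity else 0.0
    return (u2 - mu + c) / sigma

# Worked-example values: U_2 = 18.5, n1 = 8, n2 = 10, C_ties = 36/5814.
z = z_statistic(18.5, 8, 10, c_ties=36 / 5814, tail="two")
p_two_tailed = 2 * min(normal_cdf(z), 1 - normal_cdf(z))
print(round(z, 3), round(p_two_tailed, 3))
```

For these inputs Z comes out at about -1.872 and the two-tailed p-value at about 0.061. Passing `use_continuity=False` corresponds to C_{continuity} = 0.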

The tool will use the exact method or the normal approximation per the method definition:

**This is the recommended method!**

When n is small, n1 ≤ 20 and n2 ≤ 20, and **the data doesn't have ties**, the tool will use the exact value from the pre-calculated data. Otherwise, the tool will use the normal approximation.

Using the 'Exact' method forces the tool to use the exact method even when the data has ties.

When n is small, n1 ≤ 20 and n2 ≤ 20, the tool will use the exact value from tables. Otherwise, the tool will use the normal approximation.

The tool uses the normal approximation.
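The selection rules above can be summarized in a small function. This is a sketch only: the function name and the option labels `"automatic"`, `"exact"` and `"normal"` are hypothetical stand-ins for the three options, and the tool may name or implement them differently.

```python
def choose_method(n1, n2, has_ties, method="automatic"):
    """Return 'exact' or 'normal' following the rules described above.

    The labels 'automatic', 'exact' and 'normal' are hypothetical names
    for the three options.
    """
    small = n1 <= 20 and n2 <= 20
    if method == "normal":
        return "normal"
    if method == "exact":
        # 'Exact' ignores ties, but still needs small samples.
        return "exact" if small else "normal"
    if method == "automatic":
        return "exact" if small and not has_ties else "normal"
    raise ValueError(f"unknown method: {method}")

print(choose_method(8, 10, has_ties=True))                  # normal
print(choose_method(8, 10, has_ties=True, method="exact"))  # exact
```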

The **common language effect size** is the probability that a random value from Group_{1} is greater than a random value from Group_{2}.

$$f=\frac{U_1}{n_1n_2}\\ r=\frac{Z}{\sqrt{n_1+n_2}}$$

The following example checks the number of questions answered correctly by two independent groups. One group completed training before taking the test and the other group didn't do the training. The test results follow. The sample sizes: n1=8, n2=10. The significance level (α) is 0.05.

A | B |
---|---|
4 | 23 |
7 | 6 |
8 | 3 |
9 | 24 |
13 | 17 |
13 | 14 |
17 | 24 |
11 | 29 |
 | 13 |
 | 33 |

I prefer the indirect method which I find easier with big samples, and not much more complex with small samples.

- Merge the lists of the two groups to one list.
- Sort by the value, the smallest value first.

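Ties (repeated values) in the merged list: 13, 17 and 24.

Simple Rank | A | Merge | B |
---|---|---|---|
1 | | 3 | 1 |
2 | 2 | 4 | |
3 | | 6 | 3 |
4 | 4 | 7 | |
5 | 5 | 8 | |
6 | 6 | 9 | |
7 | 7 | 11 | |
8 | 9 | 13 | |
9 | 9 | 13 | |
10 | | 13 | 9 |
11 | | 14 | 11 |
12 | 12.5 | 17 | |
13 | | 17 | 12.5 |
14 | | 23 | 14 |
15 | | 24 | 15.5 |
16 | | 24 | 15.5 |
17 | | 29 | 17 |
18 | | 33 | 18 |
total | 54.5 | | 116.5 |

- Simple Rank - rank by the value: the lowest value gets rank 1, the second lowest gets rank 2, etc.
- Rank (columns A and B) - usually the same as the Simple Rank. When the same value repeats (a tie), the rank is the average of the simple ranks.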

The value 13 repeats 3 times. $$\frac{8+9+10}{3}=9$$

The value 17 repeats 2 times. $$\frac{12+13}{2}=12.5$$

The value 24 repeats 2 times. $$\frac{15+16}{2}=15.5$$

R_{1} = 2 + 4 + 5 + 6 + 7 + 9 + 9 + 12.5 = 54.5

R_{2} = 1 + 3 + 9 + 11 + 12.5 + 14 + 15.5 + 15.5 + 17 + 18 = 116.5

__Calculate Ui__

$$U_1 = n_1n_2 + \frac{n_1(n_1+1)}{2} - R_1\\ U_1 = 8*10+\frac{8*(8+1)}{2} - 54.5 = 61.5 \\ U_2=n_1n_2+\frac{n_2(n_2+1)}{2} - R_2 \\ U_2 =8*10+\frac{10*(10+1)}{2} - 116.5 = 18.5 \\ (U_1+U_2=61.5 + 18.5 = 80, \quad n_1n_2=8*10 = 80)$$

U = min(61.5, 18.5) = 18.5

- Merge the lists of the two groups to one list.
- Sort by the value, the smallest value first.

Simple Rank | A | Merge | B |
---|---|---|---|
1 | | 3 | 0 |
2 | 1 | 4 | |
3 | | 6 | 1 |
4 | 2 | 7 | |
5 | 2 | 8 | |
6 | 2 | 9 | |
7 | 2 | 11 | |
8 | 2.5 | 13 | |
9 | 2.5 | 13 | |
10 | | 13 | 6 |
11 | | 14 | 7 |
12 | 4.5 | 17 | |
13 | | 17 | 7.5 |
14 | | 23 | 8 |
15 | | 24 | 8 |
16 | | 24 | 8 |
17 | | 29 | 8 |
18 | | 33 | 8 |
total | 18.5 | | 61.5 |

- For each value, count how many values from the other group are smaller.
- Tie - count 0.5 for each value from the other group with the same value.
- Group a (blue):

Rank 2 - there is one red (b) value smaller than 4: 3. Fill 1 in column a.

Rank 4 - there are two red (b) values smaller than 7: 3, 6. Fill 2 in column a.

Ranks 5, 6, 7 - the same as Rank 4.

Ranks 8, 9 - there are two red (b) values smaller than 13: 3, 6, and one equal red value: 2 + 0.5 = 2.5. Fill 2.5 in column a.

Rank 12 - there are 4 red (b) values smaller than 17: 3, 6, 13, 14, and one equal red value: 4 + 0.5 = 4.5. Fill 4.5 in column a.

Total of column a = 1+2+2+2+2+2.5+2.5+4.5 = 18.5, which matches the U_{2} obtained from the rank formulas.

- Group b (red):

Rank 1 - there is no blue (a) value smaller than 3. Fill 0 in column b.

Rank 3 - there is one blue (a) value smaller than 6: 4. Fill 1 in column b.

Rank 10 - there are 5 blue (a) values smaller than 13: 4, 7, 8, 9, 11, and 2 equal values: 5 + 2 * 0.5 = 6. Fill 6 in column b.

Rank 11 - there are 7 blue (a) values smaller than 14: 4, 7, 8, 9, 11, 13, 13. Fill 7 in column b.

Rank 13 - there are 7 blue (a) values smaller than 17: 4, 7, 8, 9, 11, 13, 13, and one equal value: 7 + 0.5 = 7.5. Fill 7.5 in column b.

Ranks 14, 15, 16, 17, 18 - all 8 blue (a) values are smaller than each of these values (23, 24, 24, 29, 33). Fill 8 in column b.

Total of column b = 0+1+6+7+7.5+8+8+8+8+8 = 61.5, which matches the U_{1} obtained from the rank formulas.
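The direct counting described above is easy to reproduce in code. The helper name `pair_counts` is hypothetical; the sketch follows the counting rule literally (1 for each smaller value in the other group, 0.5 for each equal value).

```python
def pair_counts(group, other):
    """For each value in `group`, count the smaller values in `other`.

    A value in `other` that is equal (a tie) counts as 0.5.
    """
    return [
        sum(1.0 if o < g else 0.5 if o == g else 0.0 for o in other)
        for g in group
    ]

a = [4, 7, 8, 9, 13, 13, 17, 11]
b = [23, 6, 3, 24, 17, 14, 24, 29, 13, 33]
column_a = pair_counts(a, b)
column_b = pair_counts(b, a)
print(sum(column_a), sum(column_b))  # 18.5 61.5
```

The two totals add up to n_{1}n_{2} = 80, the same check used with the rank-based calculation.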

When the statistic equals the critical value, you should reject the null assumption.

Since the data in this example contains ties, the exact calculation is not accurate. Therefore, you should use the normal approximation with the continuity correction.

**Critical Value**

P(X≤17)=0.02171. 0.02171<0.025.

P(X≤18)=0.02726. 0.02726>0.025.

The left critical value is **17**, and the left edge of the region of acceptance is **18**.

P(X≥63)=1 - P(X≤62)=0.02171. 0.02171<0.025.

P(X≥62)=1 - P(X≤61)=0.02726. 0.02726>0.025.

The right critical value is **63**, and the right edge of the region of acceptance is **62**.

You may also calculate it as follows: right = n_{1}n_{2} - left (8*10 - 17 = 63).

Tables

Check the two-tailed statistic table, for α=0.05, n1=8, n2=10.

The critical U is 17.

**P-value**

2*P(X≤18.5)=2*P(X≤18)=0.05453

Tables

For α=0.05, critical U is 17.

For α=0.1, critical U is 20.

Since 18.5 is between 17 and 20, the p-value will be between 0.05 and 0.1 .
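One way to estimate a p-value between two tabulated points is to interpolate linearly in log(p). The sketch below assumes this scheme; the function name is hypothetical, and whether the tool uses exactly this formula is an assumption.

```python
from math import exp, log

def log_interpolate_p(u, u_low, p_low, u_high, p_high):
    """Interpolate a p-value between two table points, linearly in log(p)."""
    weight = (u - u_low) / (u_high - u_low)
    return exp(log(p_low) + weight * (log(p_high) - log(p_low)))

# Table points from the text: critical U = 17 at alpha = 0.05 and 20 at alpha = 0.1.
p = log_interpolate_p(18.5, 17, 0.05, 20, 0.1)
print(round(p, 4))  # 0.0707
```

Because 18.5 lies halfway between 17 and 20, the result is the geometric mean of 0.05 and 0.1.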

The tool will do a logarithmic interpolation: p-value ≈ 0.0707

**Decision**

Since p-value > α (0.0707 > 0.05), or alternatively since U > U_{critical} (18.5 > 17), accept H_{0}.

**Website**

The website uses **U2** instead of U.

Left critical U = 17.

Right critical U = n_{1}n_{2}- 17 = 8 * 10 - 17 = 63.

Since U2 (18.5) is inside the region of acceptance, [18, 62], accept H_{0}. When U2 = 17 or 63, the statistic equals a critical value and you reject H_{0}.

**Critical Value**

P(X≤20)=0.0416. 0.0416<0.05.

P(X≤21)=0.0506. 0.0506>0.05.

The left critical value is **20**, and the left edge of the region of acceptance is 21.

Tables

Check the **two**-tailed statistic table, for α = 2 * 0.05 = 0.1, n1=8, n2=10.

The critical U is 20.

**P-value**

P-value = p-value(Two tailed) / 2 = 0.0707 / 2 = 0.0354

**Decision**

Since p-value < α (0.0354 < 0.05) or alternatively since U2 < U_{critical}(18.5 < 20), reject H_{0}.

**Critical Value**

P(X≥60)=1 - P(X≤59)=0.04157. 0.04157<0.05.

P(X≥59)=1 - P(X≤58)=0.0506. 0.0506>0.05.

The right critical value is **60**, and the right edge of the region of acceptance is 59.

Tables

Check the **two**-tailed statistic table, for α = 2 * 0.05 = 0.1, n1=8, n2=10.

The value in the table is 20.

The critical U is n_{1}n_{2} - value from the table = 8 * 10 - 20 = 60.

**P-value**

P-value = 1 - p-value(Two tailed) / 2 = 1 - 0.0707 / 2 = 0.9646

**Decision**

Since p-value > α (0.9646 > 0.05), or alternatively since U2 < U_{critical} (18.5 < 60), accept H_{0}.

- There are 3 tie groups (t=3):

$$group_1: [13,13,13], \quad f_1=3\\ group_2: [17,17], \quad f_2=2\\ group_3: [24,24], \quad f_3=2$$

$$n=n_1+n_2=8+10=18 \\ C_{ties} = \sum_{i=1}^{t}{\frac{f_i^3-f_i}{n^3-n}}\\ C_{ties} = {\frac{(3^3-3)+(2^3-2)+(2^3-2)}{18^3-18}}\\ C_{ties} =\frac{36}{5814}=\frac{2}{323}=0.00619$$

- $$ \mu_u= \frac {n_1n_2}{2}=\frac {8*10}{2}=40 $$ $$ \sigma_u^2= \frac {n_1 n_2(n_1 + n_2 + 1)} {12} (1 - C_{ties})\\ \sigma_u^2= \frac {8*10(8 + 10 + 1)} {12} (1 - 0.00619)\\ \sigma_u^2= 125.8826, \quad \sigma_u = 11.22$$

Since the data is discrete and U2 < μ_u, C_{continuity} = 0.5. $$ Z = \frac { U_2 - \mu_u + C_{continuity}} {\sigma_u} = \frac { 18.5 - 40 + 0.5} {11.22} = -1.872$$

- P( z ≤ Z) = P( z ≤ -1.872) = 0.0306

- P-value = 2 * 0.0306 = 0.0612
- Since 0.0612 > 0.05, accept H_{0}.

- P-value = P( z ≤ -1.872) = 0.0306
- Since 0.0306 < 0.05, reject H_{0}.

- P-value = P( z ≥ -1.872) = 1 - P( z ≤ -1.872) = 1 - 0.0306 = 0.9694
- Since 0.9694 > 0.05, accept H_{0}.

U_{2} | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | Total |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Number of combinations | 1 | 1 | 2 | 3 | 4 | 4 | 5 | 4 | 4 | 3 | 2 | 1 | 1 | 35 |
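The counts in this table are consistent with group sizes of 3 and 4 (35 = C(7,3)); that reading is an inference from the totals, not something the table states. The sketch below uses the standard counting recurrence on the largest observation, with hypothetical function names, and assumes data without ties. Unlike a full enumeration, it stays fast for the sample sizes of the worked example (n1=8, n2=10).

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def count_u(u, m, n):
    """Number of orderings of m + n untied values that give U = u."""
    if u < 0:
        return 0
    if m == 0 or n == 0:
        return 1 if u == 0 else 0
    # The largest value belongs either to the group of size m (adds n to U)
    # or to the group of size n (adds nothing).
    return count_u(u - n, m - 1, n) + count_u(u, m, n - 1)

def cdf_u(u, m, n):
    """P(U <= u) under the null assumption, without ties."""
    return sum(count_u(k, m, n) for k in range(u + 1)) / comb(m + n, m)

print([count_u(u, 3, 4) for u in range(13)])
print(round(cdf_u(17, 8, 10), 5), round(cdf_u(18, 8, 10), 5))
```

The first line printed should reproduce the row of counts above, and the two cumulative probabilities should fall on either side of 0.025, in line with the two-tailed critical value of 17 used in the worked example.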