P-Value

P-value definition

The p-value is the probability, calculated under the assumption that H₀ is correct, of getting a result at least as extreme as the one observed.
When the p-value is small enough, less than α, we reject H₀. With this rule, the probability of a type I error (rejecting a correct H₀) is at most α.
In the following two-tailed example the p-value is greater than α, hence we cannot reject the null hypothesis.
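As a minimal sketch of this decision rule, the snippet below computes a two-tailed p-value from a t statistic and compares it with α. The statistic (1.5) and the degrees of freedom (20) are hypothetical values chosen only for illustration; they are not taken from the example.

```r
# Hypothetical inputs, for illustration only
t_stat <- 1.5   # observed t statistic (assumed)
dof    <- 20    # degrees of freedom (assumed)
alpha  <- 0.05  # significance level

# Two-tailed p-value: probability, under H0, of a statistic at least
# as extreme as the observed one in either tail
p_value <- 2 * pt(-abs(t_stat), df = dof)

# Reject H0 only when the p-value falls below alpha
reject_h0 <- p_value < alpha
print(p_value)
print(reject_h0)
```

With these made-up inputs the p-value is larger than α, so H₀ is not rejected.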

Is only the p-value important?

Every student learns how important it is to get a significant result, but the p-value is only one component in a package.
The "power" and the "effect size" are no less important.
All the examples below are fake; the made-up data was created only for demonstration. The significance level used is 0.05.

Example - High blood pressure

Several researchers looked for a treatment for high blood pressure.

1. Garlic helps

Research in New South Walls found that garlic helps to reduce blood pressure, with a p-value of 0.0071.
The test power was very strong, 0.996, for detecting a medium effect size of 0.5.

Result	Value
P-value	0.0071
Sample size	114
Test Power	0.9964
Standardized effect size	Small, 0.26
Difference	-0.109

But the standardized effect size is small (0.26), and the unstandardized effect size is almost meaningless (0.109): the treatment reduces blood pressure by only 0.109 mmHg.
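The power figure can be approximated with base R's power.t.test. This is only a sketch: it assumes a paired design with 114 pairs and a standardized effect of 0.5 (delta = 0.5 with sd = 1). The source does not say which significance level the power was computed at, so the snippet evaluates both 0.05 (the level stated above) and 0.01 (the level that would match conf.level = 0.99 in the t.test call below).

```r
# Power of a two-sided paired t-test to detect a medium effect (d = 0.5)
# with 114 pairs. Setting sd = 1 makes delta equal to the standardized effect.
power_at <- function(sig_level) {
  power.t.test(n = 114, delta = 0.5, sd = 1,
               sig.level = sig_level,
               type = "paired",
               alternative = "two.sided")$power
}

power_05 <- power_at(0.05)  # significance level stated in the text
power_01 <- power_at(0.01)  # assumption: level matching conf.level = 0.99

print(c(alpha_0.05 = power_05, alpha_0.01 = power_01))
```

With this many pairs the power is close to 1 at both levels, so the test can flag even very small differences as significant.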

Data

x1 <- c(145.2,154.9,159.8,170.4,158,160.2,161.5,152.8,140.6,133.2,121.8,133.5,125.7,127.2,143.5,124.6,122.3,136.1,144.1,154.8,161.2,132.3,122.3,133,146.6,121.4,133.6,139.4,132.7,121.3,119.6,141.2, 155.7,166.5,132.1,144,152.2,155.8,144.6,154.9,159.8,170.3,158.1,160.3,161.4,152.8,140.4,133.1,121.4,134.3,126.1,127,144,125,121.6,135.4,144.1,155,160.5,131.8,122.4,132.7,147.1,121.9,133.5,138.6,132.5,121,120,140.9,155.6,167.3,131.9,144,152.1,155.5,145,154.6,159.5,169.8,157.8,160.1,162.3,152.9,140.8,132.7,122.2,134.1,125.9,126.8,144.3,124.9,122,135.8,144.2,154.5,160.5,131.9,121.5,132.5,146.5,122.1,134,138.6,133.2,120.8,120,141.1,155.4,166.9,132.2,143.6,151.5,155.8)
x2 <-c(144.5,155.5,160.4,170.4,158.1,160.3,162.2,153.1,141,132.8,121.8,134.2,125.7,127.3,144.3,125,122.4,135.6,144.1,154.5,160.5,131.7,121.7,132.9,146.6,121.5,134.4,138.8,133.3,121.1,119.6,141.2,155.8, 167.1,132.2,143.9,152.3,155.7,145.5,154.6,159.9,170.1,157.8,160.3,162,153.3,141.1,133.2,121.7,133.8,126,127,143.7,124.9,121.6,136.3,143.8,155,160.7,132.3,121.8,132.9,146.5,122.1,133.7,138.7,132.6,121.5,119.7,140.9,156.3,167.4,131.6,144.2,151.9,156.5,145,155.1,160,170.4,157.8,159.9,161.7,153.2,140.5,133.2,122,134.2,126.3,126.6,143.6,124.8,122.5,135.7,144,155.3,161.2,132.5,122,133.3,146.9,122.1,133.9,138.8,132.6,120.5,120.1,140.8,156.4,166.9,131.8,144.1,152.2,156.4)
t.test(x1, x2, alternative = "two.sided", paired = TRUE, mu = 0, conf.level = 0.99)

2. Daily exercise doesn't help

Research in New Victoria found that half an hour of daily exercise doesn't help to reduce blood pressure, with a p-value of 0.06.

Result	Value
P-value	0.06
Sample size	5
Test Power	0.034
Standardized effect size	Large, 1.1
Difference	-2.1

But the test power was very weak, 0.034, for detecting a medium effect size of 0.5.
The standardized effect size is large (1.1), and the unstandardized effect size is meaningful (2.1): a reduction in blood pressure of 2.1 mmHg.

Data

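# Paired blood pressure measurements (mmHg), five subjects
x1 <- c(142.7, 154, 157.7, 171.2, 158.1)
x2 <- c(144.6, 155.6, 162.3, 170.6, 161.1)
# Paired two-sided t-test; the reported difference is the mean of x1 - x2
t.test(x1, x2, alternative = "two.sided", paired = TRUE, mu = 0, conf.level = 0.99)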
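The effect sizes quoted above can be recomputed from the five pairs. The sketch below assumes the standardized effect size is Cohen's d for paired data, that is, the mean of the differences divided by their standard deviation; the source does not state which formula was used.

```r
x1 <- c(142.7, 154, 157.7, 171.2, 158.1)
x2 <- c(144.6, 155.6, 162.3, 170.6, 161.1)

pair_diff <- x1 - x2                    # per-pair differences (mmHg)

mean_diff <- mean(pair_diff)            # unstandardized effect size
cohens_d  <- mean_diff / sd(pair_diff)  # standardized effect size (assumed formula)

print(round(mean_diff, 2))      # -2.1
print(round(abs(cohens_d), 2))  # 1.1
```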

3. The seven dwarfs believe in onion

The seven dwarfs no longer live together; each now lives in a different country.
When they reached the age of 140, they started suffering from high blood pressure. Since their mother used to tell them that onion cures any disease, each of them decided to conduct research, without telling his brothers.
Each dwarf used a sample size of 114. The results follow:

Dwarf	P-value
Doc	0.09
Grumpy	0.049
Happy	0.25
Sleepy	0.17
Bashful	0.31
Sneezy	0.22
Dopey	0.39

Six of the dwarfs were disappointed. They left their research in the drawer. Nobody would be interested to learn that onion doesn't help to reduce blood pressure.
Grumpy published an article in the journal "Nature Medicine".
Does onion help to reduce blood pressure? With seven studies of the same question, this is a multiple comparisons problem: one significant result out of seven is weak evidence.
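To put a number on the multiple comparisons problem, here is a minimal sketch. It assumes the seven studies are independent and all tested at α = 0.05, and it uses the Bonferroni correction from base R's p.adjust as one common adjustment; the source does not prescribe a particular method.

```r
p_values <- c(Doc = 0.09, Grumpy = 0.049, Happy = 0.25, Sleepy = 0.17,
              Bashful = 0.31, Sneezy = 0.22, Dopey = 0.39)
alpha <- 0.05

# Chance that at least one of seven independent tests is significant
# when onion has no effect at all
fwer <- 1 - (1 - alpha)^length(p_values)
print(round(fwer, 3))   # 0.302

# Bonferroni: multiply each p-value by the number of tests (capped at 1)
p_bonferroni <- p.adjust(p_values, method = "bonferroni")
print(p_bonferroni)

any_significant <- any(p_bonferroni < alpha)
print(any_significant)  # FALSE
```

After the adjustment Grumpy's result (0.049 × 7 = 0.343) is no longer significant.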

Conclusion

If you need to choose one of the three treatments, what would you choose?

Onion?

The sample size that each dwarf used was large, and most of the studies concluded that onion doesn't help. I wouldn't try the onion ...

Garlic or daily exercise?

The garlic treatment is significant, but the effect size is very small, so it isn't a useful treatment.

The daily exercise treatment is not significant, but there are several reasons to prefer this treatment.
1. The test power is weak, hence the test may not have enough power to reject an incorrect null hypothesis. A non-significant result therefore tells us little about whether the treatment works.
2. The p-value 0.06 is larger than the significance level 0.05, but 0.05 is not a "holy" number. You may choose a different significance level like 0.1 or 0.01 if you decide before the experiment that this is the appropriate risk. In practice, you usually don't have the freedom to choose α when all the researchers in your field use the same value.
3. The observed effect size is large. If H₀ were correct, a result at least this extreme would occur with a probability of only 0.06, and if the treatment does help, there is a high chance that it is a meaningful reduction.

The correct solution is to repeat the daily exercise research with a larger sample size.
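A rough way to plan the repeat study is to solve for the sample size with base R's power.t.test. The targets below are assumptions, not values from the source: a medium standardized effect of 0.5, 80% power, and the 0.05 significance level used in this article.

```r
# Power of the original design: 5 pairs, medium effect (d = 0.5)
current <- power.t.test(n = 5, delta = 0.5, sd = 1, sig.level = 0.05,
                        type = "paired", alternative = "two.sided")

# Pairs needed to reach the (assumed) 80% power for the same effect;
# leaving n unspecified makes power.t.test solve for it
planned <- power.t.test(delta = 0.5, sd = 1, sig.level = 0.05, power = 0.80,
                        type = "paired", alternative = "two.sided")

n_required <- ceiling(planned$n)  # round up to whole pairs
print(current$power)
print(n_required)
```

Choosing a smaller target effect or a higher target power raises the required number of pairs.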

Result	Garlic	Exercise
P-value	0.0071	0.06
Sample size	114	5
Test Power	0.9964	0.034
Standardized effect size	Small, 0.26	Large, 1.1
Difference	-0.109	-2.1
