What is a null hypothesis?
First, we need to learn the concept of a null hypothesis. In statistics, we often wish to test a null hypothesis against an alternative hypothesis using a dataset. For example, in a clinical trial of a new drug, the null hypothesis might be that the new drug is no better, on average, than the current drug. We would write
H0: there is no difference between the two drugs on average.
The alternative hypothesis might be that the new drug has a different effect, on average, compared to that of the current drug. We would write
H1: the two drugs have different effects, on average.
The alternative hypothesis might also be that the new drug is better, on average, than the current drug. In this case we would write
H1: the new drug is better than the current drug, on average.
The p value can be interpreted in terms of a hypothetical repetition of the study. Suppose the null hypothesis is true and a new dataset is obtained independently of the first dataset but using the same sampling procedure. If the new dataset is used to calculate a new value of the test statistic (same formula but new data), what is the probability that the new value will be at least as extreme as the original value? This probability is the p value. Do not interpret the p value as the probability that your null hypothesis is actually correct: it is calculated on the assumption that the null hypothesis is true, and there is no way to prove or disprove a hypothesis with absolute certainty.
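The "hypothetical repetition" idea can be made concrete with a small simulation. The sketch below uses a permutation test, which is a different technique from the t-test but makes the same logic visible: if the null hypothesis is true, the group labels are interchangeable, so we shuffle them many times and count how often the reshuffled difference in means is at least as large as the observed one. The function name, the number of resamples, and the blood pressure figures are all made up for illustration.

```python
import random
from statistics import fmean

def permutation_p_value(group_a, group_b, n_resamples=10_000, seed=0):
    """Two-sided permutation p value for a difference in means (illustrative sketch)."""
    rng = random.Random(seed)
    observed = abs(fmean(group_a) - fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the labels carry no information,
        # so any reshuffling of the pooled data is equally plausible.
        rng.shuffle(pooled)
        diff = abs(fmean(pooled[:n_a]) - fmean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction: the observed arrangement counts as one resample,
    # so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_resamples + 1)

# Hypothetical reductions in blood pressure (mmHg), invented for this example.
new_drug = [12.1, 9.8, 14.3, 11.0, 13.5, 10.2, 12.8, 11.7]
current_drug = [8.9, 10.1, 7.5, 9.4, 11.2, 8.0, 9.9, 8.6]

p = permutation_p_value(new_drug, current_drug)
print(f"permutation p value: {p:.4f}")
```

A small result here says only that a difference this large would be rare if the two drugs were equivalent. It does not give the probability that the null hypothesis is true.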
In the above example, we would measure the effect of the drugs in two randomly selected groups of patients. The effect could be a lowering of blood pressure, so we measure the blood pressure in both groups and perform a t-test. A t-test compares the means of two groups of data to infer whether a difference exists between them. As part of the results, we get a p value, which is interpreted as follows:
p > 0.10 No evidence against the null hypothesis. The data appear to be consistent with the null hypothesis.
0.05 < p < 0.10 Weak evidence against the null hypothesis in favor of the alternative.
0.01 < p < 0.05 Moderate evidence against the null hypothesis in favor of the alternative.
0.001 < p < 0.01 Strong evidence against the null hypothesis in favor of the alternative.
p < 0.001 Very strong evidence against the null hypothesis in favor of the alternative.
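The scale above is easy to turn into a lookup. This is a minimal sketch; the function name is hypothetical, and because the scale uses strict inequalities and leaves the exact boundary values (0.10, 0.05, 0.01, 0.001) unassigned, the code assumes a boundary value falls into the weaker category.

```python
def evidence_against_null(p):
    """Map a p value to the verbal scale above (boundary handling is an assumption)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p value must lie between 0 and 1")
    if p < 0.001:
        return "very strong"
    if p < 0.01:
        return "strong"
    if p < 0.05:
        return "moderate"
    if p < 0.10:
        return "weak"
    return "none"

for p in (0.30, 0.07, 0.03, 0.005, 0.0002):
    print(p, "->", evidence_against_null(p))
```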
In most studies, we pick (somewhat arbitrarily) a significance level of 0.05 (or, equivalently, 5%). That is, if the p value is below 0.05, meaning that a result at least this extreme would occur less than 5% of the time if the null hypothesis were true, we reject the null hypothesis. That does not necessarily mean that the alternative hypothesis is right. The significance level of a statistical hypothesis test is a fixed probability of wrongly rejecting the null hypothesis H0 if it is in fact true. We want the significance level to be small in order to protect the null hypothesis and to prevent, as far as possible, the investigator from inadvertently making false claims.
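To see what "probability of wrongly rejecting the null hypothesis" means in practice, we can simulate many studies in which the null hypothesis is true by construction and count how often a test at the 0.05 level rejects it. The sketch below is a simplification: it uses a z-test with a known standard deviation of 1 instead of a t-test, and the function name, sample sizes, and number of trials are arbitrary choices made for illustration.

```python
import random
from statistics import NormalDist

def false_positive_rate(alpha=0.05, n_trials=2000, n_per_group=30, seed=1):
    """Fraction of simulated null-true studies in which H0 is (wrongly) rejected."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Standard error of the difference in means when both groups have sigma = 1.
    standard_error = (2 / n_per_group) ** 0.5
    rejections = 0
    for _ in range(n_trials):
        # Both groups come from the same distribution, so H0 is true.
        group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        z = (sum(group_a) / n_per_group - sum(group_b) / n_per_group) / standard_error
        p = 2 * (1 - std_normal.cdf(abs(z)))  # two-tailed p value
        if p < alpha:
            rejections += 1
    return rejections / n_trials

rate = false_positive_rate()
print(f"share of true null hypotheses rejected at the 0.05 level: {rate:.3f}")
```

The printed share should land close to 0.05: even when there is no real difference, about one study in twenty crosses the threshold by chance.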
What are one-tail and two-tail p values?
A one-tailed test is a statistical hypothesis test in which the values for which we can reject the null hypothesis, H0, are located entirely in one tail of the probability distribution. In other words, the critical region for a one-tailed test is either the set of values less than the critical value of the test or the set of values greater than the critical value of the test. A two-tailed test is a statistical hypothesis test in which the values for which we can reject the null hypothesis, H0, are located in both tails of the probability distribution. In other words, the critical region for a two-tailed test is the set of values less than a first critical value of the test together with the set of values greater than a second critical value of the test.
Considering our example of the clinical trial, a two-tailed p value tests only whether the effects of the two drugs differ, without saying whether the new drug is better or worse than the current drug. A one-tailed p value tests specifically whether the effect of the new drug is larger than that of the current drug (see the second alternative hypothesis).
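The relationship between the two kinds of p value is easiest to see for a test statistic that follows a standard normal distribution when the null hypothesis is true. The sketch below assumes such a z statistic, computed as (new drug minus current drug), so that "the new drug is better" corresponds to the upper tail. The function name and dictionary keys are hypothetical.

```python
from statistics import NormalDist

def tail_p_values(z):
    """One-tailed and two-tailed p values for a standard normal test statistic."""
    std_normal = NormalDist()
    upper = 1 - std_normal.cdf(z)  # H1: effect is greater (upper tail only)
    lower = std_normal.cdf(z)      # H1: effect is smaller (lower tail only)
    # For a symmetric distribution, the two-tailed p value doubles the smaller tail.
    two_tailed = 2 * min(upper, lower)
    return {"greater": upper, "less": lower, "two_tailed": two_tailed}

for name, value in tail_p_values(1.96).items():
    print(f"{name}: {value:.4f}")
```

For z = 1.96 the upper-tail p value is roughly 0.025 and the two-tailed p value roughly 0.05, so the same data can look twice as convincing under a one-tailed test. That is why the direction of the alternative hypothesis should be chosen before looking at the data.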