# Part 7: Normality

In statistics, we use normality tests to determine whether a data set follows a normal distribution, or to compute how likely it is that the underlying random variable is normally distributed. As we have already learned, knowing whether the data follow a normal distribution is essential when choosing between parametric and nonparametric tests: if the data are normally distributed, we can use the more powerful parametric tests. Several tests are available to check for normality.

## Anderson-Darling test

The Anderson-Darling test uses the specific distribution being tested when calculating its critical values, which makes it a more sensitive test. For this reason, it is one of the most powerful statistical tools for detecting most departures from normality, and it is highly recommended by MaxStat.
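As a minimal sketch (not MaxStat's own implementation), the Anderson-Darling test can be run with SciPy, which reports the A² statistic together with critical values at several significance levels; the sample here is simulated for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
sample = rng.normal(loc=10.0, scale=2.0, size=200)  # simulated data

result = stats.anderson(sample, dist='norm')
# If A^2 exceeds the critical value at a given significance level,
# normality is rejected at that level.
for crit, sig in zip(result.critical_values, result.significance_level):
    reject = result.statistic > crit
    print(f"{sig:>4}% level: A2={result.statistic:.3f}, "
          f"critical={crit:.3f}, reject normality={reject}")
```

Unlike most of the other tests below, `scipy.stats.anderson` returns critical values rather than a p-value, so the decision is read off per significance level.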

## Shapiro-Wilk test

The Shapiro-Wilk test calculates a W statistic that tests whether a random sample comes from a normal distribution. Small values of W are evidence of departure from normality. This test has performed very well in comparison studies with other normality tests, and MaxStat recommends applying it together with the Anderson-Darling test. However, the Shapiro-Wilk test should not be applied if the data group contains many identical values, and it is best suited to data groups with fewer than 5000 values.
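A brief illustrative sketch of the Shapiro-Wilk test using SciPy (the sample is simulated, and the 0.05 threshold is just a conventional choice):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(size=500)  # well under the ~5000-value limit

w_stat, p_value = stats.shapiro(sample)
print(f"W = {w_stat:.4f}, p = {p_value:.4f}")
# A small W (and a p-value below the chosen alpha, e.g. 0.05)
# is evidence of departure from normality.
```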

## Kolmogorov-Smirnov test (KS)

The KS test has traditionally been used to test for normality. However, we now know that its performance at this task is very poor. The KS test is based only on the largest discrepancy between the cumulative distribution of the data and the normal distribution. Such a "single parameter approach" leads to insensitivity when assessing normality, and MaxStat provides the KS test simply for comparison and consistency with other computational approaches. The KS test should never be used as the sole procedure to test for normality.
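For comparison, here is a hedged sketch of a KS normality test with SciPy. Note that estimating the mean and standard deviation from the same sample (as done below) makes the p-value optimistic, which is one reason the plain KS test is weak for normality and the Lilliefors correction exists:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.normal(loc=5.0, scale=1.5, size=300)

# KS compares the empirical CDF against a fully specified normal CDF.
# D is the largest discrepancy between the two curves.
d_stat, p_value = stats.kstest(sample, 'norm',
                               args=(sample.mean(), sample.std(ddof=1)))
print(f"D = {d_stat:.4f}, p = {p_value:.4f}")
```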

## Lilliefors test

This test is a modification of the Kolmogorov-Smirnov test and is suited to the common case where the parameters of the normal distribution, the mean and the variance, are not known and have to be estimated from the data. The Lilliefors test is very sensitive to outliers.
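A minimal sketch of the Lilliefors test, assuming the `statsmodels` package is available (this is not MaxStat's implementation):

```python
import numpy as np
from statsmodels.stats.diagnostic import lilliefors

rng = np.random.default_rng(2)
sample = rng.normal(size=100)

# Lilliefors estimates the mean and variance from the sample and
# adjusts the KS critical values to account for that estimation.
ks_stat, p_value = lilliefors(sample, dist='norm')
print(f"KS = {ks_stat:.4f}, p = {p_value:.4f}")
```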

## Jarque-Bera test

The Jarque-Bera test is a goodness-of-fit test of whether the sample data have skewness and kurtosis matching those of a normal distribution. The test becomes more powerful as the number of values increases.
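An illustrative sketch of the Jarque-Bera test with SciPy, using a deliberately large simulated sample since the test gains power with sample size:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
sample = rng.normal(size=2000)  # JB is more powerful on larger samples

# JB combines sample skewness and excess kurtosis into one statistic;
# large values indicate departure from normality.
jb_stat, p_value = stats.jarque_bera(sample)
print(f"JB = {jb_stat:.4f}, p = {p_value:.4f}")
```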

MaxStat uses all of the above normality tests to check whether your data follow a normal distribution. A sheet with the results is shown below. To test hypotheses between variables A and C we can use the more powerful parametric tests, but nonparametric tests have to be used to test hypotheses between the other variables.