What is the central limit theorem?
The average of large number of non-normal identical independent random variables usually distributes similar to the normal distribution.
As a result the sum or the proportion of large enough sample may distributes similar to the normal distribution.
Why is this important?
Many tests like t-test, ANOVA, F test for variances assumes a normal population.
It is easy to do calculations using the normal distribution, like confidence interval calculation, or any other probability calculation.
Example
You want to know the true mean of unknown distribution, what is the average of the entire population, not only the sample?.
Following the limitations for the classical CLT, there are different versions for the CLT that deal with part of the limitations.
What sample size is large enough ?
In some cases a sample size of 30 observations is sufficient to assume the average distribute normally, for example for the t-test with reasonably symmetric data it is usually enough.
But as you may see in the below F(6,6) example, for highly skewed data, with undefined skewness level, only with 5000 observations the average distribute similar to normally. And But as you may see in the below F(3,3) example, for highly skewed data, with undefined skewness level, even the average of 100,000 kas skewed distribution.
We used R statistical programming language to create a simulation of several distributions.
n - sample size
SAMPLE[n] - vector, with n values.
AVERAGES[repeats] - vector with repeats = 100,000 values. We chose 100,000 for a smooth histogram, 10,000 would also do a similar job.
Following process, for each sample size n:
In the following examples you may see that when the sample size is large enough the distribution is similar to the normal distribution.
The F distribution is similar to the bell shape with different levels of skewness and kurtosis. $$F\ Skewness=\frac{(2df_1+df_2-2)\sqrt{8(df_2-4)}}{(df_2-6)\sqrt{df_1(df_1+df_2-2)}},\ df_2>6$$
When DF2≤6 the skewness of the original distribution is not defined.
Following a simulation results of F distribution with df1=6 and df2=6.
The calculated skewness( F(6,6) ) is undefined, and the simulation skewness is xxx.
The calculated excess kurtosis( F(6,6) ) is undefined, and the simulation kurtosis is xxx..
Only with extra large sample size: n=5,000, the data is similar to the normal distribution.
Following a simulation results of F distribution with df1=7 and df2=7.
The calculated skewness( F(7,7) ) is 10.1559 .
The calculated excess kurtosis( F(7,7) ) is -166.714.
Only with large sample size: n=500, the data is similar to the normal distribution.
Following a simulation results of F distribution with df1=6 and df2=12.
The calculated skewness( F(6,12) ) is 2.9938 .
The calculated excess kurtosis( F(6,12) ) is 23.1667.
With small sample size: n=50, the data is similar to the normal distribution.
Following a simulation results of F distribution with df1=30 and df2=47.
The calculated skewness( F(30,47) ) is 1.0444 .
The calculated excess kurtosis( F(30,47) ) is 1.8889.
With small sample size: n=10, the data is similar to the normal distribution.
When DF2≤4 the variance of the original distribution is not defined and when DF2≤6 the skewness of the original distribution is not defined.
Following a simulation results of F distribution with df1=3 and df2=3.
The calculated skewness( F(3,3) ) is undefined .
The calculated excess kurtosis( F(3,3) ) is undefined.
You can see that even with a large sample size, n=100,000, the data is still extremely skewed.