Kurtosis
The coefficient of Kurtosis measures the degree of peakedness or flatness of the variable's distribution. If
the corresponding P value is low (P<0.05), the peakedness of the variable differs significantly from that
of a Normal distribution, which has a coefficient of Kurtosis equal to 0 (Sheskin, 2004).
Platykurtic distribution     Normal distribution       Leptokurtic distribution
Low degree of peakedness     Mesokurtic distribution   High degree of peakedness
Kurtosis <0                  Kurtosis = 0              Kurtosis >0
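As an illustration of the sign convention above, the following is a minimal sketch in Python. It assumes the simple moment-based excess kurtosis (fourth central moment divided by the squared variance, minus 3), which may differ from the sample-corrected formula the program itself uses. The function names are hypothetical, and the sketch does not compute the P value.

```python
def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (0 for a Normal distribution).

    Assumption: plain population moments, with no small-sample correction.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    if m2 == 0:
        raise ValueError("all observations are identical")
    return m4 / m2 ** 2 - 3.0


def describe_kurtosis(k):
    """Map the sign of the coefficient to the name used in the table."""
    if k < 0:
        return "platykurtic"
    if k > 0:
        return "leptokurtic"
    return "mesokurtic"


flat = [1, 2, 3, 4, 5]                      # evenly spread values
peaked = [0, 0, 0, 0, 0, 0, 0, 0, 10, -10]  # sharp peak with heavy tails
print(describe_kurtosis(excess_kurtosis(flat)))
print(describe_kurtosis(excess_kurtosis(peaked)))
```

The sign alone only names the shape; whether the departure from 0 is significant is decided by the P value reported with the coefficient.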
Test for Normal distribution:
The result of the test for Normal distribution is expressed as 'accept Normality' or 'reject Normality', together
with the P value. If P is higher than 0.05, it may be assumed that the data have a Normal distribution and the
conclusion 'accept Normality' is displayed.
If the P value is less than 0.05, then the hypothesis that the distribution of the observations in the sample is
Normal should be rejected, and the conclusion 'reject Normality' is displayed. In the latter case, the
sample cannot accurately be described by arithmetic mean and standard deviation, and such samples
should not be submitted to any parametric statistical test or procedure, such as a t test. To test for a
possible difference between samples that are not Normally distributed, the Wilcoxon test can be used, and
correlation can be estimated by means of rank correlation.
When the sample size is small, it may not be possible to perform the selected test and an appropriate
message will appear.
Percentiles (or centiles): when you have n observations, and these are sorted from smaller to larger,
then the p-th percentile is equal to the observation with rank number:
$$R(p) = 0.5 + \frac{p \times n}{100}$$ (Lentner, 1982)
When the rank number R(p) is a whole number, then the percentile coincides with the sample value; if R(p)
is a fraction, then the percentile lies between the values with ranks adjacent to R(p) and in this case
MedCalc uses interpolation to calculate the percentile.
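The rank-and-interpolate procedure can be sketched in Python as follows. This is an illustration only: it reads the rank formula as R(p) = 0.5 + p × n / 100, assumes the interpolation is linear between the two neighbouring ranked values (the text does not state which interpolation is used), and guards the input with the validity condition 1/n ≤ p/100 ≤ (n-1)/n that the manual gives for R(p). The function names are hypothetical.

```python
import math


def percentile_rank(p, n):
    """Rank number R(p) = 0.5 + p * n / 100 for the p-th percentile of n values."""
    return 0.5 + p * n / 100.0


def percentile(values, p):
    """p-th percentile by rank, with linear interpolation (an assumption)."""
    data = sorted(values)
    n = len(data)
    # Validity condition 1/n <= p/100 <= (n-1)/n, written without division.
    if n == 0 or not (100 <= p * n <= 100 * (n - 1)):
        raise ValueError("percentile cannot be estimated for this sample size")
    r = percentile_rank(p, n)
    lower = math.floor(r)  # 1-based rank just below (or equal to) R(p)
    frac = r - lower
    if frac == 0:
        return data[lower - 1]  # whole rank: coincides with a sample value
    # Fractional rank: interpolate between ranks `lower` and `lower + 1`.
    return data[lower - 1] + frac * (data[lower] - data[lower - 1])


sample = list(range(1, 21))  # 20 observations: 1, 2, ..., 20
print(percentile(sample, 50))
print(percentile(sample, 5))
```

With 20 observations the 5th percentile has rank 1.5, so it falls halfway between the two smallest values.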
The formula for R(p) is only valid when
$$\frac{1}{n} \le \frac{p}{100} \le \frac{n-1}{n}$$
E.g. the 5th and 95th percentiles can only be estimated when n ≥ 20, since
$$\frac{1}{20} = \frac{5}{100} \quad\text{and}\quad \frac{95}{100} = \frac{20-1}{20}$$
Therefore it makes no sense to quote the 5th and 95th percentiles when the sample size is less than 20. In
this case it is advised to quote the 10th and 90th percentiles, provided the sample size is not less than 10.
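The sample-size rule can be checked with a short Python sketch. It assumes the validity condition reads 1/n ≤ p/100 ≤ (n-1)/n, from which the smallest usable n follows as the larger of 100/p and 100/(100-p), rounded up. Both function names are hypothetical.

```python
import math


def can_estimate(p, n):
    """True when 1/n <= p/100 <= (n-1)/n, written without division."""
    return n > 0 and 100 <= p * n <= 100 * (n - 1)


def min_sample_size(p):
    """Smallest n for which the p-th percentile can be estimated (0 < p < 100)."""
    if not 0 < p < 100:
        raise ValueError("p must lie strictly between 0 and 100")
    # n >= 100/p from the left inequality, n >= 100/(100-p) from the right one.
    return math.ceil(max(100 / p, 100 / (100 - p)))


for p in (5, 95, 10, 90):
    print(p, min_sample_size(p))
```

For a sample of 19 observations, `can_estimate(5, 19)` is false while `can_estimate(10, 19)` is true, which matches the advice to fall back to the 10th and 90th percentiles.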
The percentiles can be interpreted as follows: p% of the observations lie below the p-th percentile, e.g.
10% of the observations lie below the 10th percentile.
The 25th percentile is called the 1st quartile, the 50th percentile is the 2nd quartile (and equals the
Median), and the 75th percentile is the 3rd quartile.
The numerical difference between the 25th and 75th percentiles is the interquartile range. Within the 2.5th
and 97.5th percentiles lie 95% of the values and this range is called the 95% central range. The 90%
central range is defined by the 5th and 95th percentiles, and the 10th and 90th percentiles define the 80%
central range.
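The relation between a central range and its two bounding percentiles is simple arithmetic: the excluded share is split equally between the two tails. A minimal Python sketch, with a hypothetical function name:

```python
def central_range_percentiles(coverage):
    """Return the (lower, upper) percentiles that bound a central range.

    `coverage` is the percentage of values inside the range, e.g. 95.
    """
    if not 0 < coverage < 100:
        raise ValueError("coverage must lie strictly between 0 and 100")
    tail = (100 - coverage) / 2  # share of values left out on each side
    return tail, 100 - tail


for coverage in (95, 90, 80, 50):
    print(coverage, central_range_percentiles(coverage))
```

A coverage of 50 gives the 25th and 75th percentiles, i.e. the bounds of the interquartile range.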