
Descriptive statistical analysis

Simone Maurizio La Cava edited this page Aug 14, 2020 · 5 revisions

The descriptive statistical analysis allows you to compute some descriptive statistical values and to compare the two analyzed groups.

Currently, you can compute:

  • The mean

  • The median

  • The variance

  • The maximum value

  • The minimum value

  • The kurtosis

  • The skewness

These values are computed on the distribution of a chosen measure, for the frequency band and the location you select for the analysis.

At any time, you can click on the logo button to open this wiki page in a web browser.

The following paragraphs give some explanation of the computed values, but if you are already comfortable with them, you can RUN your analysis or return to the previous interface and choose another analysis to execute.





The mean, the maximum and the minimum

The mean, also known as the expected value, is a measure of the central tendency of a distribution. Here, it is computed as the arithmetic average of the values in the data vector, which is obtained from the data matrix related to a measure by selecting the frequency band and the location.

The maximum value is the largest element of this vector, while the minimum value is the smallest one.

Taking as example a vector of five elements:

[1, 4, 7, 8, 9]

The sum of its 5 elements is 29, so its mean is 29 / 5 = 5.8, while the maximum value is 9 and the minimum value is 1.
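As a quick check of the arithmetic above, the following minimal Python sketch recomputes the three values for the example vector. The variable names are illustrative and are not taken from the toolbox.

```python
# Example vector from the text above
values = [1, 4, 7, 8, 9]

# Mean: sum of the elements divided by their number
mean_value = sum(values) / len(values)

# Maximum and minimum: largest and smallest element
max_value = max(values)
min_value = min(values)

print(mean_value, max_value, min_value)  # 5.8 9 1
```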





The median

The median is the value that separates the lower half of the data vector from the upper half.

Taking as example a vector with an odd number of ordered elements:

[1, 4, 7, 8, 9]

The median value is the "middle" value, so 7 in this case.

If the vector instead has an even number of elements, the median is computed as the mean of the two "middle" elements (for example, the median value of the vector [1, 4, 7, 8, 9, 9] is 7.5).

The basic advantage of the median over the mean is that it is much less affected by outliers ("atypical" values, that is, data points that differ significantly from the other observations), so it may give a better idea of a "typical" value.
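The rule for odd and even lengths, and the robustness to outliers, can be sketched in a few lines of Python. This is only an illustration: the function name `median` and the vector containing the outlier 900 are made up for this example and do not come from the toolbox.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)      # the rule assumes ordered elements
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd length: the single "middle" element
        return ordered[mid]
    # Even length: mean of the two "middle" elements
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_median = median([1, 4, 7, 8, 9])       # 7
even_median = median([1, 4, 7, 8, 9, 9])   # 7.5

# Hypothetical vector with one outlier: the mean moves a lot, the median does not
with_outlier = [1, 4, 7, 8, 900]
outlier_mean = sum(with_outlier) / len(with_outlier)   # 184.0
outlier_median = median(with_outlier)                  # 7
```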





The variance

The variance is the expectation of the squared deviation of a random variable from its mean value, so it gives an idea of how much the elements are spread out from their mean value.

It can be obtained by summing the square of each difference between the elements of the vector and the mean of the whole vector, then dividing the resulting value by the number of elements of the vector itself.

Taking as example the vector [1, 4, 7, 8, 9], whose mean was previously computed as 5.8, the variance can be computed by subtracting the mean from each element ([-4.8, -1.8, 1.2, 2.2, 3.2]), squaring the results ([23.04, 3.24, 1.44, 4.84, 10.24]), summing them (42.8) and finally dividing this value by the number of elements (42.8 / 5), obtaining 8.56.
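The same steps can be reproduced with a short Python sketch. It assumes the division by the number of elements described above (the population variance); `statistics.pvariance` from the Python standard library follows the same convention and is used here only as a cross-check.

```python
import statistics

values = [1, 4, 7, 8, 9]
n = len(values)
mean_value = sum(values) / n

# Step 1: subtract the mean from each element
deviations = [v - mean_value for v in values]

# Step 2: square each difference
squared = [d ** 2 for d in deviations]

# Step 3: sum the squares and divide by the number of elements
variance = sum(squared) / n

print(round(variance, 2))                      # 8.56
print(round(statistics.pvariance(values), 2))  # 8.56
```

Note that `statistics.variance` divides by n - 1 instead, so it returns a different value for the same vector.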





The kurtosis

The kurtosis is a measure which describes the shape of a probability distribution, and it is related to the tails of the distribution.

The kurtosis of any univariate normal distribution is 3. Distributions with kurtosis less than 3 produce fewer and less extreme outliers than the normal distribution (an example is the uniform distribution, which does not produce outliers). Distributions with kurtosis greater than 3 produce more outliers than the normal distribution (an example is the Laplace distribution, whose tails asymptotically approach zero more slowly than those of a Gaussian).

There are different measures of kurtosis, and they may have different interpretations.

Here, the kurtosis is computed as the fourth central moment of the data vector, divided by the fourth power of its standard deviation.
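A minimal Python sketch of this definition follows. It assumes that both the central moment and the standard deviation are normalised by the number of elements, which may differ from the convention used inside the toolbox; the function name `kurtosis` is illustrative.

```python
def kurtosis(values):
    """Fourth central moment divided by the fourth power of the
    standard deviation (both normalised by the number of elements)."""
    n = len(values)
    mean_value = sum(values) / n
    # Second and fourth central moments
    m2 = sum((v - mean_value) ** 2 for v in values) / n
    m4 = sum((v - mean_value) ** 4 for v in values) / n
    # std ** 4 equals the squared variance, so no square root is needed
    return m4 / m2 ** 2

# Example vector used in the previous sections
k = kurtosis([1, 4, 7, 8, 9])
print(round(k, 2))  # 1.83, below the value of 3 of a normal distribution
```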





The skewness

The skewness is a measure of the asymmetry of the distribution about its mean, and it can be positive, zero, negative, or undefined.

For a unimodal distribution, such as the ones evaluated here, a positive value commonly indicates that the tail is on the right side of the distribution, and a negative one indicates that the tail is on the left, while a zero value means that the tails of both sides of the mean balance out overall.

However, a zero value does not imply symmetry: it occurs both for symmetric distributions (in which case the mean is equal to the median) and for asymmetric distributions where one tail is "fat and short" while the other is "thin and long". The same caution applies when interpreting values different from zero.

Here, the skewness is computed as the third central moment of the data vector, divided by the cube of its standard deviation.
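This definition can be sketched in Python as follows. As an assumption, the central moment and the standard deviation are both normalised by the number of elements, which may not match the toolbox exactly; the function name `skewness` and the symmetric test vector are illustrative.

```python
import math

def skewness(values):
    """Third central moment divided by the cube of the standard
    deviation (both normalised by the number of elements)."""
    n = len(values)
    mean_value = sum(values) / n
    # Second and third central moments
    m2 = sum((v - mean_value) ** 2 for v in values) / n
    m3 = sum((v - mean_value) ** 3 for v in values) / n
    std = math.sqrt(m2)
    return m3 / std ** 3

# Example vector used in the previous sections: longer tail on the left
s = skewness([1, 4, 7, 8, 9])
print(round(s, 2))  # -0.57

# A symmetric vector gives zero skewness
s_sym = skewness([1, 2, 3, 4, 5])
print(s_sym)  # 0.0
```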
