
Statistical analysis

Simone Maurizio La Cava edited this page Aug 14, 2020 · 11 revisions

The statistical analysis allows you to explore your data in order to discover underlying patterns and differences between the analyzed groups.

Currently, you can execute:

  • Distributions analysis, where you can observe and compare the distributions of a measure of the two groups of subjects

  • U Test, where you can execute the Wilcoxon-Mann-Whitney test in order to check if there are significant differences in your data

  • Measures correlation, where you can verify if there is any relationship between two different measures, or between different parameters of the same measure

  • Index correlation, where you can verify if there is any relationship between a measure and an external index

  • Histogram analysis, where you can observe and compare the histogram of the distributions of a measure of the two groups of subjects

  • Descriptive Statistics, where you can evaluate statistical values such as the mean, the variance and the kurtosis, and compare these values between the studied groups of subjects

If you do not remember what a statistical analysis does, you can hover the mouse cursor over the button bearing its name: a brief tooltip about it will be shown.

Whenever you need to, you can click on the logo button to open this page of the wiki in a web browser.

Before you start computing your statistical analyses, it may be useful to read the brief introduction to some of them below. Otherwise, you can move on to one of the available statistical analyses, or return to the previous interface in order to execute other analyses.





The p-value

Before discussing the analyses, it may be useful to introduce the concept of p-value.

The p-value, or probability value, is the probability of obtaining test results at least as extreme as the results actually observed during the test, assuming that the null hypothesis is correct.

In practice, the smaller the p-value, the higher the significance, because a small p-value tells you that the null hypothesis may not adequately explain the observation. The null hypothesis H is rejected if the p-value is less than or equal to a value called the α-value.

This value is a threshold, chosen arbitrarily in advance, that identifies the level of significance.

The α-value is commonly set to 0.05, but can also be reduced in order to increase the conservativeness of the analysis.
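
As a toy illustration of the definition above, the following Python sketch computes an exact p-value for a coin-flip experiment and compares it with the α-value. The coin example, the function name `two_sided_p_value` and the constant `ALPHA` are assumptions made for illustration only; they are not part of Athena.

```python
from math import comb

ALPHA = 0.05  # commonly used significance threshold (assumed here)


def two_sided_p_value(heads, flips):
    """Exact p-value for observing `heads` in `flips` tosses of a fair coin."""
    observed_ways = comb(flips, heads)
    # Under the null hypothesis (fair coin) every sequence is equally likely,
    # so an outcome counts as "at least as extreme" when it can occur in no
    # more ways than the observed one.
    extreme_ways = sum(
        comb(flips, k) for k in range(flips + 1) if comb(flips, k) <= observed_ways
    )
    return extreme_ways / 2 ** flips


p_value = two_sided_p_value(heads=9, flips=10)
reject_null = p_value <= ALPHA
print(f"p = {p_value:.4f}, reject null hypothesis: {reject_null}")
```

With 9 heads out of 10 tosses, only 22 of the 1024 equally likely sequences are at least as extreme, so the p-value is about 0.0215 and the null hypothesis is rejected at α = 0.05.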



The correlation

Correlation is a statistical relationship between two variables that measures the dependence between them.

This form of dependence can be causal or not, and represents a statistical association that identifies the degree to which the considered pair of variables is related.


Rank correlation measures an ordinal association, that is, the relationship between the rankings of different ordinal variables, and evaluates the degree of similarity between two rankings in order to assess the significance of the relation between them.

The ranking is the assignment of an order to different observations of a particular variable, so that it is possible to say that an element has ranking higher than (or lower than, or equal to) another element.
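
A minimal sketch of how a ranking can be assigned is shown here. It assumes that equal observations share the average of the positions they occupy, which is one common convention for ties; the function name `average_ranks` is hypothetical and this is not taken from Athena's code.

```python
def average_ranks(values):
    """Return the 1-based rank of each value; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend `end` over the run of observations equal to the one at `start`.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Mean of the 1-based positions start+1 .. end+1.
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


print(average_ranks([10, 30, 20, 30]))  # [1.0, 3.5, 2.0, 3.5]
```

The two observations equal to 30 occupy positions 3 and 4, so both receive the rank 3.5.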


The correlation analysis allows you to explore your data in order to discover underlying patterns and differences between the analyzed sets of data.

There are many correlation analyses, which evaluate different correlation coefficients (measures of the degree of correlation).


Athena uses Spearman's rank correlation coefficient ρ (rho), a nonparametric measure of the rank correlation between two variables, which assesses how well the relationship between them can be described by a monotonic function.

This relationship can be linear or not and, if there are no repeated data values, a perfect Spearman correlation of +1 or −1 occurs when each variable is a perfect monotone function of the other: +1 when the observations have identical ranks in both variables, and −1 when the ranks are exactly reversed.

More generally, the Spearman correlation is high when observations have a similar (or identical, for a correlation of +1) rank in the two variables, and low when observations have a dissimilar (or fully opposed, for a correlation of −1) rank.
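
The behaviour described above can be checked with a short Python sketch. It assumes there are no repeated data values, in which case ρ can be computed from the squared rank differences as 1 − 6·Σd² / (n·(n² − 1)). The helper names `simple_ranks` and `spearman_rho` are hypothetical, and this is an illustration rather than the routine Athena uses.

```python
from math import exp


def simple_ranks(values):
    """1-based ranks for a sequence without repeated values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def spearman_rho(x, y):
    """Spearman's rho from squared rank differences (valid only without ties)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    if len(set(x)) != len(x) or len(set(y)) != len(y):
        raise ValueError("this sketch assumes no repeated data values")
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(simple_ranks(x), simple_ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


x = [1, 2, 3, 4, 5]
print(spearman_rho(x, [exp(v) for v in x]))  # 1.0: increasing but non-linear
print(spearman_rho(x, [-v ** 3 for v in x]))  # -1.0: ranks exactly reversed
print(spearman_rho(x, [2, 1, 4, 3, 5]))  # 0.8: similar but not identical ranks
```

The first two cases show that a perfect monotone relationship gives ±1 even when it is far from linear, because only the ranks enter the computation.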





Statistical hypothesis testing

A statistical hypothesis is a hypothesis that is testable on the basis of observing a process represented through some variables.

A statistical hypothesis test is a method used to verify the statistical relationship between two datasets. It assumes a null hypothesis that proposes no relationship between them: the result is considered statistically significant if the p-value, that is, the probability of observing data at least this extreme if the null hypothesis were true, is less than or equal to a threshold value (α-value), in which case the null hypothesis can be rejected.
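
To make the procedure concrete, here is a minimal Python sketch of an exact two-sided Wilcoxon-Mann-Whitney (U) test for two small groups. It enumerates every possible relabelling of the pooled observations, which is only feasible for small samples. The function names and the example data are invented for illustration, and nothing here is meant to describe how Athena computes the test internally.

```python
from itertools import combinations


def u_statistic(group_a, group_b):
    """Count pairs where a value of group_a exceeds one of group_b (ties count 0.5)."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def exact_u_test(group_a, group_b):
    """Return (U, two-sided exact p-value) under the null of no group difference."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    centre = n_a * n_b / 2  # expected U when the null hypothesis holds
    observed_u = u_statistic(group_a, group_b)
    observed_distance = abs(observed_u - centre)
    indices = range(len(pooled))
    extreme = total = 0
    # Under the null hypothesis, every split of the pooled data is equally likely.
    for chosen in combinations(indices, n_a):
        chosen_set = set(chosen)
        relabelled_a = [pooled[i] for i in chosen]
        relabelled_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        if abs(u_statistic(relabelled_a, relabelled_b) - centre) >= observed_distance:
            extreme += 1
    return observed_u, extreme / total


# Made-up measurements for two groups of four subjects each.
u, p = exact_u_test([1.1, 2.3, 2.9, 3.4], [4.0, 5.2, 6.1, 7.3])
print(f"U = {u}, p = {p:.4f}")  # U = 0.0, p = 0.0286
```

Only 2 of the 70 possible splits separate the groups as completely as the observed one, so p = 2/70 ≈ 0.0286, and with α = 0.05 the null hypothesis of no difference would be rejected.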


These tests can be parametric or non-parametric.

The difference between them is that parametric tests assume that the analyzed data follow a known distribution, while non-parametric tests are distribution-free or are based on distributions whose parameters are not specified.
