
Data normalization

Simone Maurizio La Cava edited this page Sep 27, 2021 · 4 revisions

Given a set of data, the goal of normalization is to bring all values to a common scale.

In particular, normalized values can be compared directly across different datasets and different subjects.

Before going on, it may be useful to read a brief introduction on descriptive statistics if the concepts of the mean and of the standard deviation are not clear.

For example, data normalization is used here in extracting the relative Power Spectral Density, and it may also be used in extracting the network measures.



Maximum normalization

The maximum normalization method scales data by dividing each value by the maximum value of the dataset, so that the maximum of the normalized dataset equals 1:

x' = x / max(x)

The main drawback of this normalization method is that, since it depends entirely on the maximum value, it does not handle outliers well.
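As a minimal sketch, the scaling above can be written as follows (the helper name is illustrative, not part of this wiki's code):

```python
def max_normalize(data):
    # Divide every value by the dataset maximum,
    # so the largest value maps to exactly 1.
    maximum = max(data)
    return [x / maximum for x in data]

print(max_normalize([2.0, 5.0, 10.0]))  # [0.2, 0.5, 1.0]
```

Note that a single extreme value (an outlier) becomes the divisor for the whole dataset, which is exactly the drawback mentioned above.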



Min-max normalization

Another common normalization method is the min-max normalization (or unity-based normalization): the minimum value of the dataset is transformed into a 0, the maximum value is transformed into a 1, and every other value becomes a decimal between them (i.e. it brings all values into the range [0,1]):

x' = (x - min(x)) / (max(x) - min(x))

As with the previous one, the main drawback of this normalization method is that it does not handle outliers well.
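A sketch of min-max normalization, again with an illustrative helper name:

```python
def min_max_normalize(data):
    # Map the minimum to 0, the maximum to 1,
    # and every other value linearly in between.
    lo, hi = min(data), max(data)
    return [(x - lo) / (hi - lo) for x in data]

print(min_max_normalize([2.0, 6.0, 10.0]))  # [0.0, 0.5, 1.0]
```

Because both the minimum and the maximum define the range, a single outlier at either end compresses all the remaining values into a narrow band.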



Z-score normalization

The z-score normalization (or standard score normalization) helps avoid the outlier effects by considering instead the mean μ and the standard deviation (std, or sd) σ:

z = (x - μ) / σ

In particular, if the original data have a very high standard deviation, the normalized values will lie closer to zero.

Its main use is normalizing errors when the distribution parameters are known, and it works best on normally distributed data.

The main drawback of this method is that the normalized data are not bounded to a fixed range, so different datasets will not have exactly the same scale.
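A sketch of z-score normalization using only the standard library (the function name is illustrative; `pstdev` computes the population standard deviation):

```python
from statistics import mean, pstdev

def z_score_normalize(data):
    # Subtract the mean and divide by the standard deviation:
    # the result has mean 0 and standard deviation 1.
    mu = mean(data)
    sigma = pstdev(data)
    return [(x - mu) / sigma for x in data]

scores = z_score_normalize([2.0, 4.0, 6.0])
print(scores)  # symmetric around 0, e.g. the middle value maps to 0.0
```

Unlike the two previous methods, the output is not confined to [0, 1]: an extreme outlier simply gets a large z-score rather than distorting the scale of every other value.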
