
Measure Extraction

Simone Maurizio La Cava edited this page Feb 4, 2022 · 26 revisions

The first step in Athena's guided pipeline is Measure Extraction.

You just have to choose the measure and its parameters, and the toolbox will extract it for every epoch and every location of every subject.



The directory and the measure

First of all, in the following interface you have to select the directory which contains all your signals and the desired measure.

Here you can type the name of the directory (with its path) in the text box, or browse for it on your computer through the button next to it.

Note that if you have switched between study modalities, for example if you looked at how your signals change over time or at their power spectra, you will find the previously set data directory in the text box, so you can change it or use it again.

As in every interface, if at any moment you need to consult the guide, you can push the logo button, which opens the wiki of the toolbox in a web browser.

In order to extract a measure, click on the pop-up menu to open the list of measures and choose one of them, then push the RUN button to open the parameters interface.

At the end of this document, you can find a list of the measures and their explanations.

If you want instead to skip this step, for example because you did the extraction in a previous session, you can push the Next > button to go directly to the Temporal and Spatial average step and continue the study.





Parameters

In the following interface, you can set all the parameters that you want to use in order to extract the measure for all your signals.

The required parameters are:
  • Sampling frequency
  • Cut frequencies
  • Number of epochs
  • Time for epoch
  • Starting time
  • Total band

All the measures have the same parameters, except the PSDr which also needs the total band parameter.

The sampling frequency is the number of samples per second of a signal.

Athena is able to detect its value automatically if it is present inside the data file. If no sampling frequency value is found, a value of 0 is shown and you have to insert the correct one in the respective text box: in this case the toolbox assumes this value for every signal.

If the toolbox is able to read this value, you will find the value of the first signal in your data directory in the text box.

When you extract the measure, every signal with a different sampling frequency will be resampled to this value before proceeding, in order to normalize the time series.

Even if you find a value in the sampling frequency text box, you can change it: the toolbox will resample each signal to the new value.

The cut frequencies represent the limits of each frequency band to use in estimating the measure.

You can insert their values in the respective text box, separated by spaces.

The toolbox will estimate the measure values of each frequency band in order to study them.

For example, if you want to extract the Theta band (4-8 Hz), the Alpha band (8-13 Hz) and the Beta band (13-30 Hz), you can insert 4 8 13 30 (please note that the ranges for these bands may differ in some studies).
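
The mapping from the text box content to the bands can be sketched as follows. This is only an illustration in Python, not Athena's own code: the function name bands_from_cuts is hypothetical, and the check for increasing values is an assumption.

```python
def bands_from_cuts(text):
    """Turn a space-separated string of cut frequencies into (low, high) bands."""
    cuts = [float(value) for value in text.split()]
    if len(cuts) < 2:
        raise ValueError("at least two cut frequencies are needed")
    # Assumption: the cut frequencies are given in strictly increasing order.
    if any(high <= low for low, high in zip(cuts, cuts[1:])):
        raise ValueError("cut frequencies must be strictly increasing")
    # Each pair of consecutive cut frequencies delimits one band.
    return list(zip(cuts, cuts[1:]))


print(bands_from_cuts("4 8 13 30"))
# [(4.0, 8.0), (8.0, 13.0), (13.0, 30.0)]
```

So N cut frequencies always describe N-1 adjacent bands.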

The number of epochs identifies the number of different non-overlapping segments on which the chosen measure is evaluated, while the time for epoch identifies the duration in seconds of each one.

If the duration of the first signal is longer than 12 seconds, the interface will suggest values for the number of epochs and their time length, because the length of the time window has an effect on the extraction of the measures: it will show the maximum number of epochs of 12 seconds (up to 5) and the maximum time for each one.

However, you can always change these values, and the toolbox will help you to avoid exceeding the length of the signal.


The starting time is the initial time of the first epoch.

For example, it may be useful in cases such as time series of extracted sources, because these computations often introduce some effects in the first and the last samples.
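
To see how the sampling frequency, the number of epochs, the time for epoch and the starting time interact, here is a minimal Python sketch. The function name epoch_sample_ranges and the rounding to the nearest sample are assumptions made for illustration; the toolbox may handle these details differently.

```python
def epoch_sample_ranges(fs, n_epochs, epoch_time, start_time, n_samples):
    """Return (first, last_exclusive) sample indices of non-overlapping epochs."""
    epoch_len = int(round(epoch_time * fs))   # samples per epoch
    first = int(round(start_time * fs))       # first sample of the first epoch
    if first + n_epochs * epoch_len > n_samples:
        raise ValueError("the requested epochs exceed the signal length")
    return [(first + k * epoch_len, first + (k + 1) * epoch_len)
            for k in range(n_epochs)]


# A 70 s recording at 250 Hz, 5 epochs of 12 s, skipping the first 2 s.
ranges = epoch_sample_ranges(fs=250, n_epochs=5, epoch_time=12,
                             start_time=2, n_samples=70 * 250)
print(ranges[0], ranges[-1])
```

With these values the last epoch ends at 62 s, so a sixth epoch of 12 s would not fit in the 70 s recording.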





The total band is a parameter used in the PSDr extraction only, because it represents the relativization frequency band for this measure.

To learn how it is used, read the following list, which briefly explains all the extractable measures.

The measures

Currently, you can extract 6 types of measures:

  • Power measures
  • Connectivity measures
  • Aperiodic measures
  • Entropy measures
  • Autocorrelation measures
  • Statistical information



As a power measure, you can extract the relative PSD (PSDr, relative power spectral density).

This measure consists of an evaluation of the area under the power spectrum between two cut frequencies, and the relativization is a normalization obtained by dividing the power of a given band by the power of the total band (relativization band).

This measure has been chosen instead of a classical power spectral density because it reduces the influence of errors in the arrangement of the electrodes on the scalp and of small gain differences in the amplification chain, even if it leads to other disadvantages due to the consequent inter-correlation between the frequency bands (an increase of one contribution determines a false reduction of the others).

Furthermore, this measure is estimated through Welch's method.
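
The relativization can be sketched in Python as follows, starting from a spectrum that has already been estimated. The function names and the use of the trapezoidal rule for the area are assumptions for illustration only, not a description of Athena's internal code.

```python
def band_power(freqs, psd, low, high):
    """Trapezoidal area under the spectrum for low <= f <= high."""
    points = [(f, p) for f, p in zip(freqs, psd) if low <= f <= high]
    return sum((f2 - f1) * (p1 + p2) / 2.0
               for (f1, p1), (f2, p2) in zip(points, points[1:]))


def relative_power(freqs, psd, band, total_band):
    """Power in `band` divided by the power in `total_band`."""
    return band_power(freqs, psd, *band) / band_power(freqs, psd, *total_band)


freqs = [float(f) for f in range(1, 41)]   # 1..40 Hz with 1 Hz resolution
psd = [1.0] * len(freqs)                   # a flat (white) spectrum
alpha = relative_power(freqs, psd, band=(8, 13), total_band=(1, 40))
print(round(alpha, 3))
```

For a flat spectrum the result is simply the ratio of the bandwidths (5 Hz over 39 Hz), which also shows why the relative powers of the bands are inter-correlated: they share the same denominator.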



As connectivity measures, you can extract the correlation coefficient, the PLV (phase locking value), the PLI (phase lag index), the coherence (also known as magnitude squared coherence), the coherency, the mutual information, the AEC (amplitude envelope correlation) and the corrected AEC (AECc, or orthogonalized AEC, AECo).

Note that the signals are filtered before the extraction of the connectivity measures.



The correlation coefficient is a statistical measure which estimates the linear correlation between two variables (representing two time series, in this case).

The range of values for this measure is between 1 and -1, identifying a total positive correlation and a total negative correlation, respectively, while a value equal to 0 identifies the absence of any linear correlation between the considered variables.

Here, the correlation coefficient is computed as its absolute value, so only the magnitude of the correlation is evaluated, without considering its sign.
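
As a small illustration of the absolute correlation coefficient, the following Python sketch computes the Pearson coefficient of two short series and drops its sign. The function name abs_correlation is hypothetical.

```python
import math


def abs_correlation(x, y):
    """Absolute value of the Pearson correlation coefficient of x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return abs(cov / (norm_x * norm_y))


x = [1.0, 2.0, 3.0, 4.0]
y = [8.0, 6.0, 4.0, 2.0]      # perfectly anti-correlated with x
print(abs_correlation(x, y))  # close to 1, although the coefficient is -1
```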



The PLV (phase locking value) is a functional connectivity metric that depends on the instantaneous phase of the signals, and it is used to investigate long-range synchronization changes induced by neural activity.

This measure evaluates the difference between the instantaneous phases of the signals of two different locations, and assumes them to be functionally connected if this difference remains more or less constant.

In particular, the PLV is the absolute value of the average phase difference between two signals, with each phase difference expressed as a unit complex phasor, and the result of the related computation can be normalized in order to obtain a value in the range between 0 and 1, which represent respectively no interaction and maximum interaction.

Due to its nature, this measure is strongly affected by volume conduction.
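
The PLV computation can be sketched in Python as follows, assuming that the instantaneous phases (in radians) have already been obtained, for example through a Hilbert transform. The function name is hypothetical and this is not Athena's own implementation.

```python
import cmath
import math


def phase_locking_value(phase_x, phase_y):
    """Absolute value of the mean unit phasor of the phase differences."""
    phasors = [cmath.exp(1j * (a - b)) for a, b in zip(phase_x, phase_y)]
    return abs(sum(phasors) / len(phasors))


# A constant phase lag gives the maximum value ...
locked_x = [0.1 * n for n in range(100)]
locked_y = [0.1 * n - 0.7 for n in range(100)]
print(phase_locking_value(locked_x, locked_y))

# ... while phase differences spread evenly over the circle give about 0.
spread_x = [2.0 * math.pi * k / 8 for k in range(8)]
spread_y = [0.0] * 8
print(phase_locking_value(spread_x, spread_y))
```

Note that a constant lag of 0 would also give a PLV of 1, which is why zero-lag mixing of the sources inflates this measure.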



The PLI (phase lag index) is a measure of the asymmetry of the distribution of the phase differences between two signals, and it reflects the coupling between these time series by looking for a constant phase delay in the asymmetry of the instantaneous phase differences.

By disregarding phase locking that is centered around a phase difference of 0, the computation of this measure excludes volume conduction effects and is generally less correlated with the power measures than other measures such as the PLV.

This index can assume values between 0 and 1, indicating no interaction and maximum interaction, respectively.

A pitfall of this measure is that it may often miss true connections, for example due to small lags or frequency non-stationarities.
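
A minimal Python sketch of the PLI, again assuming that the instantaneous phases are already available, is the absolute value of the average sign of the sine of the phase differences. The function name is hypothetical.

```python
import math


def phase_lag_index(phase_x, phase_y):
    """Absolute mean sign of sin(phase difference)."""
    signs = []
    for a, b in zip(phase_x, phase_y):
        s = math.sin(a - b)
        signs.append((s > 0) - (s < 0))   # +1, 0 or -1
    return abs(sum(signs) / len(signs))


base = [0.1 * n for n in range(100)]
lagged = [p - 0.5 for p in base]                      # constant non-zero lag
jitter = [p + (0.3 if n % 2 else -0.3) for n, p in enumerate(base)]

print(phase_lag_index(base, lagged))   # consistent lag
print(phase_lag_index(base, base))     # zero lag is ignored
print(phase_lag_index(base, jitter))   # symmetric around zero
```

The second case shows the difference from the PLV: two identical phase series are perfectly locked, but with zero lag, so the PLI discards them.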



The coherence evaluates similarity in the frequency domain, verifying how similar the power spectra of two different signals are.

Its value is directly proportional to the similarity level, so that two identical signals will have a coherence value equal to 1, while two completely independent time series will have a coherence value of 0.

As for the PLV, volume conduction also affects the coherence, and in particular it results in high values of this measure, especially for closer electrodes.

Furthermore, coherence can be affected by the amplitude covariation of the signals: its value increases as the amplitude covariance increases.



The mutual information between two random variables measures the mutual dependence between them, quantifying the amount of information obtained by a variable observing the other and vice versa.

It is computed by evaluating the Discretized Entropy (you can find a brief explanation below, under the Entropy measures section) of each time series, here representing a random variable, using the average of the bins considered for computing the different entropy values, and then adding one entropy value to the other and subtracting the joint entropy (the entropy of the two signals considered together), that is I(X;Y) = H(X) + H(Y) - H(X,Y).
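
The sum of the two entropies minus the joint entropy can be sketched in Python as follows. This is a simplified illustration: the function names are hypothetical, a fixed number of equal-width bins is passed by hand instead of being derived as Athena does, and the logarithm in base 2 (bits) is an assumption.

```python
import math
from collections import Counter


def bin_index(value, low, high, n_bins):
    """Index of the equal-width bin that contains value."""
    if high == low:
        return 0
    return min(int((value - low) / (high - low) * n_bins), n_bins - 1)


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def mutual_information(x, y, n_bins):
    """H(X) + H(Y) - H(X, Y) on the binned signals."""
    bx = [bin_index(v, min(x), max(x), n_bins) for v in x]
    by = [bin_index(v, min(y), max(y), n_bins) for v in y]
    return entropy(bx) + entropy(by) - entropy(list(zip(bx, by)))


x = [0.0, 0.0, 1.0, 1.0]
print(mutual_information(x, x, n_bins=2))                      # identical signals
print(mutual_information(x, [0.0, 1.0, 0.0, 1.0], n_bins=2))   # independent signals
```

For identical signals the mutual information equals the entropy of the signal itself, while for independent signals the joint entropy equals the sum of the two entropies and the result is 0.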



The AEC (amplitude envelope correlation) estimates the degree to which two envelope fluctuations in cortical oscillations are temporally correlated.

Its corrected version, the AECc (corrected AEC), is obtained by performing a time-domain orthogonalization procedure on the standard AEC measure.



As aperiodic measures, this toolbox is able to extract the Exponent and the Offset parameters.

These measures are aperiodic background parameters of the signal spectrum, arrhythmic components which reflect its 1/f activity.

In particular, the Offset represents the y-intercept of the model fit, while the Exponent is the exponent χ in the 1/f^χ formulation (it is equivalent to the slope of a linear fit in the log-log space, with a sign flip).

These measures are extracted in a single frequency band (so, if you insert more than 2 cut frequencies, the toolbox will evaluate them in a single frequency band between the lower and the higher cut frequency).
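
The relationship between the Exponent, the Offset and a linear fit in log-log space can be illustrated with the following Python sketch. It uses a plain least-squares line on log10(power) versus log10(frequency); this is an assumption chosen for simplicity, and the fitting procedure actually used by the toolbox may differ. The function name fit_aperiodic is hypothetical.

```python
import math


def fit_aperiodic(freqs, power):
    """Least-squares line in log-log space; returns (offset, exponent)."""
    xs = [math.log10(f) for f in freqs]
    ys = [math.log10(p) for p in power]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    # The exponent is the slope with a sign flip, the offset is the y-intercept.
    return intercept, -slope


freqs = [float(f) for f in range(1, 41)]
power = [100.0 / f ** 1.5 for f in freqs]   # synthetic 1/f^1.5 spectrum
offset, exponent = fit_aperiodic(freqs, power)
print(round(offset, 6), round(exponent, 6))
```

On this synthetic spectrum the fit recovers an offset of 2 (that is, log10 of 100) and an exponent of 1.5.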



The entropy is a measure which quantifies the uncertainty (so, the complexity), and its concept was introduced by Shannon in a statistical sense into information theory (i.e. statistical entropy).

This concept can also be used in biomedical signals, in order to evaluate the amount of uncertainty, so in some sense the amount of information contained in the time series, both in the time-domain and in the frequency-domain.



Considering the entropy in the time-domain, Athena allows you to evaluate the Discretized Entropy, the Sample Entropy and the Approximate Entropy, which all aim at quantifying the predictability of the time series.

Essentially, they evaluate the similarity of segments of signals, giving as a result a value which is higher the less regular the signal is (for example, random signals tend to have higher entropy values).

The main difference between Sample Entropy and Approximate Entropy is that the latter includes self-matches when it compares the similarity of pairs of segments, while the former excludes them.

However, both depend on a tolerance value, r, which is essentially a threshold for similarity (0.2 times the standard deviation of the time series by default), and on an embedding dimension, m, which is the length of the compared segments (typically equal to 2 or 3, and the first value is here considered by default).
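
To make the role of m and r concrete, here is a minimal Python sketch of the Sample Entropy following its textbook definition (Chebyshev distance between segments, self-matches excluded). It is not Athena's code: the function name is hypothetical and the use of the population standard deviation for r is an assumption.

```python
import math
import random


def sample_entropy(x, m=2, r_factor=0.2):
    """-ln(A / B): B counts matching segment pairs of length m, A of length m + 1."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    r = r_factor * std   # tolerance as a fraction of the standard deviation

    def count_matches(length):
        # The same number of templates (n - m) is used for both lengths.
        templates = [x[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):   # j > i: no self-matches
                distance = max(abs(a - b)
                               for a, b in zip(templates[i], templates[j]))
                if distance <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")
    return -math.log(a / b)


regular = [1.0, 2.0] * 50
rng = random.Random(0)
noisy = [rng.random() for _ in range(200)]
print(sample_entropy(regular), sample_entropy(noisy))
```

The perfectly periodic series is fully predictable, so every match of length m is also a match of length m + 1 and the entropy is 0, while the random series gives a clearly higher value.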



The Discretized Entropy is computed by evaluating the probabilities of the values falling in a set of bins, computed through the Freedman-Diaconis rule, and then applying the Shannon entropy formula, H = -Σ p_i log(p_i), where p_i is the probability of the i-th bin.
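
A possible Python sketch of this procedure follows. The function names are hypothetical, and several details are assumptions rather than facts about Athena: the quartiles are taken with the default method of statistics.quantiles, the bins have equal width between the minimum and the maximum of the signal, and the logarithm is in base 2.

```python
import math
import statistics
from collections import Counter


def freedman_diaconis_bins(x):
    """Number of bins from the rule: width = 2 * IQR / n^(1/3)."""
    q1, _, q3 = statistics.quantiles(x, n=4)
    width = 2.0 * (q3 - q1) / len(x) ** (1.0 / 3.0)
    if width == 0:
        return 1
    return max(1, math.ceil((max(x) - min(x)) / width))


def discretized_entropy(x):
    """Shannon entropy (in bits) of the histogram of x."""
    n_bins = freedman_diaconis_bins(x)
    low, span = min(x), max(x) - min(x)

    def index(value):
        if span == 0:
            return 0
        return min(int((value - low) / span * n_bins), n_bins - 1)

    counts = Counter(index(v) for v in x)
    n = len(x)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


ramp = [float(v) for v in range(100)]
print(freedman_diaconis_bins(ramp), discretized_entropy(ramp))
print(discretized_entropy([3.0] * 50))   # a constant signal carries no uncertainty
```

The entropy can never exceed the logarithm of the number of bins, a bound that is reached when the values are spread evenly over them, as for the ramp above.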



As a frequency-domain entropy measure, the Spectral Entropy (SE) is currently available. It statistically quantifies the amount of uncertainty in the pattern of the normalized power distribution of the signal, which is treated as a probability distribution whose Shannon Entropy (or Information Entropy) is computed.

This measure is related to the information theory, and can be also seen as the amount of information contained in the signal spectrum itself.
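
The idea can be sketched in a few lines of Python, starting from an already estimated power spectrum. The function name is hypothetical and the base-2 logarithm is an assumption; any further normalization applied by the toolbox is not covered here.

```python
import math


def spectral_entropy(psd):
    """Shannon entropy (in bits) of the power spectrum normalized to sum to 1."""
    total = sum(psd)
    probabilities = [p / total for p in psd if p > 0]
    return -sum(p * math.log2(p) for p in probabilities)


flat = [1.0] * 8                  # power spread evenly over 8 frequency bins
peaked = [0.0, 0.0, 5.0, 0.0]     # all the power in a single bin
print(spectral_entropy(flat), spectral_entropy(peaked))
```

A flat spectrum, typical of white noise, gives the highest entropy for a given number of bins, while a pure oscillation concentrates the power in one bin and gives 0.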



The autocorrelation is the correlation of a time series with a delayed copy of itself, and represents the similarity between observations as a function of the delay (time lag) between them.

The Hurst exponent (or Hurst coefficient) is a measure of the long-term memory of signals, and it is related to their autocorrelations and to the rate at which these decrease as the lag between pairs of values increases.
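
The autocorrelation described above can be illustrated with this short Python sketch, which uses the common biased estimator normalized by the total variance. The function name and the choice of estimator are assumptions; the estimation of the Hurst exponent is not covered here.

```python
def autocorrelation(x, lag):
    """Correlation between x and a copy of itself delayed by `lag` samples."""
    n = len(x)
    mean = sum(x) / n
    variance_sum = sum((v - mean) ** 2 for v in x)
    covariance_sum = sum((x[i] - mean) * (x[i + lag] - mean)
                         for i in range(n - lag))
    return covariance_sum / variance_sum


alternating = [1.0, -1.0] * 10
for lag in (0, 1, 2):
    print(lag, autocorrelation(alternating, lag))
```

For this alternating series the autocorrelation is strongly negative at lag 1 and strongly positive at lag 2, as expected for a signal that repeats every two samples.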



The statistical information is related to descriptive statistical measures, such as mean, median, variance, standard deviation, kurtosis and skewness, also described in the related wiki page.





Power estimation

During the extraction of the aperiodic measures and of the relative PSD, the power spectra of the signals are computed through Welch's method because of its advantages with respect to the plain FFT (Fast Fourier Transform).

Essentially, this method slices the original signal in several windows, which can be overlapped or non-overlapped, and averages the spectra of these.

Welch's method is able to smooth over non-systematic noise, so it is more robust to noise than the FFT, and it is also less sensitive to some non-stationarities, even if it has a reduced spectral precision.
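
The slicing and averaging can be sketched in pure Python as follows. This is a didactic illustration and not the routine used by the toolbox: the function name is hypothetical, the Hann window and the 50% overlap are assumptions, and a direct (slow) discrete Fourier transform is used to keep the example self-contained.

```python
import cmath
import math


def welch_psd(x, fs, seg_len, overlap=0.5):
    """One-sided power spectral density averaged over windowed segments."""
    if seg_len < 2 or len(x) < seg_len:
        raise ValueError("the signal must contain at least one segment")
    step = max(1, int(seg_len * (1.0 - overlap)))
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * k / (seg_len - 1))
              for k in range(seg_len)]                  # Hann window
    scale = fs * sum(w * w for w in window)
    n_freqs = seg_len // 2 + 1
    starts = list(range(0, len(x) - seg_len + 1, step))
    psd = [0.0] * n_freqs
    for start in starts:
        segment = [x[start + k] * window[k] for k in range(seg_len)]
        for f in range(n_freqs):
            coeff = sum(segment[k] * cmath.exp(-2j * math.pi * f * k / seg_len)
                        for k in range(seg_len))
            psd[f] += abs(coeff) ** 2 / scale
    psd = [p / len(starts) for p in psd]                # average over segments
    for f in range(1, n_freqs):
        is_nyquist = seg_len % 2 == 0 and f == n_freqs - 1
        if not is_nyquist:
            psd[f] *= 2.0                               # fold negative frequencies
    freqs = [f * fs / seg_len for f in range(n_freqs)]
    return freqs, psd


fs = 100.0
signal = [math.sin(2.0 * math.pi * 10.0 * n / fs) for n in range(400)]
freqs, psd = welch_psd(signal, fs, seg_len=100)
peak_freq = freqs[psd.index(max(psd))]
print(peak_freq)
```

The frequency resolution is fs / seg_len (here 1 Hz), which shows the trade-off mentioned above: shorter segments give more averages and less noise, but a coarser spectrum.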





The next step

At the end of the extraction, a subfolder with the name of the measure will be created inside the data directory: it will contain a data file for every subject, with the selected measure extracted for each frequency band and for all the locations, and a text file which contains information about the used parameters.

Now you can extract another measure or go on with the temporal and spatial average step.
