Data
The minimal data required to use the fitting routine is a vector of frequencies and a vector of powers for the power spectral density. This is shown in fit_single.m.
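As a rough illustration of that minimal input, the sketch below builds a synthetic frequency vector and a matching power vector. The spectral shape and the variable names f and P are assumptions made for this example only; how the vectors are passed to the fitting routine is shown in fit_single.m and is not reproduced here.

```matlab
% Synthetic stand-in for a measured spectrum (shape and names are illustrative)
f = (0.25:0.25:45).';                    % frequencies in Hz, column vector
P = 1 ./ (1 + (f / 5).^2);               % smooth background that falls off with frequency
P = P + 0.5 * exp(-(f - 10).^2 / 2);     % add a peak near 10 Hz

% The two vectors must pair up element by element
assert(isvector(f) && isvector(P) && numel(f) == numel(P));
assert(all(P > 0) && all(diff(f) > 0));
```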
For state tracking, the .mat data files are initially created using bt.data.import_raw_eeg. The variables generated by import_raw_eeg are:
- t, a vector corresponding to seconds elapsed. There are as many t values as there are spectra
- f, a vector of frequencies
- colheaders, a cell array containing the electrode names, as saved in the raw data (these may therefore be different for different recordings. It is up to bt.data.electrode_positions to recognize them)
- s, a matrix of power spectral density. Its size is f x t x colheaders, that is, it stores the power spectrum for all electrodes at all points in time. Slices can then be taken across whichever dimensions are desired
- state_score, a vector the same size as t, which records which stage of sleep was predominant. If an odd number of spectra are used (e.g. 4s windows averaged to give 30s blocks means 27 spectra are used) then there is never any ambiguity about which state is dominant. The state_score is a number corresponding to a row in bt_utils.state_cdata
- state_str, a cell array the same size as t, which stores titles for the plots. By convention, this string also records the breakdown of which states were averaged in the 30s block. For example, 'W-1 (14/16)' means that there were 14 seconds of wake and 16 seconds of S1 sleep in the block.
- nspec, a vector the same size as t, which stores how many of the 4s spectra were averaged to yield the 30s block spectrum.
- n_reject, a vector the same size as colheaders, which stores the number of 4s spectra rejected for that channel.
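The layout of s is easiest to see with a small example. The sketch below uses random numbers rather than real recordings, and the electrode names, array sizes and 30s spacing of t are assumptions chosen for illustration.

```matlab
n_f = 5; n_t = 4; n_ch = 2;               % tiny sizes so the layout is easy to inspect
f = linspace(1, 45, n_f).';               % frequencies
t = (0:n_t - 1).' * 30;                   % one entry per spectrum (30s spacing assumed)
colheaders = {'Cz', 'Pz'};                % electrode names (illustrative)
s = rand(n_f, n_t, n_ch);                 % f x t x colheaders

% Spectrum of the second electrode at the third time point
P = s(:, 3, 2);

% Time course of the first frequency bin for the electrode named 'Cz'
ch = find(strcmp(colheaders, 'Cz'));
bin_power = squeeze(s(1, :, ch));

% Spectra averaged over all electrodes, one column per time point
s_mean = mean(s, 3);
```

Each slice keeps the frequency axis in the first dimension, so P and every column of s_mean can be paired directly with f.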
The power spectrum calculation and artifact rejection are implemented in mcmc.get_tfs. The code in mcmc.get_tfs operates on single electrodes, and this function is called by bt.data.import_raw_eeg, which iterates over each electrode. Note that this means that artifact detection in the current framework only uses information from one electrode at a time.
The power spectrum is computed in the following sequence:
- First, utils.rfft (from the corticothalamic-model repository) is used to compute a sequence of spectra in short windows, by default 4s. For each window, the standard deviation of the voltage is computed.
- Next, for each spectrum, the delta power is computed. If the delta power lies outside a range of values determined by the distribution of delta powers across all 4s windows, then the spectrum is rejected. If the standard deviation of the voltage exceeds a threshold determined by the distribution of voltage standard deviations across all 4s windows, then the spectrum is rejected. If the voltage does not change for a time period greater than run_threshold (this can happen due to problems with the recording device), then the corresponding spectrum is rejected.
- Sets of the 4s spectra are averaged together to give 30s spectra. For a 30s window, all of the 4s spectra contained within it are selected. Those that are not rejected are averaged together to give the 30s spectrum. The number of spectra that were averaged is recorded in nspec. Again, it is up to the fit wrapper to decide what to do with nspec.
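The sequence above can be sketched on a synthetic signal as follows. This is not the get_tfs implementation: plain fft stands in for utils.rfft, and the sampling rate, 1s window step, delta band limits, threshold rules (multiples of the median) and run_threshold value are all assumptions chosen to make the example self-contained.

```matlab
fs = 100;                                   % sampling rate in Hz (assumed)
tt = (0:60*fs - 1).' / fs;                  % 60s of sample times
v = sin(2*pi*10*tt) + 0.5*sin(2*pi*2*tt);   % clean synthetic signal
bad = tt >= 10 & tt < 11;
v(bad) = 20 * v(bad);                       % 1s large-amplitude artifact
v(tt >= 40 & tt < 43) = 0;                  % 3s of flat-lined recording

win = 4 * fs;                               % samples per 4s window
starts = 0:56;                              % window start times in s (1s step, assumed)
run_threshold = 0.5 * fs;                   % longest tolerated flat run in samples (assumed)
f = (0:win/2).' * fs / win;                 % one-sided frequency axis in Hz
n_win = numel(starts);
spec = zeros(numel(f), n_win);
sd = zeros(1, n_win);
flat_run = zeros(1, n_win);
for k = 1:n_win
    seg = v(starts(k)*fs + (1:win).');
    X = fft(seg);
    spec(:, k) = abs(X(1:win/2 + 1)).^2 / (fs * win);   % power, up to normalisation
    sd(k) = std(seg);
    edges = diff([0; diff(seg) == 0; 0]);   % +1 where a flat run starts, -1 where it ends
    flat_run(k) = max([0; find(edges == -1) - find(edges == 1)]);
end

% Rejection rules (illustrative thresholds derived from the distributions)
delta = sum(spec(f >= 0.5 & f <= 4.5, :), 1);
reject = sd > 3 * median(sd) ...
    | delta < median(delta) / 3 | delta > 3 * median(delta) ...
    | flat_run > run_threshold;

% Average the surviving 4s spectra inside each 30s block
block_start = [0 30];
s30 = zeros(numel(f), 2);
nspec = zeros(1, 2);
for b = 1:2
    in_block = starts >= block_start(b) & starts + 4 <= block_start(b) + 30;
    keep = in_block & ~reject;
    nspec(b) = sum(keep);
    s30(:, b) = mean(spec(:, keep), 2);
end
```

With a 1s step, each 30s block contains 27 windows of 4s. In this synthetic signal the amplitude artifact touches 4 windows in the first block and the flat segment touches 6 in the second, so nspec comes out as [23 21].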
Contaminated 4s spectra are automatically excluded by get_tfs and are therefore not seen by later stages of processing. In fact, the 4s spectra are created in get_tfs but are only used internally and never returned. However, the 30s spectra are always produced, unless there are more than 26 consecutive rejected 4s spectra. If there ever is a run of more than 30s of contaminated 4s spectra, a sensible strategy would probably be to return an array of NaNs as the spectrum. When the MCMC routine is then run, it will return NaN for chisq, the same as if the parameters were not allowed. As before, it would be up to the fit wrapper (e.g. fit_cluster.m) to decide what to do if a NaN spectrum is encountered.
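A minimal sketch of the NaN strategy suggested above, on toy numbers. The variable names and the block bookkeeping are hypothetical and do not reflect how get_tfs or fit_cluster.m are written.

```matlab
spec = [1 2 3 7 8 9; 4 5 6 1 1 1];             % 2 frequencies x 6 short-window spectra
reject = [false true false true true true];    % every window of the second block is rejected
block = [1 1 1 2 2 2];                         % block that each window belongs to

n_block = max(block);
s = nan(size(spec, 1), n_block);               % blocks default to NaN spectra
nspec = zeros(1, n_block);
for b = 1:n_block
    keep = block == b & ~reject;
    nspec(b) = sum(keep);
    if nspec(b) > 0
        s(:, b) = mean(spec(:, keep), 2);      % average only the accepted windows
    end
end

% A fit wrapper could then skip the blocks that cannot be fitted
fit_ok = ~any(isnan(s), 1);
for b = find(fit_ok)
    % the fitting routine would be called on s(:, b) here
end
```

Here the first block averages two accepted windows, while the second block has nspec equal to 0 and keeps its NaN spectrum, so the wrapper can detect it with isnan before attempting a fit.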
The .mat files generated by import_raw_eeg are loaded for fitting using bt.core.load_subject_data. This function detects sleep onset and truncates the start of the recordings, and also supports selecting only a subset of the electrodes. The fit data is returned in a struct containing fields:
- t
- nspec
- s
- state_str
- state_score
- start_idx
which are directly analogous to those in the raw .mat file, except that they only contain data for the requested electrodes, and the t vector (and all others) start at start_idx from the raw data.
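The relationship between the raw variables and the returned struct can be sketched as below. This only illustrates the truncation and electrode selection described above on synthetic data. It is not the loader's actual code, and the onset index, electrode names and placeholder titles are assumptions.

```matlab
% Synthetic "raw" variables with the layout described for the .mat files
raw.t = (0:5).' * 30;
raw.f = (1:4).';
raw.colheaders = {'Fz', 'Cz', 'Pz'};
raw.s = reshape(1:4*6*3, 4, 6, 3);            % f x t x colheaders
raw.nspec = [27; 27; 20; 27; 27; 25];
raw.state_score = [1; 1; 2; 2; 3; 3];
raw.state_str = {'a'; 'b'; 'W-1 (14/16)'; 'd'; 'e'; 'f'};   % placeholder titles

start_idx = 3;                                % assumed index of sleep onset
wanted = {'Cz'};                              % electrodes to keep

ch = ismember(raw.colheaders, wanted);        % logical mask over electrodes
d.t = raw.t(start_idx:end);
d.nspec = raw.nspec(start_idx:end);
d.s = raw.s(:, start_idx:end, ch);
d.state_str = raw.state_str(start_idx:end);
d.state_score = raw.state_score(start_idx:end);
d.start_idx = start_idx;
```

After truncation every time-indexed field has the same length, and d.t(1) equals raw.t(start_idx).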
Any data structure matching this format can be used with BrainTrak.
Note that state_score primarily determines the colour of the plots, and can be set to an arbitrary value (as long as it is an integer that indexes a row in bt_utils.state_cdata) for data that does not have sleep stage information. Similarly, state_str is primarily used as the title for plots and in sleep-specific analysis routines, and can be set to arbitrary strings for data that does not have sleep stage information.
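For data without sleep staging, placeholder values along the following lines should be enough. The choice of row 1 (assumed to exist in bt_utils.state_cdata) and the title text are arbitrary assumptions for this sketch.

```matlab
n_t = 5;                                       % number of spectra in the data set
d.t = (0:n_t - 1).' * 30;                      % seconds elapsed (30s spacing assumed)

% Every spectrum points at the same colour row (row 1 is assumed to exist)
d.state_score = ones(n_t, 1);

% Use the elapsed time as the plot title for each spectrum
d.state_str = arrayfun(@(x) sprintf('t = %g s', x), d.t, 'UniformOutput', false);
```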
To work with a new data set, the general strategy is to get Matlab arrays corresponding to the time series for each electrode. This step can be nontrivial, depending on the format of the raw data, and typically involves processing with software like EEGLAB.
After that point, bt.data.get_tfs can be used to perform the standard FFT and artifact rejection, and the state information can be converted from the data source or else dummy data can be used.
Finally, the data can be loaded through bt.core.load_subject if desired, or else an entirely different loading function can be used (such as data_examples/load_br_data.m). The only requirement is that this function returns a struct matching the format detailed above.