Learning latent representations on ECoG.
Recommended: within a conda env.

```bash
git clone --recurse-submodules git@github.com:b4yuan/ecog2vec.git
cd ecog2vec
pip install -r requirements.txt
cd packages
pip install -e ecog2vec
pip install -e fairseq
pip install -e utils_jgm
pip install -e machine_learning
pip install -e ecog2txt
```
Repository structure:

```
ecog2vec
|-- manifest
|-- model
|-- notebooks
|-- packages
|   |-- ecog2txt
|   |-- ecog2vec
|   |-- fairseq
|   |-- machine_learning
|   |-- utils_jgm
|-- runs
|-- wav2vec_inputs
|-- wav2vec_outputs
|-- wav2vec_tfrecords
```
Three main Jupyter notebooks live in `notebooks/`:

- `ecog2vec`: trains a `wav2vec` model on ECoG and extracts features (a reference sketch follows this list).
- `vec2txt`: runs `ecog2txt` to decode from the `wav2vec` features.
- `_original_tf_records.ipynb`: creates 'original' tfrecords as inputs to `ecog2txt` to measure baseline performance.
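For reference, extracting features from a trained wav2vec 1.0 checkpoint follows the standard fairseq recipe sketched below. The checkpoint path and input length are hypothetical placeholders; note that upstream wav2vec takes a mono `(batch, time)` waveform, while the multi-channel modification described in the next section gives the input an electrode dimension instead.

```python
# Minimal sketch of wav2vec 1.0 feature extraction with fairseq (upstream recipe).
# The checkpoint path and input length are hypothetical placeholders.
import torch
from fairseq.models.wav2vec import Wav2VecModel

cp = torch.load('model/checkpoint_best.pt')              # hypothetical path
model = Wav2VecModel.build_model(cp['args'], task=None)
model.load_state_dict(cp['model'])
model.eval()

signal = torch.randn(1, 10000)                           # (batch, samples); mono upstream
with torch.no_grad():
    z = model.feature_extractor(signal)                  # local latent features
    c = model.feature_aggregator(signal.new_tensor(z))   # context representations
```

(In the upstream recipe the aggregator is applied directly as `model.feature_aggregator(z)`; the notebooks in this repo handle the fork-specific input shapes.)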
To allow `wav2vec` to accept inputs with multiple channels, two files in the fairseq package need to be changed:
- https://github.com/b4yuan/fairseq/blob/9660fea38cfcdbec67e0e3aba8d7907023a36aa2/fairseq/data/audio/raw_audio_dataset.py#L138
- https://github.com/b4yuan/fairseq/blob/9660fea38cfcdbec67e0e3aba8d7907023a36aa2/fairseq/models/wav2vec/wav2vec.py#L391
Both are set to 256 by default. Change this to the number of electrodes for the patient, less the bad electrodes; a sketch of the arithmetic follows.
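For orientation, and assuming the fork tracks upstream fairseq (where `ConvFeatureExtractionModel` in `wav2vec.py` builds its first `Conv1d` from an `in_d` input-channel count, and the collater in `raw_audio_dataset.py` allocates the batch tensor), the edit amounts to the following. The patient numbers here are hypothetical.

```python
import torch

# Hypothetical patient numbers; 256 is this fork's default.
n_grid = 256                 # total electrodes on the grid
n_bad = 10                   # bad electrodes to exclude (patient-specific)
in_d = n_grid - n_bad        # value to hard-code at both linked lines, here 246

# The dataset collater must then allocate a channel dimension per batch,
# i.e. (batch, channels, time) rather than the mono (batch, time):
batch_size, target_size = 4, 150000
collated_sources = torch.zeros(batch_size, in_d, target_size)
```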