This repository contains code for a graph variational auto-encoder adapted to connectivity matrices that represent functional
connectivity of the brain. The trained model can be used to extract connectivity-related features that serve as input for downstream tasks.
More information can be found in our paper "Graph auto-encoder to identify distinct connectivity
patterns associated with language performance after epilepsy surgery". Trained models for five different seeds and several
cut-off values for the connectivity matrices are also provided. These trained models can be applied to extract subject-specific
features (embeddings) for new datasets, independent of dataset size.
The data used to train the model is part of the 1200 Subjects Release of the Human Connectome Project. In particular, we used the ICA-FIX denoised dataset, which includes only grayordinates and has already gone through initial preprocessing. More information on the release can be found here. To access some of the subject-related information (e.g. zygosity), the Restricted Access Data Use Terms have to be accepted (see Quick Reference: Open Access vs Restricted Data). Data can be downloaded from ConnectomeDB after creating a user account.
It is assumed that the downloaded data has already been mapped to FreeSurfer's fsaverage4 surface in MGH format and is stored in two folders corresponding to the data download:
- root/folder/of/HCP/data/HCP_3T_RESTA_fmri
- root/folder/of/HCP/data/HCP_3T_RESTB_fmri
We also applied bandpass filtering and global signal regression, which is indicated in the file name (e.g. lh.rfMRI_REST1_LR_Atlas_hp2000_clean_bpss_gsr_fs4.mgh).
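For illustration, a minimal numpy/scipy sketch of these two denoising steps. The 0.01-0.1 Hz band, the HCP repetition time of 0.72 s, and the filter order are assumptions for the example, not values taken from our pipeline:

```python
import numpy as np
from scipy.signal import butter, filtfilt

def gsr(ts):
    """Regress the standardized global mean signal out of each vertex time series.

    ts: array of shape (n_vertices, n_timepoints).
    """
    g = ts.mean(axis=0)                      # global signal
    g = (g - g.mean()) / g.std()             # standardize the regressor
    beta = ts @ g / (g @ g)                  # per-vertex regression weight
    return ts - np.outer(beta, g)            # residual time series

def bandpass(ts, low=0.01, high=0.1, tr=0.72):
    """Butterworth band-pass filter applied along the time axis."""
    nyq = 0.5 / tr                           # Nyquist frequency in Hz
    b, a = butter(2, [low / nyq, high / nyq], btype="band")
    return filtfilt(b, a, ts, axis=1)        # zero-phase filtering
```

After `gsr`, each vertex's residual is orthogonal to the standardized global signal by construction.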
To map from HCP space to fsaverage4 you can follow this manual. It requires Connectome Workbench and FreeSurfer, both of which can be installed for free. The FreeSurfer command mris_convert is useful for converting between image file formats.
- settings.py: set the paths of the folders used to store connectivity matrices and time series
- HCP_create_ts.py: extract time series based on the Schaefer 100 parcellation; the parcellation files can be found in the Deliveries folder
- correlation.py: calculate connectivity matrices; the threshold needs to be set in the script
- HCP_train.py: train a new model
- get_embedding.py: use a trained auto-encoder to extract individual embeddings for subjects, which can be used for a downstream classification task
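To illustrate the time-series extraction step, a minimal numpy sketch that averages vertex signals within each parcel. The function name, array shapes, and the use of -1 for unassigned vertices are assumptions for the example, not the actual HCP_create_ts.py code:

```python
import numpy as np

def parcel_timeseries(vertex_ts, labels, n_parcels):
    """Average vertex time series within each parcel.

    vertex_ts: (n_vertices, n_timepoints) surface data.
    labels:    (n_vertices,) parcel index per vertex; -1 marks
               unassigned vertices, which are skipped.
    Returns an array of shape (n_parcels, n_timepoints).
    """
    out = np.zeros((n_parcels, vertex_ts.shape[1]))
    for p in range(n_parcels):
        mask = labels == p
        if mask.any():
            out[p] = vertex_ts[mask].mean(axis=0)
    return out
```

The resulting parcel-by-time matrix is what the correlation step operates on.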
In the Deliveries folder we provide files needed by the scripts:
- the parcellation files Schaefer2018_Parcels_7Networks_oder.annot for the left and right hemisphere: the parcellation for fsaverage5 was taken from here and mapped to fsaverage4 with FreeSurfer's mri_surf2surf command
- Subjects.csv: contains the IDs of the subjects used in our paper; when running the scripts, it is regenerated with additional subject-related information
- 0verticeslh.npy and 0verticesrh.npy: vertices without MRI signal in the HCP data; they lie on the medial wall of both hemispheres
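A short sketch of how such a vertex mask can be applied to exclude signal-free vertices before further processing; the synthetic index array here stands in for the contents of 0verticeslh.npy:

```python
import numpy as np

# Indices of vertices without signal (in practice: np.load("0verticeslh.npy"))
no_signal = np.array([0, 3])

vertex_ts = np.arange(10.0).reshape(5, 2)   # 5 vertices, 2 timepoints
keep = np.setdiff1d(np.arange(vertex_ts.shape[0]), no_signal)
valid_ts = vertex_ts[keep]                  # only vertices with signal remain
```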
This folder contains trained models for different seeds and different connectome densities.
- full: all positive correlations are kept
- top50: only positive correlation values above the 50th percentile are kept; each correlation value is considered only once
- top90: only positive correlation values above the 90th percentile are kept; each correlation value is considered only once
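The three densities can be reproduced by percentile thresholding of the positive correlations. A sketch, assuming the percentile is computed over the positive off-diagonal upper-triangle values so that each correlation is counted once:

```python
import numpy as np

def threshold_connectome(corr, percentile=None):
    """Keep positive correlations, optionally only those above a percentile.

    corr: symmetric (n, n) correlation matrix.
    percentile: None keeps all positive values ("full"); 50 / 90
                correspond to the "top50" / "top90" settings.
    """
    adj = np.where(corr > 0, corr, 0.0)          # drop negative correlations
    np.fill_diagonal(adj, 0.0)                   # no self-connections
    if percentile is not None:
        iu = np.triu_indices_from(adj, k=1)      # each edge counted once
        vals = adj[iu]
        cut = np.percentile(vals[vals > 0], percentile)
        adj[adj < cut] = 0.0
    return adj
```

Because the thresholding is applied element-wise, the returned matrix stays symmetric.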