Skip to content

MR-MT3: Memory Retaining Multi-Track Music Transcription to Mitigate Instrument Leakage

License

Notifications You must be signed in to change notification settings

gudgud96/MR-MT3

Folders and files

NameName
Last commit message
Last commit date

Latest commit

826ea84 · Jul 12, 2024

History

88 Commits
Jun 10, 2024
Jun 6, 2024
Apr 8, 2024
Apr 8, 2024
Jul 12, 2023
Oct 2, 2023
Jan 29, 2024
Jun 10, 2024
Oct 20, 2023
Apr 22, 2022
Jan 29, 2024
Apr 8, 2024
Jun 13, 2023
Jul 12, 2024
Feb 2, 2024
Oct 20, 2023
Apr 8, 2024
Oct 20, 2023
Apr 8, 2024
Oct 20, 2023
Jun 10, 2024
Jun 10, 2024
Jun 10, 2024
Jun 10, 2024
Oct 26, 2023

Repository files navigation

MR-MT3

Code accompanying paper: MR-MT3: Memory Retaining Multi-Track Music Transcription to Mitigate Instrument Leakage.

Setup steps

conda create --name mrmt3 python==3.10 -y
conda activate mrmt3
python -m pip install --upgrade pip
python -m pip install transformers==4.18.0
python -m pip install torch
python -m pip install librosa==0.9.1
python -m pip install t5==0.9.3
python -m pip install note-seq==0.0.3
python -m pip install pretty-midi==0.2.9
python -m pip install einops==0.4.1
python -m pip install ddsp                                 <- changed : Dont specify version, when installing python 3.10, Because of llvmlite, numba
python -m pip install tensorflow==2.11
python -m pip install tensorflow-text==2.11
python -m pip install protobuf==3.20
python -m pip install numpy==1.23.5                        <- changed : Scipy requires numpy>=1.23.5
python -m pip install hydra-core==1.2.0
python -m pip install typing_extensions
python -m pip install torchaudio                           <- changed : required 
python -m pip install tensorflow_probability==0.19.0       <- changed : tensorflow==2.11 requires downgrading this library 
python -m pip install pytorch_lightning                    <- changed : for import the model

Next, you need to download the required datasets and postprocess them:

For Slakh

  1. Re-sample Slakh .flac to 16kHz - python3 resample.py.

  2. Create the grouped stem version as ground truth instead of the existing all_src.mid. Some bass notes have octave errors - python3 midi_script.py.

  3. python3 tools/generate_inst_names.py

For ComMU

  1. Download ComMU dataset - https://github.com/POZAlabs/ComMU-code/tree/master/dataset

  2. cd scripts/commu/ -> ./process_commu_dataset.sh

For NSynth

  1. Download NSynth dataset validation split.

  2. cd scripts/nsynth/ -> python3 convert_nsynth_json_to_midi.py

Training

Refer to train.sh for a list of train commands corresponding to all of our experiments.

Evaluation

You can download our pretrained models at: https://huggingface.co/gudgud1014/MR-MT3/tree/main

Refer to test.sh, for a list of test commands corresponding to all of our experiments.

Basically, each command runs test.py that:

  • transcribe MIDI files based on a given eval.audio_dir;
  • compute multi-instrument F1 score w.r.t. the ground truth MIDI.

License

MIT

Citations

If you find our research useful, kindly cite us at:

@article{tan2024mr,
  title={MR-MT3: Memory Retaining Multi-Track Music Transcription to Mitigate Instrument Leakage},
  author={Tan, Hao Hao and Cheuk, Kin Wai and Cho, Taemin and Liao, Wei-Hsiang and Mitsufuji, Yuki},
  journal={arXiv preprint arXiv:2403.10024},
  year={2024}
}

Credits

Huge shoutout to @kunato as we largely based our initial MT3 experiments on his implementation - https://github.com/kunato/mt3-pytorch.

About

MR-MT3: Memory Retaining Multi-Track Music Transcription to Mitigate Instrument Leakage

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published