- Inference Code
- Pretrained Models
- Web Demo
- Training Code
conda create -n gesturelsm python=3.10  # PyTorch 2.1.2 conda builds require Python <= 3.11
conda activate gesturelsm
conda install pytorch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 pytorch-cuda=11.8 -c pytorch -c nvidia
pip install -r requirements.txt
bash demo/install_mfa.sh
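Optionally, a quick sanity check (not part of the repo) can confirm that the new environment sees PyTorch and the GPU before you download any models:

```python
# Optional sanity check for the freshly created environment (not part of the repo).
import torch

print("PyTorch:", torch.__version__)            # expected: 2.1.2
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```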
# Download the pretrained models (Shortcut, Diffusion, and RVQ-VAEs)
gdown https://drive.google.com/drive/folders/1OfYWWJbaXal6q7LttQlYKWAy0KTwkPRw?usp=drive_link -O ./ckpt --folder
# Download the SMPL model
gdown https://drive.google.com/drive/folders/1MCks7CMNBtAzU2XihYezNmiGT_6pWex8?usp=drive_link -O ./datasets/hub --folder
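If the downloads succeed, the checkpoints should sit under `./ckpt` and the SMPL files under `./datasets/hub` (the `-O` targets above). A small optional sketch to confirm the folders are not empty:

```python
# Quick check that the gdown downloads produced non-empty folders.
# Paths are taken from the -O targets above; adjust if you changed them.
from pathlib import Path

for folder in ("ckpt", "datasets/hub"):
    entries = list(Path(folder).rglob("*"))
    print(f"{folder}: {len(entries)} files/dirs")
    assert entries, f"{folder} is empty; re-run the matching gdown command"
```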
Required for evaluation and training; not needed for running the web demo or inference.
- Download the original raw data
bash preprocess/bash_raw_cospeech_download.sh
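The raw download is large, so a rough check of how much data landed on disk can be useful. The `datasets/` location below is an assumption based on the paths used earlier; the actual output folder is whatever `preprocess/bash_raw_cospeech_download.sh` writes to:

```python
# Rough sanity check after the raw-data download (location assumed, not guaranteed).
from pathlib import Path

total = sum(p.stat().st_size for p in Path("datasets").rglob("*") if p.is_file())
print(f"datasets/ currently holds {total / 1e9:.1f} GB")
```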
Requires the dataset downloaded above.
python test.py -c configs/shortcut_rvqvae_128.yaml
python demo.py -c configs/shortcut_rvqvae_128_hf.yaml
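Both scripts take a YAML config via `-c`. If you want to inspect or tweak a config before running them, a minimal sketch using PyYAML is shown below; the key name `test_ckpt` is a placeholder for illustration, not the repo's actual schema:

```python
# Illustrative only: inspect or modify a config such as configs/shortcut_rvqvae_128.yaml
# before passing it to test.py / demo.py. Key names here are hypothetical.
import yaml

with open("configs/shortcut_rvqvae_128.yaml") as f:
    cfg = yaml.safe_load(f)

print(sorted(cfg.keys()))   # see which options the config exposes

# cfg["test_ckpt"] = "ckpt/my_model.bin"          # hypothetical override
# with open("configs/my_config.yaml", "w") as f:
#     yaml.safe_dump(cfg, f)
```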
Our code partially borrows from SynTalker, EMAGE, and DiffuseStyleGesture. Thanks to their authors; please check out these useful repos.
If you find our code or paper helpful, please consider citing:
@misc{liu2025gesturelsmlatentshortcutbased,
  title={GestureLSM: Latent Shortcut based Co-Speech Gesture Generation with Spatial-Temporal Modeling},
  author={Pinxin Liu and Luchuan Song and Junhua Huang and Chenliang Xu},
  year={2025},
  eprint={2501.18898},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2501.18898},
}