
MorphoSim: An Interactive, Controllable, and Editable Language-guided 4D World Simulator

👤 Xuehai He* · Shijie Zhou* · Thivyanth Venkateswaran · Kaizhi Zheng · Ziyu Wan · Achuta Kadambi · Xin Eric Wang

🌐 Project Website  |  🔗 arXiv  |  🎬 Demo Video

TODO

  • Gradio Demo
  • Hugging Face setup
  • Release training code
  • Release inference code
  • Release demo

Environment Setup

bash env.sh
conda activate morphosim

Manual environment setup (CUDA 11.8 example)

# Conda (recommended)
conda create -n morphosim python=3.10 -y
conda activate morphosim

# PyTorch + CUDA 11.8
conda install pytorch==2.0.1 torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia -y
pip3 install -U xformers==0.0.28.post1 --index-url https://download.pytorch.org/whl/cu118
FORCE_CUDA=1 pip install "git+https://github.com/facebookresearch/pytorch3d.git"

# Project deps
python -m pip install pip==24.1.2
pip install -r requirements.txt

# Optional extra deps used in scripts
# Note: the wheel index below targets torch 2.4.1 + cu118; pick the index
# that matches your installed PyTorch/CUDA build
pip install torch-sparse -f https://data.pyg.org/whl/torch-2.4.1+cu118.html
pip install torch-scatter -f https://data.pyg.org/whl/torch-2.4.1+cu118.html
pip install torch-cluster -f https://data.pyg.org/whl/torch-2.4.1+cu118.html
pip install torch_geometric
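
A quick sanity check that the install is importable (a minimal sketch; the printed versions will reflect whatever you installed above):

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
python -c "import torch_geometric; print(torch_geometric.__version__)"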

Configure credentials and paths

  • Create a .env file in the repo root and set tokens as needed (a minimal example follows below):
    • HF_TOKEN/HUGGING_FACE_HUB_TOKEN for Hugging Face models
    • OPENROUTER_API_KEY (if using OpenRouter)
  • Adjust DATA_ROOT and OUTPUT_ROOT if your data/output locations differ
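
A minimal .env sketch (variable names from the list above; all values are placeholders):

HF_TOKEN=<your-hugging-face-token>
HUGGING_FACE_HUB_TOKEN=<your-hugging-face-token>
OPENROUTER_API_KEY=<your-openrouter-key>
DATA_ROOT=./data
OUTPUT_ROOT=./output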

Data layout

./data
└── davis_dev
    └── train
        └── preprocess
            ├── images
            └── semantic_features
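
After preprocessing, you can confirm the layout matches the tree above (a sketch; adjust the scene path as needed):

for d in images semantic_features; do
  test -d ./data/davis_dev/train/preprocess/"$d" || echo "missing: $d"
done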

Prepare features, train, and visualize

# Prepare dataset
python prepare.py --config ./configs/wild/prepare_davis.yaml --src ./data/davis_dev/train

# Extract features
python internvideo_chat_feature/internvideo_extract_feat.py --video_path ./data/davis_dev/train/preprocess
python sam2/sam2_extract_feat.py --video_path ./data/davis_dev/train/preprocess
cd lseg_encoder && \
python -u encode_images.py --backbone clip_vitl16_384 --weights demo_e200.ckpt \
  --widehead --no-scaleinv \
  --outdir ./data/davis_dev/train/preprocess/semantic_features/rgb_feature_langseg \
  --test-rgb-dir ./data/davis_dev/train/preprocess/images --workers 0 && cd ..

# Train and visualize
python run.py --config ./configs/wild/davis.yaml --src ./data/davis_dev/train
python viz.py --config ./configs/wild/davis.yaml --root ./output/train/32_channel/final_viz --save ./output/train/final_viz
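
Assuming the default paths above, the training and visualization outputs can be inspected directly:

ls ./output/train/32_channel/final_viz   # training output consumed by viz.py
ls ./output/train/final_viz              # final rendered visualization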

Run directional guided video generation first

Refer to the code here: https://github.com/eric-ai-lab/Mojito

Editing with viz_agent.py (three edit types)

viz_agent.py supports three operations via natural language prompts:

  • Color change (color_func)
  • Deletion (remove an object)
  • Extraction (isolate an object)

General form:

python viz_agent.py \
  --config ./configs/wild/davis.yaml \
  --root ./output/<scene>/32_channel/final_viz \
  --user_prompt "<instruction>"

  • Color change example:

python viz_agent.py --config ./configs/wild/davis.yaml \
  --root ./output/cows/32_channel/final_viz \
  --user_prompt "change the cow color to purple"

  • Deletion example:

python viz_agent.py --config ./configs/wild/davis.yaml \
  --root ./output/cows/32_channel/final_viz \
  --user_prompt "delete the cow"

  • Extraction example:

python viz_agent.py --config ./configs/wild/davis.yaml \
  --root ./output/cows/32_channel/final_viz \
  --user_prompt "extract the cow"

Notes:

  • Set OPENROUTER_API_KEY in your environment if using the default openrouter provider (an inline example follows these notes), or configure the Azure variables if using --api xh-gpt4.1.
  • Outputs are saved under --output_root (default ./output) in a subfolder named agentic_edit.
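
For a one-off run with the default openrouter provider, the key can also be set inline (placeholder value):

OPENROUTER_API_KEY=<your-openrouter-key> python viz_agent.py \
  --config ./configs/wild/davis.yaml \
  --root ./output/cows/32_channel/final_viz \
  --user_prompt "delete the cow"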

Individual steps

  • Prepare dataset:
python prepare.py --config ./configs/wild/prepare_davis.yaml --src ./data/davis_dev/train
  • InternVideo2 features:
cd internvideo_chat_feature
python internvideo_extract_feat.py --video_path ./data/davis_dev/train/preprocess
cd ..
  • SAM2 features:
cd sam2
python sam2_extract_feat.py --video_path ./data/davis_dev/train/preprocess
cd ..
  • LangSeg features:
cd lseg_encoder
python -u encode_images.py --backbone clip_vitl16_384 --weights demo_e200.ckpt \
  --widehead --no-scaleinv \
  --outdir ./data/davis_dev/train/preprocess/semantic_features/rgb_feature_langseg \
  --test-rgb-dir ./data/davis_dev/train/preprocess/images --workers 0
cd ..
  • Train and visualize:
python run.py --config ./configs/wild/davis.yaml --src ./data/davis_dev/train
python viz.py --config ./configs/wild/davis.yaml --root ./output/train/32_channel/final_viz --save ./output/train/final_viz

Demo video

See the demo video linked at the top of this README.

Citation

@misc{he2025morphosiminteractivecontrollableeditable,
      title={MorphoSim: An Interactive, Controllable, and Editable Language-guided 4D World Simulator}, 
      author={Xuehai He and Shijie Zhou and Thivyanth Venkateswaran and Kaizhi Zheng and Ziyu Wan and Achuta Kadambi and Xin Eric Wang},
      year={2025},
      eprint={2510.04390},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2510.04390}, 
}
