Code for "Advancing SMoE for Continuous Domain Adaptation of MLLMs: Adaptive Router and Domain-Specific Loss" (ACL 2025)

MLLMs-CDA

This repo contains the code and data for Advancing SMoE for Continuous Domain Adaptation of MLLMs: Adaptive Router and Domain-Specific Loss (ACL 2025). It proposes a novel framework for Continuous Domain Adaptation (CDA) of Multimodal Large Language Models (MLLMs) that improves the model's domain learning ability while avoiding catastrophic forgetting.

Training

Continuous domain learning: Medicine → Chart → Math

sh MLLM_CDA/scripts/v1_5/finetune_task_moe_med.sh

sh MLLM_CDA/scripts/v1_5/finetune_task_moe_chart.sh

sh MLLM_CDA/scripts/v1_5/finetune_task_moe_math.sh

Inference & Evaluation

Load the SMoE modules for all domains and test the model on each domain:

sh MLLM_CDA/scripts/v1_5/eval/mult_domain_med.sh

sh MLLM_CDA/scripts/v1_5/eval/mult_domain_chart.sh

sh MLLM_CDA/scripts/v1_5/eval/mult_domain_math.sh

Note

The code is adapted from LLaVA-1.5, a training framework for a family of open large multimodal models.

Our method is mainly implemented in the file `LLaVA/llava/moelib/layers.py`:

  • The MoE_MLP() function adds the SMoE module to the FFN sublayer of the LLM.
  • The AR_loss_T() function implements our domain-specific autoregressive loss.
  • The Token_balance_los() function computes our expert balance loss.
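To illustrate the two MoE pieces named above, here is a minimal, framework-agnostic sketch in NumPy. The function names `moe_mlp` and `token_balance_loss`, the top-k routing scheme, and the Switch-Transformer-style balance penalty are assumptions for illustration; the repo's actual implementation in `layers.py` is in PyTorch and differs in detail (e.g. the adaptive router and domain-specific loss).

```python
import numpy as np

def moe_mlp(x, w_gate, experts, top_k=2):
    """Sketch of a sparse-MoE FFN sublayer: route each token to its
    top-k experts and mix their outputs by renormalised router scores.
    x: (num_tokens, d_model); w_gate: (d_model, num_experts);
    experts: list of callables mapping (d_model,) -> (d_model,)."""
    logits = x @ w_gate                                  # (tokens, experts)
    probs = np.exp(logits - logits.max(-1, keepdims=True))
    probs /= probs.sum(-1, keepdims=True)                # softmax router scores
    top = np.argsort(-probs, axis=-1)[:, :top_k]         # chosen expert ids
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        weights = probs[t, top[t]]
        weights = weights / weights.sum()                # renormalise over chosen experts
        for w, e in zip(weights, top[t]):
            out[t] += w * experts[e](x[t])
    return out, probs, top

def token_balance_loss(probs, top, num_experts):
    """Load-balancing penalty in the style of the Switch Transformer:
    num_experts * sum_i (fraction of tokens routed to expert i)
                        * (mean router probability of expert i)."""
    tokens = probs.shape[0]
    frac_tokens = np.bincount(top[:, 0], minlength=num_experts) / tokens
    mean_prob = probs.mean(axis=0)
    return num_experts * float(frac_tokens @ mean_prob)
```

The penalty is minimised when routing is uniform across experts, which discourages the router from collapsing onto a single expert during a new domain's training.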
