This repo contains the code and data for Advancing SMoE for Continuous Domain Adaptation of MLLMs: Adaptive Router and Domain-Specific Loss (ACL 2025), a novel framework for Continuous Domain Adaptation (CDA) of Multimodal Large Language Models (MLLMs) that improves the model's ability to learn new domains while avoiding catastrophic forgetting.
Continuous domain learning: Medicine → Chart → Math
sh MLLM_CDA/scripts/v1_5/finetune_task_moe_med.sh
sh MLLM_CDA/scripts/v1_5/finetune_task_moe_chart.sh
sh MLLM_CDA/scripts/v1_5/finetune_task_moe_math.sh
Load the SMoE modules for all domains and test the model on each domain:
sh MLLM_CDA/scripts/v1_5/eval/mult_domain_med.sh
sh MLLM_CDA/scripts/v1_5/eval/mult_domain_chart.sh
sh MLLM_CDA/scripts/v1_5/eval/mult_domain_math.sh
The code is adapted from LLaVA-1.5, the training framework for the LLaVA family of open large multimodal models.
Our method is mainly implemented in the file "LLaVA/llava/moelib/layers.py". The MoE_MLP() function adds the SMoE module to the FFN sublayer of the LLM.
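For orientation, the snippet below is a minimal PyTorch sketch of an SMoE FFN of this kind: a learned router scores each token and dispatches it to its top-k expert MLPs. The class, argument, and variable names (SparseMoEMLP, num_experts, top_k, etc.) are illustrative assumptions, not the actual MoE_MLP() code in layers.py.

```python
# Illustrative sketch only -- names and shapes are assumptions, not the
# exact MoE_MLP() implementation in LLaVA/llava/moelib/layers.py.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoEMLP(nn.Module):
    """FFN sublayer replaced by a sparse mixture of expert MLPs."""

    def __init__(self, hidden_size, ffn_size, num_experts=4, top_k=1):
        super().__init__()
        self.top_k = top_k
        # Router scores every token against every expert.
        self.router = nn.Linear(hidden_size, num_experts, bias=False)
        # Each expert is a standard two-layer FFN.
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(hidden_size, ffn_size),
                nn.SiLU(),
                nn.Linear(ffn_size, hidden_size),
            )
            for _ in range(num_experts)
        )

    def forward(self, x):                      # x: (batch, seq, hidden)
        logits = self.router(x)                # (batch, seq, num_experts)
        probs = F.softmax(logits, dim=-1)
        weights, indices = probs.topk(self.top_k, dim=-1)
        out = torch.zeros_like(x)
        # Route each token only to its top-k experts (sparse activation).
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = (indices[..., k] == e)
                if mask.any():
                    out[mask] += weights[..., k][mask].unsqueeze(-1) * expert(x[mask])
        return out, probs                      # probs are reused by the balance loss
```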
The AR_loss_T() function implements our domain-specific autoregressive loss, and the Token_balance_los() function calculates our expert balance loss.
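As a rough guide to these two terms, the sketch below shows one common way to write them: a next-token cross-entropy that can be restricted to the current domain's targets, and a load-balancing term in the load-times-importance style popularized by Switch Transformers. The function names, the domain_mask argument, and the exact formulation are assumptions for illustration; the real AR_loss_T() and Token_balance_los() may differ.

```python
# Illustrative sketch only -- not the exact losses in layers.py.
import torch
import torch.nn.functional as F

def domain_ar_loss(logits, labels, domain_mask=None, ignore_index=-100):
    """Next-token cross-entropy; domain_mask (optional, assumed) restricts
    the loss to tokens belonging to the current domain."""
    shift_logits = logits[:, :-1, :].contiguous()
    shift_labels = labels[:, 1:].contiguous()
    if domain_mask is not None:
        shift_labels = shift_labels.masked_fill(~domain_mask[:, 1:], ignore_index)
    return F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
        ignore_index=ignore_index,
    )

def token_balance_loss(router_probs):
    """Encourage a uniform token load across experts (assumed formulation)."""
    num_experts = router_probs.size(-1)
    probs = router_probs.reshape(-1, num_experts)            # (tokens, experts)
    # Fraction of tokens hard-assigned to each expert ...
    load = F.one_hot(probs.argmax(dim=-1), num_experts).float().mean(dim=0)
    # ... and the mean router probability per expert (soft assignment).
    importance = probs.mean(dim=0)
    return num_experts * torch.sum(load * importance)
```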