AttMoE code issue: modified MoE model code not provided #8

Open
CXL-edu opened this issue Mar 23, 2024 · 1 comment

CXL-edu commented Mar 23, 2024

[Image]

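Is the MoE used here the code from https://github.com/XiuzeZhou/mixture-of-experts, i.e. https://github.com/davidmrau/mixture-of-experts? In AttMoE-NASA.ipynb and AttMoE-CALCE.ipynb you define the model below. The arguments passed to MoE are not the parameters of the original library, so I cannot tell which parts were modified afterwards. Your repository provides neither the modified code nor the source of the original MoE code.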

import torch.nn as nn
from mixture_of_experts import MoE

class AttMoE(nn.Module):
    def __init__(self, feature_size=16, hidden_dim=8, num_layers=1, nhead=4, dropout=0., dropout_rate=0.2, 
                 num_experts=8, device='cpu'):
        super(AttMoE, self).__init__()
        self.feature_size, self.hidden_dim = feature_size, hidden_dim
        self.dropout = nn.Dropout(dropout_rate)
        self.cell = Attention(feature_size=feature_size, hidden_dim=hidden_dim, nhead=nhead, dropout=dropout)
        self.linear = nn.Linear(hidden_dim, 1)
        
        experts = nn.Linear(hidden_dim, hidden_dim)
        # create moe layers based on the number of experts
        self.moe = MoE(dim=hidden_dim, num_experts=num_experts, experts=experts)
        self.moe = self.moe.to(device)
 
    def forward(self, x): 
        out = self.dropout(x)  # note: this result is unused, because self.cell below is called on x
        out = self.cell(x)   # cell output shape: (batch_size, seq_len=1, feature_size)
        out,_ = self.moe(out)
        out = out.reshape(-1, self.hidden_dim) # (batch_size, hidden_dim)
        out = self.linear(out)  # shape: (batch_size, 1)
        
        return out

XiuzeZhou (Owner) commented

The mixture-of-experts library is on GitHub: https://github.com/lucidrains/mixture-of-experts
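
The parameter mismatch raised in the question can be checked mechanically by binding the keyword arguments from the notebook call, `MoE(dim=..., num_experts=..., experts=...)`, against a constructor signature. The sketch below uses only the standard library. The two classes are hypothetical stand-ins, not the real libraries: their parameter names are a partial recollection of the davidmrau and lucidrains constructors and should be verified against whichever package is actually installed. `accepts_kwargs` is likewise a helper name invented for this illustration.

```python
import inspect


def accepts_kwargs(factory, **kwargs):
    """Return True if `factory` could be called with exactly these keyword arguments."""
    try:
        # bind() raises TypeError on unknown or missing required parameters,
        # without actually calling the constructor.
        inspect.signature(factory).bind(**kwargs)
    except TypeError:
        return False
    return True


# Hypothetical stand-in: parameter names ASSUMED to resemble davidmrau/mixture-of-experts.
class DavidmrauStyleMoE:
    def __init__(self, input_size, output_size, num_experts, hidden_size,
                 noisy_gating=True, k=4):
        pass


# Hypothetical stand-in: a SUBSET of parameters ASSUMED to resemble
# lucidrains/mixture-of-experts.
class LucidrainsStyleMoE:
    def __init__(self, dim, num_experts=16, hidden_dim=None, loss_coef=1e-2,
                 experts=None):
        pass


# Keyword arguments as used in the AttMoE model quoted in the issue
# (object() stands in for the nn.Linear expert).
call_from_issue = dict(dim=8, num_experts=8, experts=object())

print("davidmrau-style :", accepts_kwargs(DavidmrauStyleMoE, **call_from_issue))
print("lucidrains-style:", accepts_kwargs(LucidrainsStyleMoE, **call_from_issue))
```

With a real installation, passing the imported `MoE` class to `accepts_kwargs` in place of the stand-ins gives the same kind of check without constructing a model.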
