Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Phi 3.5 MoE #116

Merged
merged 6 commits into from
Oct 22, 2024
Merged

Add Phi 3.5 MoE #116

merged 6 commits into from
Oct 22, 2024

Conversation

DePasqualeOrg
Copy link
Contributor

@DePasqualeOrg DePasqualeOrg commented Aug 31, 2024

This is my attempt to port the Phi 3.5 MoE model from the Python implementation. Unfortunately I can't test it myself, since my MacBook doesn't have enough RAM. I marked two places in PhiMoE.swift with comments starting with !! which need to be checked. You can test this with ModelConfiguration.phi3_5MoE. Go ahead and make any necessary changes if you'd like, since I won't be able to run this myself.

@davidkoski
Copy link
Collaborator

I will give it a try on Tuesday!

@davidkoski
Copy link
Collaborator

It looks like there is a problem loading the weights:

Error: Mismatched parameter weight shape. Actual [16, 6400, 512], expected [16, 6400, 4096]

this is on SwitchLinear (maybe this error needs some more context).

SwitchLinear(bias=nil, inputDims=4096, numExperts=16, outputDims=6400)

And actually the parameters has a biases which is not expected here:

(lldb) po parameters.mapValues { $0.shape }
▿ [
  biases: [16, 6400, 64],
  scales: [16, 6400, 64],
  weight: [16, 6400, 512]
]

I need to look into this further

@davidkoski
Copy link
Collaborator

davidkoski commented Sep 4, 2024

OK, I think SwitchLinear isn't quite right -- it is missing the bias (at the very least):

and I think the dimension mismatch comes down to quantization -- the SwitchLinear isn't being replaced by SwitchLinearQuantized because it doesn't implement the protocol:

and I think QuantizedSwitchLinear will need to be a subtype of SwitchLinear for the replacement to happen -- this @ModuleInfo(key: "gate_proj") var gateProj: SwitchLinear requires that the type be SwitchLinear or a subtype. Linear/QuantizedLinear are modeled the same way.

@DePasqualeOrg
Copy link
Contributor Author

Thanks for that feedback. I've tried to make those changes, although I don't know how helpful this will be, since I unfortunately can't test it myself. If it's too much trouble and you'd prefer to focus on other priorities, don't worry about it.

@davidkoski
Copy link
Collaborator

OK, I will see if I can test/finish this -- it may be a few days before I get a chance.

@davidkoski
Copy link
Collaborator

Sorry for the long delay here -- was busy busy busy!

I fixed up the quantized switch layers -- there were some issues with how they got initialized.

The couple of lines you had commented as maybe issues were exactly the spots with problems. With a little testing everything is set there.

Model loaded -> id("mlx-community/Phi-3.5-MoE-instruct-4bit")
Starting generation ...
What is the gravity on Mars and the moon?The gravity on Mars and the Moon is weaker than Earth's due to their smaller masses and sizes.

On Mars, gravity is about 38% of Earth's gravity. Specifically, the gravitational acceleration on Mars is approximately 3.71 meters per second squared (m/s²) compared to Earth's 9.81 m/s².

On the Moon, gravity is about 16.5% of Earth's gravity
------
Prompt:     13 tokens, 12.989868 tokens/s
Generation: 100 tokens, 61.009203 tokens/s, 1.639097s

That takes about 22G to run.

I think this is ready to merge once tests pass.

@davidkoski
Copy link
Collaborator

davidkoski commented Oct 22, 2024

Actually we may want to merge in the changes from #135 once that merges...

Nope, no conflicts.

Copy link
Collaborator

@davidkoski davidkoski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the contribution!!

@davidkoski davidkoski merged commit fad4ad0 into ml-explore:main Oct 22, 2024
3 checks passed
@DePasqualeOrg
Copy link
Contributor Author

Fantastic, thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants