Small fixes and refactoring #1861

mseeger · 2024-12-07T17:09:26Z

Mostly refactoring and small fixes:

Add config.sliding_window_layer_period instead of overwriting config.sliding_window_layer_placing (which changed its type from str to int)
Comment on config.attention_logit_softcapping: using it will hurt performance
cos, sin passed down in GPT.forward must have a batch dimension if input_pos is None. This probably worked before because PyTorch adds singleton dimensions at the front of a shape for broadcasting, but it did confuse me.
Block.forward: Missed self.post_mlp_norm if parallel_residual. BTW: None of your models have parallel_residual = True, is anybody using this?
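
For readers skimming the thread, the first item can be sketched roughly as follows. This is a pared-down, hypothetical ToyConfig, not the real litgpt Config: only the three field names come from the PR, while the accepted string values and the mapping to 1 and 2 are assumptions made for illustration. The point is that the user-facing string keeps its declared type and the derived integer lives in its own field.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyConfig:
    """Hypothetical, pared-down config; not the real litgpt Config."""

    sliding_window_size: Optional[int] = None
    # User-facing option; stays a string for the object's whole lifetime.
    sliding_window_layer_placing: Optional[str] = None
    # Derived integer, stored separately instead of overwriting the string.
    sliding_window_layer_period: Optional[int] = None

    def __post_init__(self) -> None:
        if self.sliding_window_size is None:
            return  # sliding-window attention disabled, nothing to derive
        # Assumed mapping, for illustration only: "all" means every layer,
        # "interleaved" means every second layer.
        periods = {None: 1, "all": 1, "interleaved": 2}
        if self.sliding_window_layer_placing not in periods:
            raise ValueError(
                f"unknown placing: {self.sliding_window_layer_placing!r}"
            )
        self.sliding_window_layer_period = periods[self.sliding_window_layer_placing]


config = ToyConfig(sliding_window_size=4096, sliding_window_layer_placing="interleaved")
```

After construction, config.sliding_window_layer_placing is still the string that was passed in, and code that needs the number reads config.sliding_window_layer_period.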

Not doing anything new. Feel free to reject, but this would help to make the code easier to understand.
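
The broadcasting remark in the cos, sin item can be illustrated without PyTorch. The function below is a minimal sketch of the general NumPy/PyTorch-style broadcasting rule, written in plain Python. The name broadcast_shape and the sizes B, T and n_elem are made up for this example and are not taken from litgpt.

```python
from itertools import zip_longest
from typing import Tuple

Shape = Tuple[int, ...]


def broadcast_shape(a: Shape, b: Shape) -> Shape:
    """Result shape of combining `a` and `b` under broadcasting rules."""
    result = []
    # Walk both shapes from the last dimension backwards; a missing
    # leading dimension is treated as size 1.
    for dim_a, dim_b in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if dim_a != dim_b and 1 not in (dim_a, dim_b):
            raise ValueError(f"shapes {a} and {b} do not broadcast")
        # A size-1 dimension stretches to match the other operand.
        result.append(dim_b if dim_a == 1 else dim_a)
    return tuple(reversed(result))


B, T, n_elem = 2, 5, 8  # hypothetical batch size, sequence length, rope width

without_batch_dim = broadcast_shape((B, T, n_elem), (T, n_elem))
with_batch_dim = broadcast_shape((B, T, n_elem), (1, T, n_elem))
```

Both calls give the same result shape, which is why a cos or sin tensor without the leading dimension would still combine correctly. An explicit singleton batch dimension only makes the intended layout visible to the reader. With PyTorch installed, torch.broadcast_shapes should report the same shapes.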

Andrei-Aksionov · 2024-12-08T16:01:38Z

Hello @mseeger

Thanks for the PR!

Block.forward: Missed self.post_mlp_norm if parallel_residual.

Nice catch! 🙂

BTW: None of your models have parallel_residual = True, is anybody using this?

Some older models supported parallel_residual. In #1821, some of those models were dropped. I guess one of the future PRs will drop the unused code. That, for instance, should make the MLP class easier to read.
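
The post_mlp_norm fix discussed above can be sketched with a toy block that works on plain numbers. ToyBlock and its scalar "layers" are hypothetical stand-ins, not litgpt code, and details such as a shared attention norm are left out. The sketch only shows the control flow described in the PR: the parallel and the sequential residual path both send the MLP output through post_mlp_norm.

```python
from dataclasses import dataclass
from typing import Callable

Fn = Callable[[float], float]


@dataclass
class ToyBlock:
    """Hypothetical scalar stand-in for a transformer block (not litgpt code)."""

    norm_1: Fn
    attn: Fn
    norm_2: Fn
    mlp: Fn
    post_mlp_norm: Fn
    parallel_residual: bool

    def forward(self, x: float) -> float:
        attention_output = self.attn(self.norm_1(x))
        if self.parallel_residual:
            # Parallel: the MLP reads the block input, not the attention result.
            mlp_input = self.norm_2(x)
            x = attention_output + x
        else:
            # Sequential: the MLP reads the residual stream after attention.
            x = attention_output + x
            mlp_input = self.norm_2(x)
        # Applied on both paths; per the PR it was skipped on the parallel one.
        return self.post_mlp_norm(self.mlp(mlp_input)) + x


def make_block(parallel_residual: bool) -> ToyBlock:
    # Made-up layers: the "+ 100" makes post_mlp_norm visible in the result.
    return ToyBlock(
        norm_1=lambda v: v,
        attn=lambda v: 2 * v,
        norm_2=lambda v: v,
        mlp=lambda v: 3 * v,
        post_mlp_norm=lambda v: v + 100,
        parallel_residual=parallel_residual,
    )


parallel_out = make_block(True).forward(1)
sequential_out = make_block(False).forward(1)
```

With these made-up layers the parallel path returns 106 and the sequential path 112. The offset of 100 appears in both, so neither path skips post_mlp_norm.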

mseeger · 2024-12-11T13:33:58Z

How does it work for this project? I suppose somebody needs to review. I don't see why the one test (Thunder) is failing.

litgpt/config.py

litgpt/model.py

Andrei-Aksionov · 2024-12-15T11:22:56Z

Hello @mseeger

Thanks for the PR. Looks good 🚀.
I only left a couple of nits.

I don't see why the one test (Thunder) is failing.

I'll fix it today.

mseeger · 2024-12-15T13:42:19Z

OK, incorporated requested changes

Andrei-Aksionov · 2024-12-24T19:24:32Z

Hello @mseeger

Sorry for the delay.
Now the code is ready to merge.

Thanks again for the PR! 🚀

mseeger requested review from rasbt and lantiga as code owners December 7, 2024 17:09

mseeger force-pushed the small_fixes branch from 4cc9f5d to 4ba0bce on December 7, 2024 19:43

mseeger force-pushed the small_fixes branch 2 times, most recently from f37bb78 to 9573be6 on December 9, 2024 07:42

Andrei-Aksionov approved these changes Dec 15, 2024


Small fixes and refactoring

ecccb97

mseeger force-pushed the small_fixes branch from c01b31a to ecccb97 on December 15, 2024 13:41

Adjust assertion for rope meta

19604c3

Andrei-Aksionov merged commit db308ba into Lightning-AI:main Dec 24, 2024
9 checks passed
