Small fixes and refactoring #1861

Merged (2 commits) on Dec 24, 2024


mseeger commented Dec 7, 2024

Mostly refactoring and small fixes:

  • Introduce config.sliding_window_layer_period instead of overwriting config.sliding_window_layer_placing (which changed its type from str to int)
  • Add a comment on config.attention_logit_softcapping: using it will hurt performance
  • cos and sin passed down in GPT.forward must have a batch dimension if input_pos is None. This probably worked because PyTorch adds singleton dimensions at the front of a shape for broadcasting, but it did confuse me.
  • Block.forward: Missed self.post_mlp_norm if parallel_residual. BTW: None of your models have parallel_residual = True, is anybody using this?
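
To illustrate the broadcasting remark in the third bullet, here is a minimal pure-Python sketch of the standard right-aligned broadcasting rule that PyTorch follows. It is not litgpt code: the helper `broadcast_shape` and the example sizes (batch 2, 4 heads, sequence length 8, 16 rotary dimensions) are assumptions chosen for illustration only.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Broadcast two shapes: align from the right, pad missing leading dims with 1."""
    result = []
    for da, db in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if da != db and da != 1 and db != 1:
            raise ValueError(f"shapes {a} and {b} are not broadcastable")
        # If one side is 1, the other side determines the size.
        result.append(db if da == 1 else da)
    return tuple(reversed(result))


# Hypothetical shapes, not taken from litgpt.
q_shape = (2, 4, 8, 16)      # (batch, heads, seq_len, rope_dims)
cos_no_batch = (8, 16)       # (seq_len, rope_dims): leading dims are padded implicitly
cos_with_batch = (1, 8, 16)  # (1, seq_len, rope_dims): batch dim is explicit

print(broadcast_shape(q_shape, cos_no_batch))
print(broadcast_shape(q_shape, cos_with_batch))
```

Both calls yield the same result shape, which is why the version without a batch dimension could work silently. Making the batch dimension explicit removes the reliance on implicit padding. PyTorch exposes the same rule as `torch.broadcast_shapes`.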

This does not add anything new. Feel free to reject, but it would help make the code easier to understand.
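
The first bullet above can be sketched as follows: rather than overwriting a string-typed field with an integer, a separate integer field is derived from it. This is a hypothetical toy, not the litgpt `Config` class. Only the two field names come from this PR; the class name `ToyConfig`, the values `"all"` and `"interleaved"`, and their mapping to periods 1 and 2 are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyConfig:
    # User-facing setting; stays a string for its whole lifetime.
    sliding_window_layer_placing: Optional[str] = None
    # Derived value; never passed in by the caller.
    sliding_window_layer_period: Optional[int] = field(default=None, init=False)

    def __post_init__(self):
        placing = self.sliding_window_layer_placing
        if placing is None:
            return
        # Assumed mapping for illustration only.
        periods = {"all": 1, "interleaved": 2}
        if placing not in periods:
            raise ValueError(f"unknown placing: {placing!r}")
        self.sliding_window_layer_period = periods[placing]


cfg = ToyConfig(sliding_window_layer_placing="interleaved")
print(cfg.sliding_window_layer_placing, cfg.sliding_window_layer_period)
```

Keeping the two fields separate means each attribute has a single type, so readers and type checkers do not have to track when a str turns into an int.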

@Andrei-Aksionov
Collaborator

Hello @mseeger

Thanks for the PR!

> Block.forward: Missed self.post_mlp_norm if parallel_residual.

Nice catch! 🙂

> BTW: None of your models have parallel_residual = True, is anybody using this?

Some older models used parallel_residual. Some of those models were dropped in #1821. I guess one of the future PRs will drop the unused code, which should, for instance, make the MLP class easier to read.
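
For readers unfamiliar with the two code paths, here is a minimal sketch of a transformer block with and without a parallel residual, with the post-MLP norm applied in both branches as the fix describes. It is a toy on plain floats, not the litgpt `Block.forward`: the function `block_forward`, its keyword arguments, and the stand-in sublayers are all assumptions for illustration.

```python
def block_forward(x, *, attn, mlp, norm_1, norm_2,
                  post_attention_norm, post_mlp_norm, parallel_residual):
    """Toy transformer block on scalars; sublayers are plain callables."""
    attn_out = post_attention_norm(attn(norm_1(x)))
    if parallel_residual:
        # Attention and MLP both read the block input; outputs are summed.
        mlp_out = post_mlp_norm(mlp(norm_2(x)))
        return x + attn_out + mlp_out
    # Sequential: the MLP reads the result of the attention residual.
    x = x + attn_out
    return x + post_mlp_norm(mlp(norm_2(x)))


def identity(v):
    return v


# Stand-in sublayers chosen so the effect of each step is easy to follow.
layers = dict(
    attn=lambda v: 2 * v,
    mlp=lambda v: v + 1,
    norm_1=identity,
    norm_2=identity,
    post_attention_norm=identity,
    post_mlp_norm=lambda v: 10 * v,
)

print(block_forward(1.0, parallel_residual=True, **layers))   # 23.0
print(block_forward(1.0, parallel_residual=False, **layers))  # 43.0
```

If the parallel branch skipped `post_mlp_norm`, the first call would return 5.0 instead of 23.0 in this toy setup, which shows how the omission changes the output whenever the post-MLP norm is not the identity.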

mseeger force-pushed the small_fixes branch 2 times, most recently from f37bb78 to 9573be6, December 9, 2024 07:42
@mseeger
Contributor Author

mseeger commented Dec 11, 2024

How does the review process work for this project? I suppose somebody needs to review. I don't see why the one test (Thunder) is failing.

litgpt/config.py: outdated, resolved
litgpt/model.py: outdated, resolved
@Andrei-Aksionov
Collaborator

Hello @mseeger

Thanks for the PR. Looks good 🚀.
I only left a couple of nits.

> I don't see why the one test (Thunder) is failing.

I'll fix it today.

@mseeger
Contributor Author

mseeger commented Dec 15, 2024

OK, I incorporated the requested changes.

@Andrei-Aksionov
Collaborator

Hello @mseeger

Sorry for the delay.
Now the code is ready to merge.

Thanks again for the PR! 🚀

Andrei-Aksionov merged commit db308ba into Lightning-AI:main on Dec 24, 2024
9 checks passed