Paper: Mamba Models a replacement for Transformers? #917
added my name
added SSM and S4 sections
Signed-off-by: saike148 <[email protected]>
Signed-off-by: saike148 <[email protected]>
added key differences and updated the equation formatting
added scipy and mamba synergy
…proceedings into subhros_paper
Subhros paper
image position change
added saikrishna myst data
added mamba banner
initial edits
…proceedings into subhros_paper
Thank you for the great paper. State-space models are something I have used in my own career and I am amazed that they can be applied in this way. Is it possible to share the URL of the GitHub repository you used for this model in this paper? I am sure it would be even better if you could share it so we can actually implement this idea.
I can share the code originally written by the authors of Mamba: https://github.com/state-spaces/mamba
Thanks, I was making that statement based on the fact that SciPy papers often emphasize being able to run the code and reproduce it. Do I need to run both the Transformer and Mamba code to understand the results of this paper? I think this paper could be an excellent introduction to Mamba. To that end, I think it would be good to add the appropriate links (including for Transformers) :)
Sure, I can add those links in the paper. I will make these changes right away.
@tkoyama010 please add any further changes if needed after review.
Hi @tkoyama010 and @HaoZeke! In case a little extra time is needed, the initial complete review deadline has been extended to next Wednesday, July 3rd.
Yes please. Thanks!
Hi @tkoyama010, all working and good!
LGTM. Thanks!
added clarity for vanishing gradient problem
This reverts commit cebe7d3.
Hi everyone @tkoyama010 @HaoZeke @ameyxd @mepa, I hope you’re all doing well. If there are any additional suggestions, feedback, or corrections needed before the author revision period ends on August 7th, please let me know. I’d be happy to make the necessary changes and incorporate your feedback. Thank you very much!
Hi @tkoyama010 and @HaoZeke - Do you feel that this paper is ready for inclusion in the Proceedings? @tkoyama010, I see that you have approved the PR so will assume "yes" unless I hear otherwise. Thanks for reviewing!
Yes! I am ready for it.
LGTM, thanks for working on this!
LGTM as well; go ahead :)
Thanks a lot @tkoyama010 and @HaoZeke for the approval!!
Thanks very much for reviewing, @tkoyama010 and @HaoZeke.
If you are creating this PR in order to submit a draft of your paper, please name your PR with "Paper: <title>". An editor will then add a "paper" label and GitHub Actions will be run to check and build your paper. See the project readme for more information.
Editor: Meghann Agarwal @mepa
Reviewers: