Issues: aws-neuron/transformers-neuronx
Llama and Mistral model inputs change after using different computation
#104 opened Dec 20, 2024 by kahfizulkifli
Loading compiled fails: model_type=bert -> transformers being used in compiled config
#102 opened Dec 2, 2024 by michaelfeil
Gibberish output for princeton-nlp/Sheared-LLaMA-1.3B with continuous batching
#94 opened Jul 15, 2024 by pinak-p
Latest changes introduced for continuous batching break Mixtral model [bug]
#84 opened Apr 15, 2024 by dacorvo
Backward compatibility with saved llama 2 compiled artifacts [enhancement]
#78 opened Jan 18, 2024 by dacorvo
User feedback when compiling and reloading a large model [enhancement]
#76 opened Jan 17, 2024 by dacorvo
Inferring logits from model.forward for the entire batch instead of the last forward's output [documentation]
#73 opened Jan 10, 2024 by michaelfeil
Any solution to save the converted model? [enhancement]
#29 opened Aug 14, 2023 by aliseyfi
Discrepancies Between GPU and Neuron-based Outputs for GPTJ Model on inf2.24xlarge
#28 opened Aug 13, 2023 by ho4040