mit-han-lab / llm-awq Public

Issues: mit-han-lab/llm-awq

149 Open 54 Closed

Issues list

vila15_demo.py need deepspeed installed in Orin

#265 opened Mar 5, 2025 by StephenChou0119

Setup.py file - Version Issues

#262 opened Feb 16, 2025 by sfatimakhan

Killed: Out of Memory on Jetson Orin

#261 opened Feb 14, 2025 by sfatimakhan

AttributeError: type object 'LlavaMetaForCausalLM' has no attribute 'prepare_inputs_labels_for_multimodal'

#260 opened Feb 7, 2025 by Shashank-singh7

ValueError: llm_cfg mm_projector_cfg vision_tower_cfg not found in the config

#259 opened Jan 26, 2025 by CHENWENJIE0423

latest autoawq version 0.2.8 seems to be broken

#258 opened Jan 25, 2025 by JohnConnor123

INT4-AWQ PPL results for LLaMA-3 model are not as expected

#257 opened Jan 23, 2025 by lisuying214

Couldn't get models run on vllm?

#256 opened Jan 21, 2025 by kzos

ValueError: llm_cfg mm_projector_cfg vision_tower_cfg not found in the config.

#250 opened Dec 28, 2024 by Wotoosh

Support NVILA 15B

#249 opened Dec 24, 2024 by anhnhust

[BUG] GPU memory used is much more in v0.2.7 than v0.2.5 while quantizing models.

#247 opened Dec 18, 2024 by GodHforever

AWQ quantization doesn't work in many opensource LLM in terms of inference efficiency

#243 opened Dec 10, 2024 by loulianzhang

Don't work on CPU "Unable to get JIT kernel for brgemm"

#241 opened Nov 25, 2024 by andretisch

Inquiry about GPU memory usage of VILA 1.5-3b AWQ model for 12 frames video.

#240 opened Nov 18, 2024 by gj-raza

RuntimeError: CUDA error: no kernel image is available for execution on the device

#238 opened Nov 15, 2024 by new-Sunset-shimmer

Could you explain me how can I change the percentage of kept salient weights in FP16?

#237 opened Nov 15, 2024 by akylbekmaxutov

Cannot clone from Efficient-Large-Model/VILA.git, Dependency Issues with alternative

#236 opened Nov 14, 2024 by rossgreer

[QST] Why does awq write its own int3/int4 GEMM kernels instead of using CUTLASS

#235 opened Nov 11, 2024 by SimpleTheoryOfTypes

Unable to run Gradio demo: VILA with TinyChat on a local GPU server

#234 opened Nov 4, 2024 by mitraavi

Support for llava_next Architecture in LLM-AWQ (Issue with Quantizing llava-hf/llava-v1.6-mistral-7b-hf)

#233 opened Nov 1, 2024 by ShobhaRajanna

How to convert the AWQ model after the quantization into safetensors

#232 opened Oct 31, 2024 by vladimiralbrekhtccr

Regarding the issues encountered with w_bit 3 quantification

#231 opened Oct 30, 2024 by langxinspieder

About the use of calibration sets

#230 opened Oct 30, 2024 by langxinspieder

Questions on the AWQ

#229 opened Oct 23, 2024 by suhcrates-web

No video inference code

#227 opened Oct 16, 2024 by Closertodeath
