Hello all,
Can ModelOpt enable a wikitext-like, task-based accuracy test on the quantized output model for NVFP4?
The exported model contains some shape fusion due to the packing mechanism that stores 2 FP4 values in 1 INT8, so its structure differs from the original model's.
How can lm_eval support the quantized model?
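For readers unfamiliar with why the exported shapes change: each INT8 byte holds two 4-bit codes, so the packed weight dimension is half the original. Below is a minimal NumPy sketch of that nibble-packing idea; it is an illustration only, not ModelOpt's actual export code, and the function names are hypothetical.

```python
import numpy as np

def pack_fp4(nibbles: np.ndarray) -> np.ndarray:
    """Pack pairs of 4-bit codes (values 0..15) into single uint8 bytes.

    Illustrative only: shows why the packed last dimension is half the
    original, which is what makes the exported tensor shapes differ.
    """
    assert nibbles.shape[-1] % 2 == 0, "last dim must be even to pack in pairs"
    lo = nibbles[..., 0::2] & 0x0F        # even positions -> low nibble
    hi = nibbles[..., 1::2] & 0x0F        # odd positions  -> high nibble
    return ((hi << 4) | lo).astype(np.uint8)

def unpack_fp4(packed: np.ndarray) -> np.ndarray:
    """Inverse: recover the original sequence of 4-bit codes."""
    lo = packed & 0x0F
    hi = (packed >> 4) & 0x0F
    out = np.empty(packed.shape[:-1] + (packed.shape[-1] * 2,), dtype=np.uint8)
    out[..., 0::2] = lo
    out[..., 1::2] = hi
    return out

codes = np.array([[1, 15, 7, 2]], dtype=np.uint8)
packed = pack_fp4(codes)       # shape (1, 2): last dim halved
restored = unpack_fp4(packed)  # shape (1, 4): matches the input
```

Because evaluation harnesses typically expect the original (unpacked) layout, running them directly on a packed checkpoint requires the loader to unpack or dequantize first.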
//==================================//
python lm_eval_hf.py \
    --model hf \
    --model_args pretrained= \
    --quant_cfg NVFP4_DEFAULT_CFG \
    --tasks wikitext \
    --batch_size 4
//==================================//
Does the command shown above support testing the exported model?