Improve DeepSeek Generation Speed with FP8 Quantization

**Is your feature request related to a problem? Please describe.**
Support FP8 (block-quant recipe) for DeepSeek models to accelerate generation.
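
For reference, here is a minimal sketch of what a block-quant recipe could look like: each tile of a weight matrix gets its own scale, so an outlier only reduces precision inside its own tile. The issue does not specify the recipe, so the E4M3 format, the default block size of 128, and the function names `quantize_blockwise` and `dequantize_blockwise` are all assumptions for illustration. FP8 rounding is simulated in plain Python. A real implementation would store actual FP8 values and use scaled GEMM kernels.

```python
import math

FP8_E4M3_MAX = 448.0  # largest finite magnitude in the E4M3 format

def round_to_e4m3(x):
    """Simulate E4M3 rounding: 3 explicit mantissa bits, subnormals ignored."""
    m, e = math.frexp(x)  # x == m * 2**e with 0.5 <= |m| < 1 (or m == 0)
    y = math.ldexp(round(m * 16) / 16, e)  # keep 1 implicit + 3 explicit bits
    return max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, y))

def quantize_blockwise(w, block=128):
    """Return (q, scales): simulated-FP8 values and one scale per block x block tile.

    `w` is a list of equal-length rows (hypothetical layout for this sketch).
    """
    rows, cols = len(w), len(w[0])
    q = [[0.0] * cols for _ in range(rows)]
    scales = []
    for r0 in range(0, rows, block):
        scale_row = []
        for c0 in range(0, cols, block):
            tile_rows = range(r0, min(r0 + block, rows))
            tile_cols = range(c0, min(c0 + block, cols))
            # Map the largest magnitude in the tile onto the FP8 maximum.
            amax = max(abs(w[r][c]) for r in tile_rows for c in tile_cols)
            scale = amax / FP8_E4M3_MAX if amax > 0 else 1.0
            for r in tile_rows:
                for c in tile_cols:
                    q[r][c] = round_to_e4m3(w[r][c] / scale)
            scale_row.append(scale)
        scales.append(scale_row)
    return q, scales

def dequantize_blockwise(q, scales, block=128):
    """Multiply every value by the scale of the tile it belongs to."""
    return [[v * scales[r // block][c // block] for c, v in enumerate(row)]
            for r, row in enumerate(q)]
```

With `block=2` on a small matrix, a large value in one tile leaves the small values in a neighbouring tile at full relative precision, which is the motivation for per-block rather than per-tensor scales.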


