
[Migraphx EP] Static int8 QDQ support (#17931) #23

Merged

merged 1 commit into rocm6.0_internal_testing on Nov 10, 2023

Conversation

TedThemistokleous

Description

Adding static int8 quantization support for the MIGraphX Execution Provider:

  • Allows parsing of calibration tables generated by ONNX Runtime's or TensorRT's toolsets
  • Adds the proper environment variables to the MIGraphX EP
  • Updates the Python API to include the new execution provider flags, which were missing on the Python side (see the sketch after this list)
  • Hooks into MIGraphX's int8 quantization and optimization of models
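
A minimal sketch of how these flags might be passed through the Python API, assuming provider option keys modeled on the TensorRT EP's naming convention; the exact keys, the calibration table filename, and the model path are illustrative placeholders, not confirmed by this PR:

```python
import onnxruntime as ort

# Assumed option names, mirroring the TensorRT EP convention
# (trt_int8_enable -> migraphx_int8_enable, etc.); verify the exact
# keys against the MIGraphX EP docs for your onnxruntime build. The
# same settings are described as reachable via environment variables
# (names also assumptions, e.g. ORT_MIGRAPHX_INT8_ENABLE).
migraphx_options = {
    "migraphx_int8_enable": True,
    "migraphx_int8_calibration_table_name": "calibration.flatbuffers",
    "migraphx_use_native_calibration_table": False,
}

sess = ort.InferenceSession(
    "model.onnx",  # placeholder model path
    providers=[
        ("MIGraphXExecutionProvider", migraphx_options),
        "CPUExecutionProvider",  # fallback for unsupported nodes
    ],
)
```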

Motivation and Context

Required so that we can run models through onnxruntime while leveraging the existing tooling for int8 static QDQ quantization.
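
For context, a rough sketch of producing a calibration table with that existing quantization tooling, following the pattern in the TensorRT EP samples; `create_calibrator`, `write_calibration_table`, and `CalibrationDataReader` are onnxruntime.quantization helpers, but their exact signatures have shifted between releases, and the model path and input feed here are placeholders:

```python
import numpy as np
from onnxruntime.quantization import (
    CalibrationDataReader,
    CalibrationMethod,
    create_calibrator,
    write_calibration_table,
)

class RandomDataReader(CalibrationDataReader):
    """Feeds a few representative batches to the calibrator."""
    def __init__(self, n_batches=8):
        # Placeholder input name/shape; use real preprocessed data
        # for a meaningful calibration table.
        self._feeds = iter(
            [{"input": np.random.rand(1, 3, 224, 224).astype(np.float32)}
             for _ in range(n_batches)]
        )

    def get_next(self):
        return next(self._feeds, None)

calibrator = create_calibrator(
    "model.onnx",
    calibrate_method=CalibrationMethod.MinMax,
)
calibrator.collect_data(RandomDataReader())
# Newer onnxruntime releases expose compute_data(); older ones
# named this compute_range().
write_calibration_table(calibrator.compute_data())
```

The resulting table is what the MIGraphX EP is being taught to parse in this PR.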

This is the first in a series of PRs that will add further operator-level static quantization as MIGraphX releases additional support.

These changes drew heavily from the TensorRT EP and should allow for similar functionality for GPU-based (versus CPU) quantization of models before inference is performed.


Co-authored-by: Ted Themistokleous <[email protected]>
Co-authored-by: Ted Themistokleous <[email protected]>
@TedThemistokleous TedThemistokleous added the enhancement New feature or request label Nov 10, 2023
@TedThemistokleous TedThemistokleous self-assigned this Nov 10, 2023
@jeffdaily jeffdaily merged commit 8fdf6a7 into rocm6.0_internal_testing Nov 10, 2023
12 of 14 checks passed