Migrate CoreMLQuantizer to ET #16473
base: main
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/16473
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 New Failures, 1 Unrelated Failure as of commit 24d83be with merge base 913436a.
NEW FAILURES - The following jobs have failed:
UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@metascroy has imported this pull request. If you are a Meta employee, you can view this in D90200393.
Review thread on the updated imports:

from torchao.quantization.pt2e.fake_quantize import FakeQuantize as _FakeQuantize
from torchao.quantization.pt2e.observer import (
I think it's better for these to continue to use torch.ao since we are planning to deprecate these in torchao/pt2e
Can you say more on the deprecation plan?
Oh wait, I remember there might be some incompatibilities between the observers in torchao/pt2e vs. torch.ao.
Does the previous torch.ao import work for CoreMLQuantizer?
I didn't try it; I assumed it wouldn't work. In all other quantizers we migrated to use the observers in torchao.
Were the observers removed from torch.ao?
Yeah, I think they don't work.
The observers are not removed from torch.ao; it's just that we'd like to deprecate them together with all the fx/eager flows.
I just updated my PR (apple/coremltools#2634). It seems that coremltools is currently using the same observers for both the fx and pt2e flows, and since torchao pt2e uses a different set of observer/fake_quant classes, we can't make all tests pass.
Any other concerns with this PR?
cc @jerryzh168
Summary

Migrate `CoreMLQuantizer` to use observer/fake-quantize primitives from `torchao` instead of `torch.ao.quantization`.

Background

The files in `backends/apple/coreml/quantizer/` were initially copied from the corresponding quantizer files in coremltools. This PR updates those files to use the new `torchao` quantization primitives.

Motivation

The quantization primitives in `torch.ao.quantization` are being deprecated in favor of the `torchao` library. This PR updates the CoreML backend's quantizer to use the new `torchao.quantization.pt2e` APIs for observer and fake-quantize operations, ensuring compatibility with the PyTorch ecosystem going forward.
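For illustration, here is a minimal sketch of the kind of import change this migration entails. The new paths are taken from the change list below; the old `torch.ao.quantization` paths are shown only for comparison and are an assumption about the prior code, not a verbatim diff.

```python
# Old (deprecated) torch.ao paths, shown for comparison only:
# from torch.ao.quantization.fake_quantize import FakeQuantize
# from torch.ao.quantization.observer import MinMaxObserver
# from torch.ao.quantization.quantizer import QuantizationSpec

# New torchao pt2e paths used by this PR:
from torchao.quantization.pt2e.fake_quantize import FakeQuantize
from torchao.quantization.pt2e.observer import MinMaxObserver
from torchao.quantization.pt2e.quantizer import QuantizationSpec
```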
Changes

`_annotation_config.py` (updated imports to use `torchao.quantization.pt2e`):
- `FakeQuantize` from `torchao.quantization.pt2e.fake_quantize`
- Observers (`MinMaxObserver`, `MovingAverageMinMaxObserver`, `PerChannelMinMaxObserver`, `MovingAveragePerChannelMinMaxObserver`) from `torchao.quantization.pt2e.observer`
- `QuantizationSpec` from `torchao.quantization.pt2e.quantizer`

`_coreml_quantizer.py` (updated imports):
- `Quantizer` base class from `torchao.quantization.pt2e.quantizer.quantizer`
- `get_module_name_filter` from `torchao.quantization.pt2e.quantizer.utils`

`_coreml_quantizer_utils.py` (updated imports and usage):
- `QuantizationAnnotation`, `QuantizationSpec`, `QuantizationSpecBase`, `SharedQuantizationSpec`, `FixedQParamsQuantizationSpec`, and `Q_ANNOTATION_KEY` from `torchao.quantization.pt2e.quantizer.quantizer`
- `get_module_name_filter` from `torchao.quantization.pt2e.quantizer.utils`
- `_get_aten_graph_module_for_pattern` from `torchao.quantization.pt2e.utils`
- Replaced `"quantization_annotation"` strings with the `Q_ANNOTATION_KEY` constant for consistency

`test_coreml_quantizer.py` (updated test imports):
- `prepare_pt2e`, `prepare_qat_pt2e`, `convert_pt2e` from `torchao.quantization.pt2e.quantize_pt2e`
- `FakeQuantizeBase` from `torchao.quantization.pt2e.fake_quantize`
- Added a `test_fake_quantize_modules_inserted_after_prepare` test to verify that `FakeQuantizeBase` modules are correctly inserted after the prepare step (a usage sketch follows below)
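As orientation for reviewers, below is a hedged usage sketch of the migrated quantizer with the torchao pt2e flow, mirroring what the tests exercise. The `CoreMLQuantizer` import path, the `LinearQuantizerConfig.from_dict` keys, and the export entry point are assumptions based on the existing CoreML backend docs and coremltools conventions, not part of this diff.

```python
import torch
from coremltools.optimize.torch.quantization import (
    LinearQuantizerConfig,
    QuantizationScheme,
)
from executorch.backends.apple.coreml.quantizer import CoreMLQuantizer
from torchao.quantization.pt2e.quantize_pt2e import convert_pt2e, prepare_pt2e


class SmallModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 8, 3)
        self.relu = torch.nn.ReLU()

    def forward(self, x):
        return self.relu(self.conv(x))


model = SmallModel().eval()
example_inputs = (torch.randn(1, 3, 32, 32),)

# Config classes stay on coremltools, per the note below.
config = LinearQuantizerConfig.from_dict(
    {
        "global_config": {
            "quantization_scheme": QuantizationScheme.symmetric,
            "activation_dtype": torch.quint8,
            "weight_dtype": torch.qint8,
            "weight_per_channel": True,
        }
    }
)
quantizer = CoreMLQuantizer(config)

# Export to an ATen graph module; the exact export entry point may vary by
# PyTorch version (export_for_training is an assumption here).
graph_module = torch.export.export_for_training(model, example_inputs).module()

prepared = prepare_pt2e(graph_module, quantizer)  # inserts torchao observers/fake-quants
prepared(*example_inputs)                         # calibrate
converted = convert_pt2e(prepared)                # fold observers into q/dq ops
```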
Note

Configuration classes (`LinearQuantizerConfig`, `ModuleLinearQuantizerConfig`, `QuantizationScheme`) remain imported from `coremltools.optimize.torch`, as they are config-level abstractions that don't depend on the deprecated `torch.ao` primitives.

Test Plan
- Added `test_fake_quantize_modules_inserted_after_prepare`, which verifies that `FakeQuantizeBase` modules from torchao are correctly inserted after both the `prepare_pt2e` (PTQ) and `prepare_qat_pt2e` (QAT) steps
- Existing tests (`test_conv_relu`, `test_linear`) continue to pass, validating the end-to-end quantization flow with the new torchao primitives
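For reference, a minimal sketch of the kind of check the new test performs, reusing the `graph_module` and `quantizer` names from the usage sketch above (the actual test body may differ):

```python
from torchao.quantization.pt2e.fake_quantize import FakeQuantizeBase
from torchao.quantization.pt2e.quantize_pt2e import prepare_qat_pt2e

# QAT prepare should insert torchao FakeQuantizeBase submodules into the graph.
prepared = prepare_qat_pt2e(graph_module, quantizer)
fake_quants = [m for m in prepared.modules() if isinstance(m, FakeQuantizeBase)]
assert len(fake_quants) > 0
```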