Fixes to parse DynamicQuantizeLinear #2896
- Remove preemptive fold at beginning of parse for inputs
- Fix error with sub/division used for zero point
- Add proper reduce_max/min to get scales
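For context on the zero-point and scale items, here is a minimal pure-Python sketch of the uint8 arithmetic as described in the ONNX DynamicQuantizeLinear operator definition. It is not the MIGraphX parser code: the function name `dynamic_quantize_linear`, the flat-list input, and the all-zero fallback are assumptions made for illustration only.

```python
def dynamic_quantize_linear(x, qmin=0, qmax=255):
    """Quantize a flat list of floats to uint8; returns (y, y_scale, y_zero_point)."""
    # Widen the observed range to include 0.0 so zero is exactly representable.
    x_min = min(0.0, min(x))  # reduce_min over the whole tensor
    x_max = max(0.0, max(x))  # reduce_max over the whole tensor

    y_scale = (x_max - x_min) / (qmax - qmin)
    if y_scale == 0.0:
        # All-zero input. Falling back to 1.0 is a choice made for this sketch only.
        y_scale = 1.0

    def saturate(v):
        # Clamp to the quantized range [qmin, qmax].
        return max(qmin, min(qmax, v))

    # Zero point: subtract the scaled minimum from qmin, clamp, then round.
    # Python's round() rounds halves to even, as the ONNX definition requires.
    y_zero_point = int(round(saturate(qmin - x_min / y_scale)))

    # Quantize: divide by the scale, round, add the zero point, clamp.
    y = [int(saturate(round(v / y_scale) + y_zero_point)) for v in x]
    return y, y_scale, y_zero_point


y, scale, zero_point = dynamic_quantize_linear([0.0, 2.0, -3.0, 1.34])
print(y, scale, zero_point)
```

The subtraction and division in the zero-point line and the min/max reductions feeding the scale are the two pieces the list above refers to; getting either one wrong shifts every quantized value.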
@pfultz2 I'm not seeing much of a perf boost with this relative to develop, if any. The bigger perf boost comes from the other changeset, when we add in the pass.
This build is OK for merge ✅
🔴 bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@           Coverage Diff            @@
##           develop    #2896   +/-   ##
========================================
  Coverage    91.84%   91.84%
========================================
  Files          478      478
  Lines        18179    18181    +2
========================================
+ Hits         16696    16698    +2
  Misses        1483     1483
Observed a 20% increase in runs when using these changes, likely due to the removal of flatten.

Separated out from #2826, as we're no longer observing mismatches off 6.1 and develop for uint8/int8 when DynamicQuantizeLinear is used.