Parser changes to handle MatMulIntegerToFloat #3445
base: develop
Conversation
TODO:
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##           develop    #3445      +/-   ##
===========================================
- Coverage    92.23%   92.21%    -0.02%
===========================================
  Files          514      514
  Lines        21746    21810       +64
===========================================
+ Hits         20057    20113       +56
- Misses        1689     1697        +8

☔ View full report in Codecov by Sentry.
Updated parser to handle bias case as well as bad scale conditions
Initial float/half tests
bad scale tests
bad bias tests
avoid tidy screaming about complexity
Force-pushed from 74f8ae0 to cdb307d
Use dequantizelinear, which eliminates the need to add in shifts due to int8/uint8 mismatches; still needs parser tests
MIGRAPHX_THROW("PARSE_QUANT_DOT_SCALED: Bias have same dim as matrix B column");
}

has_valid_scale_bias = true;
As against invalid? ;-)
If the scale bias doesn't exist, then there isn't a bias added at the end of the MatMulIntegerToFloat.
I was simply wondering if has_scale_bias isn't what the intent is? :-)
return dequantized_op;
}

static instruction_ref handle_scaled_output(const onnx_parser::node_info& info,
Too many parameters. Ideally they should be handled by a struct parameter.
They're the same set of parameters gathered by the operator. These are all needed for the dequantize steps and for adding the proper unsqueeze->transpose paths. Order matters here with respect to matrix input A or B.
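The "struct parameter" refactor being debated can be sketched as follows. This is a hypothetical illustration only: the names (`ScaledOperand`, field names, `handle_scaled_output`) are invented for the sketch and are not MIGraphX types.

```python
from dataclasses import dataclass
from typing import Any, Optional

# Hypothetical grouping of the loose arguments discussed above.
@dataclass
class ScaledOperand:
    data: Any                  # quantized input matrix (A or B)
    scale: Any                 # per-column scale values
    zero_point: Optional[Any]  # optional zero point
    is_b_matrix: bool          # order matters: A vs B picks different axes

def handle_scaled_output(a: ScaledOperand, b: ScaledOperand) -> tuple:
    # Two grouped operands replace a long flat parameter list.
    return (a, b)

a_op = ScaledOperand(data=[[1, 2]], scale=[0.5, 0.5], zero_point=None, is_b_matrix=False)
b_op = ScaledOperand(data=[[3], [4]], scale=[0.25], zero_point=0, is_b_matrix=True)
result = handle_scaled_output(a_op, b_op)
assert result[0].is_b_matrix is False and result[1].is_b_matrix is True
```

The trade-off is the one the author raises: the fields still have to be gathered in operand order, so the struct only moves where that ordering lives.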
Use the parsed-in op name in error messages to help logging should parser errors occur.
Change naming to be agnostic of input index.
Force-pushed from 42b787d to 9660e11
bool a1_has_no_zp = (a1 == zp_a1);

auto unsq_scale_a0 = info.add_instruction(make_op("unsqueeze", {{"axes", {-1}}}), scale_a0);
if(not a0_has_no_zp)
(Nit) Style: perhaps two negatives are not required, if there is a variable like a0_has_zp.
Clean up uint8 handling for quant_dot. Fix tests
Thank you for following up the comments. Approved.
From our conversation, need to test/handle higher dimensional matrix contractions (matrix mul). Also transpose with permutation = {0, 1} probably does nothing.
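The second point is easy to confirm with numpy standing in for the MIGraphX transpose op:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# permutation {0, 1} keeps the axes in their original order: a no-op
identity = np.transpose(a, axes=(0, 1))
# a real transpose permutes the axes and changes the shape
swapped = np.transpose(a, axes=(1, 0))

assert np.array_equal(identity, a)
assert swapped.shape == (3, 2)
```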
Remove the transpose ops here, as this was masking an issue related to input checks for the scale and bias inputs. The scale value is supposed to be matched against the input column value instead of the row. With this in mind we can remove the transpose here. Updated parser tests to handle this correctly. Retested and validated output, since transpose changes the math here.
You're right, but I think when I tried this it wasn't doing what I thought in conjunction with the squeeze at -1. Fixed this and realized I don't need the transpose here. Also I think this should handle N-dim now, since everything is broadcast on a per-column basis for the scale/bias inputs, which should be matched to the column of the input matrix.
Minor: would be good to have a higher ndim test
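A sketch of what such a higher-ndim check could assert, assuming the per-column scale broadcast described above (shapes and scale values are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# batched 3-D contraction: two stacked 2x4 @ 4x3 matrix multiplies
a = rng.integers(-128, 127, size=(2, 2, 4)).astype(np.int8)
b = rng.integers(-128, 127, size=(2, 4, 3)).astype(np.int8)

# one scale per output column of B; broadcasting applies it across every
# batch and every row, with no explicit transpose required
scale_b = np.array([0.5, 0.25, 2.0], dtype=np.float32)

deq_b = b.astype(np.float32) * scale_b   # (2, 4, 3) * (3,) -> (2, 4, 3)
out = a.astype(np.float32) @ deq_b       # (2, 2, 4) @ (2, 4, 3) -> (2, 2, 3)
assert out.shape == (2, 2, 3)
```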
unsq_zp_a0 = info.add_instruction(make_op("unsqueeze", {{"axes", {0}}}), zp_a0);
if(zp_a0->get_shape().scalar())
{
    unsq_zp_a0 =
        info.add_instruction(make_op("unsqueeze", {{"axes", {0}}}), unsq_zp_a0);
Minor: Can these unsqueeze operators be merged, to be either unsqueeze {{axes, {0}}} or unsqueeze {{axes, {0, 1}}}?
I don't think so here, as one path assumes the input is scalar and needs the additional 1 dimension added.
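In other words, the second unsqueeze runs only on the scalar path, so the two calls cannot be folded into one unconditional multi-axis unsqueeze. A numpy stand-in for the conditional logic:

```python
import numpy as np

def unsqueeze_zp(zp):
    """Mirror the parser logic above: always add a leading axis, then add a
    second one only when the input zero point was scalar."""
    out = np.expand_dims(zp, axis=0)
    if np.ndim(zp) == 0:
        out = np.expand_dims(out, axis=0)
    return out

assert unsqueeze_zp(np.int8(3)).shape == (1, 1)                 # scalar path
assert unsqueeze_zp(np.array([1, 2], np.int8)).shape == (1, 2)  # vector path
```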
Added
Modified input checks to ensure we check the last dimension correctly for multi input in matmulintegertofloat. Validated output using numpy for this.
Force-pushed from 6b3d516 to 6e2a36c
Check results before merge 🔆
🔴bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
Changes to MatMul parser to handle the Microsoft Contrib operator MatMulIntegerToFloat
Since we have the scale and zero points in our operands, we can just perform a multiply after the int8 biases are added, and then insert a regular dot on the scaled input values, which should give the same output as the input data types.
Able to leverage the existing set of tests for matmul
Needs #3526 as there's a bug with dequantizelinear this has uncovered
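The dequantize-then-dot equivalence the summary relies on (and which the author validated with numpy) can be sketched as follows; scales, zero points, and shapes here are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

a = rng.integers(-128, 127, size=(2, 4)).astype(np.int8)
b = rng.integers(-128, 127, size=(4, 3)).astype(np.int8)
scale_a, zp_a = np.float32(0.1), np.int8(3)
scale_b, zp_b = np.float32(0.2), np.int8(-5)

# dequantize each operand, then take a regular float dot
deq_a = (a.astype(np.float32) - np.float32(zp_a)) * scale_a
deq_b = (b.astype(np.float32) - np.float32(zp_b)) * scale_b
out = deq_a @ deq_b

# reference: integer matmul on zero-point-shifted values, scaled afterwards
ref = ((a.astype(np.int32) - zp_a) @ (b.astype(np.int32) - zp_b)).astype(np.float32)
ref *= scale_a * scale_b

assert np.allclose(out, ref, atol=1e-2)
```

Because the scales factor out of the contraction, dequantizing first and dotting in float matches the integer dot scaled afterwards, up to float rounding.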