Parser changes to handle MatMulIntegerToFloat #3445

Status: Open. Wants to merge 26 commits into base: develop.
Conversation

@TedThemistokleous (Collaborator) commented Sep 16, 2024

Changes to the MatMul parser to handle the Microsoft Contrib operator MatMulIntegerToFloat.

Since we have the scale and zero points in our operands, we can perform a multiply after the int8 biases are added, and then insert a regular dot on the scaled input values, which should give the same output as the original input data types.
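For illustration, here is a minimal numpy sketch of that decomposition (a hedged reference, not the parser code; the function name and argument layout are assumptions based on the contrib op's inputs of two quantized matrices plus scales, optional zero points, and an optional bias):

```python
import numpy as np

def matmul_integer_to_float_ref(a, b, a_scale, b_scale,
                                a_zp=0, b_zp=0, bias=None):
    # Dequantize each int8/uint8 input: (x - zero_point) * scale
    a_deq = (a.astype(np.int32) - a_zp) * a_scale
    b_deq = (b.astype(np.int32) - b_zp) * b_scale
    # A regular dot on the dequantized values matches the fused op's output
    out = a_deq @ b_deq
    if bias is not None:
        out = out + bias  # bias broadcasts along the output columns
    return out.astype(np.float32)
```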

Able to leverage the existing set of tests for matmul

Needs #3526, as this PR has uncovered a bug with dequantizelinear.

@TedThemistokleous self-assigned this Sep 16, 2024
@TedThemistokleous (Collaborator, Author) commented Sep 16, 2024

TODO:

  • Add parser tests for error cases
  • Add parser tests for base case
  • Add parser test for bias and zero point cases
  • Add verify tests for all of the above

@TedThemistokleous added labels: onnxruntime (PR changes interaction between MIGraphX and Onnxruntime), Onnx Operators (Adding or modifying an Onnx Operator in the MIGraphX codebase), UAI. Sep 16, 2024
codecov bot commented Sep 16, 2024

Codecov Report

Attention: Patch coverage is 87.50000% with 11 lines in your changes missing coverage. Please review.

Project coverage is 92.21%. Comparing base (2e59073) to head (13063df).

Files with missing lines: src/onnx/parse_matmul.cpp (patch coverage 87.50%, 11 lines missing ⚠️)
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #3445      +/-   ##
===========================================
- Coverage    92.23%   92.21%   -0.02%     
===========================================
  Files          514      514              
  Lines        21746    21810      +64     
===========================================
+ Hits         20057    20113      +56     
- Misses        1689     1697       +8     


Updated parser to handle bias case as well as bad scale conditions

Initial float/half tests
bad scale tests
bad bias tests
avoid tidy screaming about complexity
@TedThemistokleous force-pushed the add_matmulintegertofloat_contrib_op branch from 74f8ae0 to cdb307d (September 17, 2024 15:48)
TedThemistokleous and others added 2 commits October 11, 2024 17:45
Use dequantizelinear, which eliminates the need to add in shifts due to int8/uint8 mismatches.

still needs parser tests
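For context on that commit: dequantizelinear computes (x - zero_point) * scale in a wider type, so int8 and uint8 inputs take the same path and no manual shift between the two types is needed. A minimal numpy illustration (a sketch, not the MIGraphX implementation):

```python
import numpy as np

def dequantize_linear(x, scale, zero_point):
    # Widen before subtracting so int8 and uint8 behave identically;
    # no manual +/-128 shift between the two types is required.
    return (x.astype(np.int32) - np.int32(zero_point)) * scale

i8 = np.array([-128, 0, 127], dtype=np.int8)
u8 = np.array([0, 128, 255], dtype=np.uint8)
# Same real values once each is dequantized with its own zero point:
print(dequantize_linear(i8, 0.5, 0))    # [-64.  0.  63.5]
print(dequantize_linear(u8, 0.5, 128))  # [-64.  0.  63.5]
```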
@TedThemistokleous marked this pull request as ready for review October 11, 2024 23:26
src/onnx/parse_matmul.cpp:
MIGRAPHX_THROW("PARSE_QUANT_DOT_SCALED: Bias must have same dim as matrix B column");
}

has_valid_scale_bias = true;
Contributor:
As against invalid? ;-)

Collaborator (Author):

If the scale bias doesn't exist, then no bias is added at the end of the MatMulIntegerToFloat.

Contributor:

I was simply wondering whether has_scale_bias is what was intended. :-)

src/onnx/parse_matmul.cpp:
return dequantized_op;
}

static instruction_ref handle_scaled_output(const onnx_parser::node_info& info,
Contributor:

Too many parameters. Ideally they should be handled by a struct parameter.

Collaborator (Author):

They're the same number of parameters gathered by the operator. These are all needed for the dequantize steps and for adding the proper unsqueeze->transpose paths. Order matters here with respect to matrix input A or B.

Use the parsed-in op name in error messages to aid logging should parser errors occur.
Change naming to be agnostic of the input index.
@TedThemistokleous force-pushed the add_matmulintegertofloat_contrib_op branch from 42b787d to 9660e11 (October 31, 2024 22:00)
src/onnx/parse_matmul.cpp:
bool a1_has_no_zp = (a1 == zp_a1);

auto unsq_scale_a0 = info.add_instruction(make_op("unsqueeze", {{"axes", {-1}}}), scale_a0);
if(not a0_has_no_zp)
Contributor:

(Nit) Style: perhaps two negatives are not required if there is a variable like a0_has_zp.

Ted Themistokleous added 2 commits November 7, 2024 14:03
Clean up uint8 handling for quant_dot. Fix tests
@lakhinderwalia (Contributor) left a comment:

Thank you for following up the comments. Approved.

@CharlieL7 (Collaborator) left a comment:

From our conversation, need to test/handle higher dimensional matrix contractions (matrix mul). Also transpose with permutation = {0, 1} probably does nothing.

Remove the transpose ops here, as this was masking an issue related to input checks for the scale and bias inputs. The scale value is supposed to be matched against the input column value instead of the row. With that in mind, we can remove the transpose here.

Updated parser tests to handle this correctly. Retested and validated output, since the transpose changes the math here.
@TedThemistokleous (Collaborator, Author) commented:
From our conversation, need to test/handle higher dimensional matrix contractions (matrix mul). Also transpose with permutation = {0, 1} probably does nothing.

You're right, but I think when I tried it, this wasn't doing what I thought in conjunction with the squeeze at -1. Fixed this and realized I don't need the transpose here.
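(Indeed, on a 2-D tensor a transpose with permutation {0, 1} is the identity; in numpy terms:)

```python
import numpy as np

x = np.arange(6).reshape(2, 3)
# Permutation (0, 1) keeps both axes in place, so x is unchanged
assert (np.transpose(x, (0, 1)) == x).all()
```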

Also, I think this should handle N-dim now, since everything is broadcast on a per-column basis for the scale/bias inputs, which should be matched to the column of the input matrix.
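As a sanity check of that broadcasting claim, a small numpy sketch (shapes are illustrative assumptions): a per-column scale of shape (n,) broadcasts across any number of leading batch dimensions:

```python
import numpy as np

batch, m, k, n = 2, 3, 4, 5
a = np.random.randint(-128, 128, size=(batch, m, k), dtype=np.int8)
b = np.random.randint(-128, 128, size=(batch, k, n), dtype=np.int8)
b_scale = np.random.rand(n).astype(np.float32)  # one scale per column of B

deq_b = b.astype(np.float32) * b_scale  # (batch, k, n) * (n,) broadcasts over the last axis
out = a.astype(np.float32) @ deq_b      # -> (batch, m, n)
assert out.shape == (batch, m, n)
```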

@CharlieL7 (Collaborator) left a comment:

Minor: would be good to have a higher ndim test

Comment on lines +290 to +294
unsq_zp_a0 = info.add_instruction(make_op("unsqueeze", {{"axes", {0}}}), zp_a0);
if(zp_a0->get_shape().scalar())
{
unsq_zp_a0 =
info.add_instruction(make_op("unsqueeze", {{"axes", {0}}}), unsq_zp_a0);
Collaborator:
Minor: Can these unsqueeze operators be merged into a single unsqueeze, either {{axes, {0}}} or {{axes, {0, 1}}}?

Collaborator (Author):

I don't think so here, as one of them assumes the input is scalar and needs the additional 1 dimension added.
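A numpy stand-in for the shapes involved (hypothetical values): the first unsqueeze applies in all cases, and only a scalar zero point needs the second, conditional one:

```python
import numpy as np

zp_vec    = np.zeros((4,), dtype=np.int8)  # per-column zero point
zp_scalar = np.zeros((),   dtype=np.int8)  # scalar zero point

# The unconditional unsqueeze at axis 0 suffices for the 1-D case: (4,) -> (1, 4)
print(np.expand_dims(zp_vec, 0).shape)                        # (1, 4)
# Only the scalar needs the second unsqueeze: () -> (1,) -> (1, 1)
print(np.expand_dims(np.expand_dims(zp_scalar, 0), 0).shape)  # (1, 1)
```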

@TedThemistokleous (Collaborator, Author) commented:

Minor: would be good to have a higher ndim test

Added

Modified input checks to ensure we check the last dimension correctly for multi-input in MatMulIntegerToFloat.

Validated output using numpy for this.
@TedThemistokleous force-pushed the add_matmulintegertofloat_contrib_op branch from 6b3d516 to 6e2a36c (December 11, 2024 03:44)
@migraphx-bot (Collaborator) commented:
Test    Batch    Rate new (13063d)    Rate old (64fe0c)    Diff
torchvision-resnet50 64 3,254.74 3,254.94 -0.01%
torchvision-resnet50_fp16 64 6,970.71 6,977.96 -0.10%
torchvision-densenet121 32 2,435.00 2,436.55 -0.06%
torchvision-densenet121_fp16 32 4,077.51 4,076.26 0.03%
torchvision-inceptionv3 32 1,628.16 1,627.46 0.04%
torchvision-inceptionv3_fp16 32 2,742.57 2,741.58 0.04%
cadene-inceptionv4 16 765.25 764.31 0.12%
cadene-resnext64x4 16 813.34 813.14 0.02%
slim-mobilenet 64 7,466.12 7,466.83 -0.01%
slim-nasnetalarge 64 209.02 209.03 -0.00%
slim-resnet50v2 64 3,440.77 3,443.32 -0.07%
bert-mrpc-onnx 8 1,147.21 1,144.17 0.27%
bert-mrpc-tf 1 484.14 474.21 2.09%
pytorch-examples-wlang-gru 1 422.01 416.53 1.32%
pytorch-examples-wlang-lstm 1 387.79 384.23 0.93%
torchvision-resnet50_1 1 769.92 783.29 -1.71%
cadene-dpn92_1 1 399.09 398.94 0.04%
cadene-resnext101_1 1 382.07 383.46 -0.36%
onnx-taau-downsample 1 345.93 345.52 0.12%
dlrm-criteoterabyte 1 33.31 33.33 -0.06%
dlrm-criteoterabyte_fp16 1 52.74 52.73 0.02%
agentmodel 1 8,123.54 8,127.83 -0.05%
unet_fp16 2 58.78 58.89 -0.19%
resnet50v1_fp16 1 929.72 938.63 -0.95%
resnet50v1_int8 1 1,015.77 984.73 3.15% 🔆
bert_base_cased_fp16 64 1,169.67 1,170.23 -0.05%
bert_large_uncased_fp16 32 363.04 362.94 0.03%
bert_large_fp16 1 198.06 200.28 -1.11%
distilgpt2_fp16 16 2,200.80 2,198.50 0.10%
yolov5s 1 522.19 531.33 -1.72%
tinyllama 1 43.40 43.34 0.12%
vicuna-fastchat 1 181.22 172.03 5.34% 🔆
whisper-tiny-encoder 1 418.05 418.00 0.01%
whisper-tiny-decoder 1 428.62 428.83 -0.05%

Check results before merge 🔆

@migraphx-bot (Collaborator) commented:

✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance
✅ bert-mrpc-tf: PASSED: MIGraphX meets tolerance
✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance
✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance
✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance
✅ cadene-dpn92_1: PASSED: MIGraphX meets tolerance
✅ cadene-resnext101_1: PASSED: MIGraphX meets tolerance
✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance
✅ agentmodel: PASSED: MIGraphX meets tolerance
✅ unet: PASSED: MIGraphX meets tolerance
✅ resnet50v1: PASSED: MIGraphX meets tolerance
✅ bert_base_cased_fp16: PASSED: MIGraphX meets tolerance
🔴 bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
✅ bert_large: PASSED: MIGraphX meets tolerance
✅ yolov5s: PASSED: MIGraphX meets tolerance
✅ tinyllama: PASSED: MIGraphX meets tolerance
✅ vicuna-fastchat: PASSED: MIGraphX meets tolerance
✅ whisper-tiny-encoder: PASSED: MIGraphX meets tolerance
✅ whisper-tiny-decoder: PASSED: MIGraphX meets tolerance
✅ distilgpt2_fp16: PASSED: MIGraphX meets tolerance
