RotaryEmbedding Contrib OP #3695

TedThemistokleous · 2024-12-09T15:06:09Z

Add the RotaryEmbedding contrib op, which is a Microsoft contrib operator.

This reuses the GPU kernel we already have in GroupQueryAttention and adds a new parser to handle the op correctly.

codecov · 2024-12-09T15:20:53Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.23%. Comparing base (4b15b6c) to head (fa58c0d).
Report is 4 commits behind head on develop.

Additional details and impacted files

@@           Coverage Diff            @@
##           develop    #3695   +/-   ##
========================================
  Coverage    92.23%   92.23%           
========================================
  Files          514      514           
  Lines        21746    21746           
========================================
  Hits         20057    20057           
  Misses        1689     1689


migraphx-bot · 2024-12-09T17:27:46Z

Test	Batch	Rate new fa58c0	Rate old a30b25	Diff	Compare
torchvision-resnet50	64	3,254.53	3,254.31	0.01%	✅
torchvision-resnet50_fp16	64	6,994.00	6,986.93	0.10%	✅
torchvision-densenet121	32	2,428.14	2,432.26	-0.17%	✅
torchvision-densenet121_fp16	32	4,097.54	4,102.15	-0.11%	✅
torchvision-inceptionv3	32	1,626.88	1,627.22	-0.02%	✅
torchvision-inceptionv3_fp16	32	2,743.63	2,744.59	-0.04%	✅
cadene-inceptionv4	16	765.01	764.44	0.07%	✅
cadene-resnext64x4	16	813.74	812.75	0.12%	✅
slim-mobilenet	64	7,389.98	7,463.39	-0.98%	✅
slim-nasnetalarge	64	208.96	208.94	0.01%	✅
slim-resnet50v2	64	3,439.86	3,440.19	-0.01%	✅
bert-mrpc-onnx	8	1,148.98	1,143.08	0.52%	✅
bert-mrpc-tf	1	474.15	469.87	0.91%	✅
pytorch-examples-wlang-gru	1	426.45	513.89	-17.02%	🔴
pytorch-examples-wlang-lstm	1	482.80	386.92	24.78%	🔆
torchvision-resnet50_1	1	775.57	776.87	-0.17%	✅
cadene-dpn92_1	1	400.25	395.29	1.25%	✅
cadene-resnext101_1	1	382.89	373.27	2.58%	✅
onnx-taau-downsample	1	345.83	345.34	0.14%	✅
dlrm-criteoterabyte	1	33.33	33.31	0.06%	✅
dlrm-criteoterabyte_fp16	1	52.75	52.72	0.05%	✅
agentmodel	1	8,179.42	8,185.78	-0.08%	✅
unet_fp16	2	58.90	58.75	0.26%	✅
resnet50v1_fp16	1	932.39	945.03	-1.34%	✅
resnet50v1_int8	1	1,005.18	987.38	1.80%	✅
bert_base_cased_fp16	64	1,170.83	1,169.56	0.11%	✅
bert_large_uncased_fp16	32	363.43	363.05	0.11%	✅
bert_large_fp16	1	198.42	198.54	-0.06%	✅
distilgpt2_fp16	16	2,200.56	2,199.18	0.06%	✅
yolov5s	1	535.04	534.87	0.03%	✅
tinyllama	1	43.63	43.39	0.55%	✅
vicuna-fastchat	1	172.42	175.67	-1.85%	✅
whisper-tiny-encoder	1	418.17	417.72	0.11%	✅
whisper-tiny-decoder	1	425.03	428.36	-0.78%	✅

This build is not recommended to merge 🔴

migraphx-bot · 2024-12-09T17:27:47Z

✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance

✅ bert-mrpc-tf: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance

✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance

✅ cadene-dpn92_1: PASSED: MIGraphX meets tolerance

✅ cadene-resnext101_1: PASSED: MIGraphX meets tolerance

✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance

✅ agentmodel: PASSED: MIGraphX meets tolerance

✅ unet: PASSED: MIGraphX meets tolerance

✅ resnet50v1: PASSED: MIGraphX meets tolerance

✅ bert_base_cased_fp16: PASSED: MIGraphX meets tolerance

🔴 bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output

✅ bert_large: PASSED: MIGraphX meets tolerance

✅ yolov5s: PASSED: MIGraphX meets tolerance

✅ tinyllama: PASSED: MIGraphX meets tolerance

✅ vicuna-fastchat: PASSED: MIGraphX meets tolerance

✅ whisper-tiny-encoder: PASSED: MIGraphX meets tolerance

✅ whisper-tiny-decoder: PASSED: MIGraphX meets tolerance

✅ distilgpt2_fp16: PASSED: MIGraphX meets tolerance

pfultz2 · 2024-12-09T18:49:17Z

We really should remove this GPU kernel. It looks like this can be implemented with the operators we already have.
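
To illustrate the point, here is a minimal NumPy sketch of a rotary embedding written only with slice, multiply, add/subtract, and concat steps. It is not MIGraphX code and does not use any MIGraphX API. The function names `rotary_embedding` and `make_cache`, the non-interleaved (rotate-half) layout, and the `base=10000.0` frequency schedule are assumptions made for illustration; the actual contrib op also supports other options that are not modeled here.

```python
import numpy as np

def make_cache(seq_len, head_dim, base=10000.0):
    # Hypothetical helper: builds cos/sin tables of shape (seq_len, head_dim // 2).
    inv_freq = base ** (-np.arange(0, head_dim, 2) / head_dim)
    angles = np.outer(np.arange(seq_len), inv_freq)
    return np.cos(angles), np.sin(angles)

def rotary_embedding(x, cos, sin):
    # x: (..., seq_len, head_dim); cos/sin: (seq_len, head_dim // 2).
    # Assumes the non-interleaved layout: first half and second half form the pairs.
    half = x.shape[-1] // 2
    x1, x2 = x[..., :half], x[..., half:]         # slice
    out1 = x1 * cos - x2 * sin                    # mul, sub
    out2 = x2 * cos + x1 * sin                    # mul, add
    return np.concatenate([out1, out2], axis=-1)  # concat

cos, sin = make_cache(seq_len=5, head_dim=8)
x = np.ones((2, 3, 5, 8))
y = rotary_embedding(x, cos, sin)  # same shape as x
```

Each step maps onto a common elementwise or shape operator, which is the kind of decomposition a parser could emit instead of calling a dedicated kernel.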

TedThemistokleous · 2024-12-10T15:49:53Z

So don't reuse what we've done here?

initial changes to lowering to reuse rotary embedding kernel for op

fa58c0d

TedThemistokleous added the roadmap (Tasks to finish for a release) and Onnx Operators (Adding or modifying an Onnx Operator in the MIGraphX codebase) labels Dec 9, 2024

TedThemistokleous self-assigned this Dec 9, 2024

TedThemistokleous changed the title ~~initial changes to lowering to reuse rotary embedding kernel for op~~ RotaryEmbedding Contrib OP Dec 9, 2024
