RotaryEmbedding Contrib OP #3695
base: develop
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## develop #3695 +/- ##
========================================
Coverage 92.23% 92.23%
========================================
Files 514 514
Lines 21746 21746
========================================
Hits 20057 20057
Misses 1689 1689
This build is not recommended to merge 🔴
🔴 bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
We really should remove this GPU kernel. It looks like this can be implemented with the operators we already have.
So don't reuse what we've done here?
Add the Contrib OP for RotaryEmbedding, which is a Microsoft Contrib OP.
This reuses the GPU kernel we already have for GroupQueryAttention and adds a new parser to handle the op correctly.
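For context on the review comment that this can be built from existing operators: the sketch below shows the standard rotary embedding math in plain Python, using only slicing, elementwise multiply/add/subtract, and concatenation. It is not MIGraphX code and not the kernel from this PR. The names `rotary_embedding` and `cos_sin_cache`, the `base=10000.0` default, and the `interleaved` flag are assumptions for illustration, and the head size is assumed to be even.

```python
import math


def cos_sin_cache(position, head_size, base=10000.0):
    """Hypothetical helper: cos/sin values for one position (length head_size // 2)."""
    half = head_size // 2
    angles = [position * base ** (-2.0 * i / head_size) for i in range(half)]
    return [math.cos(a) for a in angles], [math.sin(a) for a in angles]


def rotary_embedding(x, cos, sin, interleaved=False):
    """Rotate one head vector x (even length d) with cos/sin of length d // 2.

    Only slice, mul, add, sub and concat style steps are used, which is the
    kind of decomposition the review comment suggests.
    """
    half = len(x) // 2
    if interleaved:
        # Pairs are (x[0], x[1]), (x[2], x[3]), ...
        x1, x2 = x[0::2], x[1::2]
    else:
        # Pairs are (x[i], x[i + d/2])
        x1, x2 = x[:half], x[half:]

    # 2-D rotation applied to each pair
    r1 = [a * c - b * s for a, b, c, s in zip(x1, x2, cos, sin)]
    r2 = [b * c + a * s for a, b, c, s in zip(x1, x2, cos, sin)]

    if interleaved:
        out = [0.0] * len(x)
        out[0::2] = r1
        out[1::2] = r2
        return out
    return r1 + r2


if __name__ == "__main__":
    cos, sin = cos_sin_cache(position=3, head_size=4)
    print(rotary_embedding([1.0, 2.0, 3.0, 4.0], cos, sin))
```

Because each pair only undergoes a rotation, the vector norm is unchanged, which makes a convenient sanity check when comparing a decomposed version against a fused kernel.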