-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize kernel launch using buffer protocol #281
Conversation
clpy/backend/function.pyx
Outdated
elif isinstance(a, bool): | ||
a = numpy.bool_(a) | ||
|
||
if core.numpy_scalar_type_set(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition, I fixed this bug?
(L103 is always True)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ybsh Is it okay?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@y1r It seems a bug and has not found because there aren't any test cases passing illegal type value.
@nsakabe-fixstars What do you suggest to do? Adding test cases? Upstream CuPy might include such test cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@y1r As @LWisteria say, would you add a test / tests to cover these branches here? When you do it, please leave a note (such as NOTE(y1r):
) which leads to this discussion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nsakabe-fixstars Adding tests (and also fixing bug) can be for another issue/PR? Is it should added by us or CuPy upstream?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have two options:
- revert bugfix in this PR, add a new PR for bugfix and test, merge it into clpy, merge it into Optimize kernel launch using buffer protocol #281, merge Optimize kernel launch using buffer protocol #281 into clpy
- add a new CuPy's PR for a test, merge it into CuPy, cherry-pick its change
I think option 1 is enough for us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no test in CuPy for this situation. Therefore, pulling upstream cannot resolve this issue.
As a summary, there are three options:
- revert bugfix in this PR, add a new PR for bugfix and test, merge it into clpy, merge it into Optimize kernel launch using buffer protocol #281, merge Optimize kernel launch using buffer protocol #281 into clpy
- add test in this PR
- add a new CuPy's PR for a test, merge it into CuPy, cherry-pick its change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nsakabe-fixstars Which one is better for us?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@y1r OK, I see your situation. Please take the option 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test (commit 9c5e338) passed on vega.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test (commit 9c5e338) passed on titanv.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test (commit 9c5e338) passed on vega.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test (commit 9c5e338) passed on titanv.
@nsakabe-fixstars レビューお願いします. |
@y1r バグ修正した箇所についてコメントがあるのでご確認ください。 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test (commit 80c149d) passed on vega.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test (commit 80c149d) passed on titanv.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test (commit 6cd51a2) passed on titanv.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test (commit 6cd51a2) passed on vega.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test (commit 4d61361) passed on titanv.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test (commit 4d61361) passed on vega.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
From the discussion on #153, I've tried optimizing code blocks 4 and 5 of the OpenCL Kernel launch.
I applied two optimizations:
Benchmark on train_mnist.py
Before applying above optimizations:
PR
For reference, CuPy version