-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[GPU/OpenCL] Moving Addition kernel to Tensor Directory @open sesame 07/04 09:01 #2666
[GPU/OpenCL] Moving Addition kernel to Tensor Directory @open sesame 07/04 09:01 #2666
Conversation
Moved addition_cl kernel to Tensor directory. Refactored addition_cl for generalization. Signed-off-by: Yash Singh <[email protected]>
📝 TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2666. Please a submit 1commit/1PR (one commit per one PR) policy to get comments quickly from reviewers. Your PR must pass all verificiation processes of cibot before starting a review process from reviewers. If you are new member to join this project, please read manuals in documentation folder and wiki page. In order to monitor a progress status of your PR in more detail, visit http://ci.nnstreamer.ai/. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
cibot: @yashSingh0723, A builder checker could not be completed because one of the checkers is not completed. In order to find out a reason, please go to http://ci.nnstreamer.ai/nntrainer/ci/repo-workers/pr-checker/2666-202407031747250.62445497512817-d5e87f7d875fa2a04c296df140a1efb767a70e14/. |
* @param[in] result Tensor | ||
* @param[in] RunLayerContext reference | ||
*/ | ||
void add_i_cl(Tensor const &input, Tensor &result, RunLayerContext &context); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not a big deal but how about renaming it as add_cl()
for clarity since Tensor uses _i
as in-place operations. (which in this case takes an output tensor)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+) What about naming it as addCl
to make it consistent with other kernel operations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I was following the cpu naming convention, for cpu implementation it is named as add_i
. And for the GPU kernel as well I am doing the in-place operation. So inorder to add input[0]
and input[1]
, I am storing the result as input[0] += input[1]
, not creating any other output tensor additionally and treating input[0]
as output tensor. Followed the CPU implementation only.
But still if I need to change the name to addCL
, Please tell and I'll do it accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yashSingh0723, 💯 All CI checkers are successfully verified. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Moved
addition_cl
kernel to Tensor directory and refactored it for generalization.Naming convention changed for
fp16
inaddition_layer
unittests.