-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[gpu/blas] sgemv_noTrans kernel and unit tests @open sesame 10/29 21:54 #2770
Conversation
📝 TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2770. Please a submit 1commit/1PR (one commit per one PR) policy to get comments quickly from reviewers. Your PR must pass all verificiation processes of cibot before starting a review process from reviewers. If you are new member to join this project, please read manuals in documentation folder and wiki page. In order to monitor a progress status of your PR in more detail, visit http://ci.nnstreamer.ai/. |
nntrainer::TensorDim::TensorType t_type_nchw_fp16 = { | ||
nntrainer::Tformat::NCHW, nntrainer::Tdatatype::FP16}; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove? (this is fp32 TC so it is not needed)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I wonder why there is no fp16 TCs as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed unused variables. Also we are focussing on accuracy as of now. Will add fp16 tests with later PRs.
Added kernel for sgemv_noTrans. Modified the conditions of blas_kernel_interface. Signed-off-by: Debadri Samaddar <[email protected]>
2381a17
to
3a19748
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@s-debadri, 💯 All CI checkers are successfully verified. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, LGTM :)
Please check the comments!
trans ? sgemv_cl(data, mdata, rdata, trans, dim2, dim1, lda) | ||
: sgemv_cl(data, mdata, rdata, trans, dim1, dim2, lda); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One suggestion!
Since trans
is now added as a parameter, handling dim1
and dim2
in the sgemv_cl
function would be possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trans ? sgemv_cl(data, mdata, rdata, trans, dim2, dim1, lda) | |
: sgemv_cl(data, mdata, rdata, trans, dim1, dim2, lda); | |
sgemv_cl(data, mdata, rdata, trans, dim2, dim1, lda) |
and
// sgemv_cl
size_t dim1_size = transA ? sizeof(float) * dim2 : sizeof(float) * dim1;
size_t dim2_size = transA ? sizeof(float) * dim1 : sizeof(float) * dim2;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The calls for N == 1
and M == 1
are a bit different. Also the work group size depends on the dim1
and dim2
. I think handling it before the function call would be better so that we can have a generalized function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, it is good.
Thank you for considering test cases as well as FP32 and FP16 cases in your work. 👍
@@ -32,13 +32,58 @@ static void setUpGpuContext() { | |||
ac.initBlasClKernels(); | |||
} | |||
|
|||
TEST(blas_kernels, dotCL_sgemv) { | |||
TEST(blas_kernels, dotCL_sgemv_M_1_1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more Question)
I brought this PR to my local environment for testing.
It seems that even when setting the enable-opencl
option to true, cannot reach the corresponding unit test.
Is it correct that the current test file is not included in any of the meson.build files within the unittest folder?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And when add
if get_option('enable-opencl')
test_target += [['unittest_blas_kernels_cl', []]]
endif
this to unittest/meson.build
i got this build error
/home/donghak/Desktop/workspace/nntrainer/nntrainer/opencl/opencl_program.cpp:73:10: error: variable length array ‘binaries_size’ is used [-Werror=vla]
73 | size_t binaries_size[num_devices];
| ^~~~~~~~~~~~~
/home/donghak/Desktop/workspace/nntrainer/nntrainer/opencl/opencl_program.cpp:98:8: error: variable length array ‘kernel_names’ is used [-Werror=vla]
98 | char kernel_names[kernel_names_size];
| ^~~~~~~~~~~~
/home/donghak/Desktop/workspace/nntrainer/nntrainer/opencl/opencl_program.cpp:113:18: error: variable length array ‘binaries_ptr’ is used [-Werror=vla]
113 | unsigned char *binaries_ptr[num_devices];
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This error should be fixed to enable x86 OpenCL compilation (for tizen build as well). This error may not be detected in Android build. One can face this error, for compilation with PC environment doesn't support dynamic allocation like that. I will make an issue for it :
#2776
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unittest_blas_kernels_cl
was written for Android tests. So, it's only present in the Android.mk file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DonghakPark There was an error in my previous comment. The PR #2727 resolves this issue! But it needs to be rebased. I will update the issue as well. Sorry for confusion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand. Thank you for your explanations! 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@s-debadri, 💯 All CI checkers are successfully verified. Thanks.
Changes in this PR:
sgemv_noTrans
kernel.trans
information was added to the required functions.sgemv
andsgemv_noTrans
.Self evaluation:
Signed-off-by: Debadri Samaddar [email protected]