-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Blas] copy functionality for signed int8 data type #2834
base: main
Are you sure you want to change the base?
Conversation
e6497ac
to
84f9140
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, LGTM
/** | ||
* @brief copy function with neon: Y = X | ||
* @param[in] N number of elements in X | ||
* @param[in] X uint8_t * for Vector X | ||
* @param[in] Y uint8_t * for Vector Y | ||
*/ | ||
void copy_int8_or_int4(const unsigned int N, const uint8_t *X, uint8_t *Y); | ||
|
||
/** | ||
* @brief copy function with neon: Y = X | ||
* @param[in] N number of elements in X | ||
* @param[in] X int8_t * for Vector X | ||
* @param[in] Y int8_t * for Vector Y | ||
*/ | ||
void copy_int8(const unsigned int N, const int8_t *X, int8_t *Y); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't aware of signed/unsigned int8 case when implementing copy_int8_or_int4..
How about using copy_s8 / copy_u8 for better understanding?
Or we can specify copy_int8_or_int4
like copy_uint8_or_int4
to discriminate them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copy_s8 and copy_u8 make more sense :) Let's create a new PR to rename the functions!
* @param[in] Y int8_t * for Vector Y | ||
*/ | ||
void scopy(const unsigned int N, const int8_t *X, const int incX, int8_t *Y, | ||
const int intY); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const int intY); | |
const int incY); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see there are several misspelled "intY/incY"s..
My apologies :(
nntrainer/tensor/blas_interface.cpp
Outdated
nntrainer::neon::copy_int8_to_fp16(N, X, Y); | ||
} else { | ||
throw std::invalid_argument( | ||
"Error: incX == 1 && incY == 1 is supported only"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not
for (unsigned int idx = 0; idx < N; idx++) {
Y[idx] = X[idx];
}
(with performance warning...) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe incX and incY indicate x increment and y increment. Would having an increment greater than 1 for a copy be valid? @skykongkong8
for (unsigned int idx = 0; idx < N; idx++) {
Y[idx * incY] = X[idx * incX];
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the suggested code is not valid, you need to take care of LINE 296-298.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like copying floating points considers incX and incY.
nntrainer/nntrainer/tensor/blas_interface.cpp
Lines 662 to 669 in 18635c2
static void __scopy_fallback(const unsigned int N, const float *X, | |
const int incX, float *Y, const int incY) { | |
unsigned int incy = abs(incY); | |
unsigned int incx = abs(incX); | |
for (unsigned int i = 0; i < N; ++i) | |
Y[i * incy] = X[i * incx]; | |
} |
84f9140
to
accfd36
Compare
This pull request aims at adding the functionality to copy the int8 data type into other types such as int8, fp16, and fp32. Please note that this implementation follows the intrinsic used for copying uint8 values. By including this feature, we can expect more flexibility in handling different data types which will contribute to overall system performance improvement. **Self-evaluation:** 1. Build test: [X]Passed [ ]Failed [ ]Skipped 2. Run test: [X]Passed [ ]Failed [ ]Skipped Signed-off-by: Donghyeon Jeong <[email protected]>
accfd36
to
ccf0906
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This pull request aims at adding the functionality to copy the int8 data type into other types such as int8, fp16, and fp32.
Please note that this implementation follows the intrinsic used for copying uint8 values.
By including this feature, we can expect more flexibility in handling different data types which will contribute to overall system performance improvement.
Self-evaluation: