[Blas] copy functionality for signed int8 data type #2834

djeong20 · 2024-12-18T04:32:31Z

This pull request adds functions that copy int8 data into int8, fp16, and fp32 buffers.
Please note that the implementation follows the intrinsics already used for copying uint8 values.
This gives more flexibility in handling different data types, which should help overall system performance.

Self-evaluation:

Build test: [X]Passed [ ]Failed [ ]Skipped
Run test: [X]Passed [ ]Failed [ ]Skipped

skykongkong8

Overall, LGTM

skykongkong8 · 2024-12-18T05:06:41Z

nntrainer/tensor/blas_neon.h

 /**
 * @brief     copy function with neon: Y = X
 * @param[in] N number of elements in X
 * @param[in] X uint8_t * for Vector X
 * @param[in] Y uint8_t * for Vector Y
 */
 void copy_int8_or_int4(const unsigned int N, const uint8_t *X, uint8_t *Y);
+
+/**
+ * @brief     copy function with neon: Y = X
+ * @param[in] N number of elements in X
+ * @param[in] X int8_t * for Vector X
+ * @param[in] Y int8_t * for Vector Y
+ */
+void copy_int8(const unsigned int N, const int8_t *X, int8_t *Y);


I wasn't aware of the signed/unsigned int8 distinction when implementing copy_int8_or_int4.
How about using copy_s8 / copy_u8 to make the difference clear?
Or we could rename copy_int8_or_int4 to something like copy_uint8_or_int4 to distinguish them.

copy_s8 and copy_u8 make more sense :) Let's create a new PR to rename the functions!
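
For context on why the s8/u8 distinction in the names matters: a same-width copy moves identical bytes either way, but once the values are widened to fp32 the source type decides the result. The following is a minimal scalar sketch of that point, not the NEON code from this PR; the function names (`copy_s8_ref`, `copy_s8_to_f32_ref`, `copy_u8_to_f32_ref`) are hypothetical and chosen only to follow the naming proposed above.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical scalar reference, NOT the NEON implementation in this PR.
// Same-width copy: Y = X, bytes are moved unchanged.
inline void copy_s8_ref(std::size_t N, const int8_t *X, int8_t *Y) {
  for (std::size_t i = 0; i < N; ++i)
    Y[i] = X[i];
}

// Widening copy from signed int8: the byte 0xFF is read as -1.
inline void copy_s8_to_f32_ref(std::size_t N, const int8_t *X, float *Y) {
  for (std::size_t i = 0; i < N; ++i)
    Y[i] = static_cast<float>(X[i]);
}

// Widening copy from unsigned int8: the byte 0xFF is read as 255.
inline void copy_u8_to_f32_ref(std::size_t N, const uint8_t *X, float *Y) {
  for (std::size_t i = 0; i < N; ++i)
    Y[i] = static_cast<float>(X[i]);
}
```

Passing the same bit pattern through the signed and unsigned variants gives -1.0f and 255.0f respectively, which is the kind of mix-up an ambiguous name like copy_int8_or_int4 could hide.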

skykongkong8 · 2024-12-18T05:07:44Z

nntrainer/tensor/blas_interface.h

+ * @param[in] Y int8_t * for Vector Y
+ */
+void scopy(const unsigned int N, const int8_t *X, const int incX, int8_t *Y,
+           const int intY);


Suggested change

-           const int intY);
+           const int incY);

I see that "incY" is misspelled as "intY" in several places.
My apologies :(

myungjoo · 2024-12-20T07:16:33Z

nntrainer/tensor/blas_interface.cpp

+    nntrainer::neon::copy_int8_to_fp16(N, X, Y);
+  } else {
+    throw std::invalid_argument(
+      "Error: incX == 1 && incY == 1 is supported only");


Why not

for (unsigned int idx = 0; idx < N; idx++) { Y[idx] = X[idx]; }

(with performance warning...) ?

I believe incX and incY indicate x increment and y increment. Would having an increment greater than 1 for a copy be valid? @skykongkong8

for (unsigned int idx = 0; idx < N; idx++) { Y[idx * incY] = X[idx * incX]; }

If the suggested code is not valid, you need to take care of LINE 296-298.

It looks like the floating-point copy does take incX and incY into account.

nntrainer/nntrainer/tensor/blas_interface.cpp

Lines 662 to 669 in 18635c2

static void __scopy_fallback(const unsigned int N, const float *X,
                             const int incX, float *Y, const int incY) {
  unsigned int incy = abs(incY);
  unsigned int incx = abs(incX);

  for (unsigned int i = 0; i < N; ++i)
    Y[i * incy] = X[i * incx];
}
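
To make the suggestion in this thread concrete, here is a rough sketch of how an int8 copy could fall back to a strided loop instead of throwing when incX or incY is not 1. It mirrors the float `__scopy_fallback` quoted above. The names `scopy_s8_fallback_sketch` and `scopy_s8_sketch` are hypothetical, and `std::copy_n` only stands in for the NEON kernel on the contiguous path; none of this is the code merged in the PR.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdlib>

// Hypothetical strided fallback for int8_t, modeled on the float
// __scopy_fallback: element i goes from X[i * |incX|] to Y[i * |incY|].
inline void scopy_s8_fallback_sketch(unsigned int N, const int8_t *X,
                                     int incX, int8_t *Y, int incY) {
  const unsigned int incx = static_cast<unsigned int>(std::abs(incX));
  const unsigned int incy = static_cast<unsigned int>(std::abs(incY));

  for (unsigned int i = 0; i < N; ++i)
    Y[i * incy] = X[i * incx];
}

// Hypothetical dispatcher: unit strides take the fast contiguous path
// (std::copy_n here, assumed to be a SIMD kernel in practice); any other
// stride takes the slower scalar loop instead of throwing.
inline void scopy_s8_sketch(unsigned int N, const int8_t *X, int incX,
                            int8_t *Y, int incY) {
  if (incX == 1 && incY == 1) {
    std::copy_n(X, N, Y);
  } else {
    scopy_s8_fallback_sketch(N, X, incX, Y, incY);
  }
}
```

Note that taking the absolute value, as the float fallback does, treats a negative increment the same as a positive one. Reference BLAS walks the vector backwards for negative increments, so this sketch only matches it for non-negative strides.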

Signed-off-by: Donghyeon Jeong <[email protected]>

EunjuYang

LGTM

djeong20 requested review from myungjoo, jijoongmoon, again4you, jaeyun-jung, leemgs, wooksong, gichan-jang, anyj0527, lhs8928, songgot, jihochu, DonghakPark, SeoHyungjun, baek2sm, skykongkong8 and EunjuYang as code owners December 18, 2024 04:32

github-actions bot added the Need Review label Dec 18, 2024

djeong20 force-pushed the backend/cpu/copy/int8 branch 2 times, most recently from e6497ac to 84f9140 Compare December 18, 2024 04:43

skykongkong8 approved these changes Dec 18, 2024


myungjoo reviewed Dec 20, 2024


djeong20 force-pushed the backend/cpu/copy/int8 branch from 84f9140 to accfd36 Compare December 23, 2024 08:13

djeong20 force-pushed the backend/cpu/copy/int8 branch from accfd36 to ccf0906 Compare December 23, 2024 08:41

EunjuYang approved these changes Dec 27, 2024




