Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removing cl_khr_fp16 #264

Closed
LWisteria opened this issue Feb 4, 2020 · 5 comments
Closed

Removing cl_khr_fp16 #264

LWisteria opened this issue Feb 4, 2020 · 5 comments
Milestone

Comments

@LWisteria
Copy link
Member

As of #224 and #261, the current solution for "half" needs cl_khr_fp16.

However, it must not need it theoretically and could be removed.

@LWisteria
Copy link
Member Author

It was inserted on ac3adc9 to solve #131 .

@vorj @nsakabe-fixstars Can you remember and explain the reason?
I remember the reason is just to avoid the problem that sqrt of int8 and int16 is stored as the half format.
It is not enough to prove its necessity because only storing half format uses no half operation which needs to set cl_khr_fp16.

@vorj
Copy link

vorj commented Feb 4, 2020

int8 and int16

No, problematic types are int8 , uint8 , and bool . Some functions have only floating type(s) for input/output , then the types are converted to half .
int16 is converted to float , so it has no problem.
The problematic functions list has been shown as test cases that are failed when we disable cl_khr_fp16 , by @nsakabe-fixstars.

It is not enough to prove its necessity because only storing half format uses no half operation which needs to set cl_khr_fp16.

I really think so, but NVIDIA doesn't .

@LWisteria
Copy link
Member Author

LWisteria commented Feb 4, 2020

@vorj Thanks.

Your previous comment

In OpenCL C on NVIDIA GPU, enabling cl_khr_fp16 is required to copy a half value.

must be checked on the current titanv (CUDA 10).

Furthermore, to solve the problem with only the copy operation, for example, we can treat fp16 as int16. I remember there were some reasons to disturb such solutions but I don't remember what they actually were.

Anyway, I don't think that investigation for the solution to avoid it is perfect. The priority to pass the tests was higher than enough investigation at that time.
Now it's time to try it...!

@vorj
Copy link

vorj commented Feb 4, 2020

Your previous comment

In OpenCL C on NVIDIA GPU, enabling cl_khr_fp16 is required to copy a half value.

must be checked on the current titanv (CUDA 10).

Please see the secondary machine error log for #260 .
I've shown the extracts of the errors below:

warning: cannot enable cl_khr_fp_16 extension on this platform - ignoring.
error: declaring function return value of type 'half' is not allowed; did you forget * ?
error: loading directly from pointer to type 'const __attribute__((address_space(16776963))) half' is not allowed

It seems that NVIDIA doesn't allowed to copy half value yet (cl_khr_fp16 enable is just ignored). Originally we use cl_khr_fp16 only to enable copying and we've never added half arithmetic operations, so it's obvious. Of course, we can pass the test cases without enabling cl_khr_fp16 on the primary machine.

@vorj
Copy link

vorj commented Mar 2, 2020

#269 fixed this issue, so I close it.

@vorj vorj closed this as completed Mar 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants