[SDPA][Nested Tensor] Bump grad_query fudge factor for small GPUs (…
#343
Job | Run time |
---|---|
1s | |
1s |