CUDA: fixed mmvq kernel for bs 2,3,4 and -sm row (#5386) #5
| Job | Run time |
|---|---|
| | 2m 29s |
| | 2m 28s |
| | 2m 15s |
| | 2m 28s |
| | 2m 28s |
| | 2m 28s |
| | 2m 35s |
| | 2m 29s |
| | 2m 10s |
| | 2m 36s |
| | 2m 29s |
| Total | 26m 55s |