Releases: ggerganov/llama.cpp
b4067
vulkan: Throttle the number of shader compiles during the build step.…
b4066
metal : more precise Q*K in FA vec kernel (#10247)
b4065
server : enable KV cache defrag by default (#10233) ggml-ci
b4062
vulkan: Fix newly added tests for permuted mul_mat and 1D im2col (#10…
b4061
metal : reorder write loop in mul mat kernel + style (#10231)
* metal : reorder write loop
* metal : int -> short, style
ggml-ci
b4060
metal : fix build and some more comments (#10229)
b4059
metal : fix F32 accumulation in FA vec kernel (#10232)
b4058
llama : fix Qwen model type strings
b4057
metal : hide debug messages from normal log
b4056
ggml: fix zero division in ‘dne’ calculation in CUDA COUNT_EQUAL oper…