Mirror of https://github.com/ggml-org/llama.cpp.git (synced 2026-05-10 11:04:06 +00:00)
When computing sinks, the cm1 shader was looping r from 0 to Br rather than from 0 to rows_per_thread. I must have copied this loop from the scalar path (where Br is the correct bound), and somehow it wasn't causing failures on current drivers.
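
A minimal C++ sketch of why the loop bound matters; it is not the shader code itself. It assumes the per-thread state in the cm1 path is an array of rows_per_thread entries, which the message above does not state explicitly. The tile sizes and the helper out_of_range_iterations are hypothetical, chosen only for illustration.

```cpp
#include <cstdint>

// Hypothetical sizes for illustration only; the real shader sets these elsewhere.
const uint32_t Br = 16;              // rows in the tile
const uint32_t rows_per_thread = 4;  // rows owned by one thread (assumed smaller than Br)

// Hypothetical helper: counts the loop iterations whose index r would fall
// outside a per-thread array that holds only rows_per_thread entries.
inline uint32_t out_of_range_iterations(uint32_t loop_bound) {
    uint32_t count = 0;
    for (uint32_t r = 0; r < loop_bound; ++r) {
        if (r >= rows_per_thread) {
            ++count;  // in a shader this would be an out-of-bounds array access
        }
    }
    return count;
}
```

With these assumed sizes, looping to Br gives 12 out-of-range iterations, while looping to rows_per_thread gives none. Out-of-bounds accesses in a shader have undefined results, so they can go unnoticed on some drivers, which may be why the bug did not show up as test failures.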