iq1_s: Metal works, but quite slow

As usual, Apple Silicon does not like the code I write.
2026-05-13 12:34:05 +00:00 · 2024-02-13 14:37:16 +02:00
parent 020b548ec3
commit 425c6bbb6c
1 changed file with 1 addition and 1 deletion
--- a/ggml-metal.metal
+++ b/ggml-metal.metal
@@ -4399,7 +4399,7 @@ void kernel_mul_mv_iq1_s_f32_impl(
    for (int row = 0; row < N_DST; ++row) {
        all_sum = simd_sum(sumf[row]);
        if (tiisg == 0) {
-            dst[r1*ne0 + im*ne0*ne1 + first_row + row] = all_sum * 0.5f;
+            dst[r1*ne0 + im*ne0*ne1 + first_row + row] = all_sum;
        }
    }
 }
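
For readers who do not work with Metal, the write-back loop touched by this hunk can be approximated on the CPU. The following is only a sketch under stated assumptions, not the kernel itself: `kSimdWidth = 32`, `kNDst = 4`, `simd_sum_emulated`, and `write_rows` are hypothetical names and values introduced for illustration, and the per-lane partial sums are made up. It shows the pattern the diff relies on: each lane holds a partial sum per output row, `simd_sum` adds those across the SIMD group, and a single lane stores the total at `r1*ne0 + im*ne0*ne1 + first_row + row`. After this commit the total is stored as-is, without the `0.5f` factor.

```cpp
#include <array>
#include <cassert>
#include <numeric>
#include <vector>

constexpr int kSimdWidth = 32;  // assumed SIMD-group width (hypothetical)
constexpr int kNDst = 4;        // hypothetical stand-in for N_DST

// CPU stand-in for Metal's simd_sum: adds one value from every lane.
float simd_sum_emulated(const std::array<float, kSimdWidth>& lanes) {
    return std::accumulate(lanes.begin(), lanes.end(), 0.0f);
}

// Emulates the write-back loop from the hunk. sumf[row][lane] is the
// partial sum that a given lane accumulated for a given output row.
void write_rows(std::vector<float>& dst,
                const std::array<std::array<float, kSimdWidth>, kNDst>& sumf,
                int r1, int im, int ne0, int ne1, int first_row) {
    for (int row = 0; row < kNDst; ++row) {
        const float all_sum = simd_sum_emulated(sumf[row]);
        // On the GPU only lane 0 (tiisg == 0) performs this store.
        // Post-commit behaviour: no 0.5f scaling of the reduced sum.
        dst[r1 * ne0 + im * ne0 * ne1 + first_row + row] = all_sum;
    }
}
```

Calling `write_rows` with `r1 = 1`, `im = 0`, `ne0 = 16`, `ne1 = 2`, and `first_row = 4` fills `dst[20]` through `dst[23]`, one element per row. The diff does not show why the `0.5f` factor was dropped, so the sketch makes no claim about where that scale is applied instead.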