Default Branch

d2ecd2d1cf · common/parser: add --skip-chat-parsing to force a pure content parser. (#20289) · Updated 2026-03-17 15:16:43 +00:00

Branches

e6dbc81569 · metal : cap threadgroups size of set_rows · Updated 2025-11-10 14:17:09 +00:00 · sdgoij · 1387 behind, 1 ahead

3ad533689c · ggml : remove KQ mask padding · Updated 2025-11-10 12:35:25 +00:00 · sdgoij · 1389 behind, 1 ahead

2ef41855cf · convert : for FP8, use scale type to decide auto type · Updated 2025-11-07 03:55:53 +00:00 · sdgoij · 1427 behind, 16 ahead

e996f3aef8 · convert : fix no-lazy dtypes from direct safetensors · Updated 2025-11-07 03:33:09 +00:00 · sdgoij · 1427 behind, 3 ahead

128118fdbe · convert : use F32 for dequant of pack-quantized tensors · Updated 2025-11-07 02:59:32 +00:00 · sdgoij · 1427 behind, 6 ahead

23b70f4f70 · Initial plan · Updated 2025-11-04 11:00:12 +00:00 · sdgoij · 1455 behind, 1 ahead

79b98dbf96 · Merge branch 'master' into xsn/mtmd_custom_min_max_tokens · Updated 2025-11-02 21:14:03 +00:00 · sdgoij · 1470 behind, 2 ahead

d441c31b19 · metal : remove stray return · Updated 2025-11-02 16:24:00 +00:00 · sdgoij · 1479 behind, 9 ahead

d7f794eadb · convert : avoid dequantizing mxfp4 for GPT-OSS · Updated 2025-10-24 11:56:26 +00:00 · sdgoij · 1566 behind, 1 ahead

93fbd407f3 · Merge branch 'master' into compilade/convert-prequant · Updated 2025-10-23 18:23:12 +00:00 · sdgoij · 1569 behind, 6 ahead

f0076dc5a0 · metal : adjust .get_alloc_size to be alloc friendly · Updated 2025-10-19 14:20:54 +00:00 · sdgoij · 1599 behind, 1 ahead

96f9f391c7 · ggml : fix unaligned access in AMX code · Updated 2025-09-29 07:37:15 +00:00 · sdgoij · 1779 behind, 1 ahead

a8b0089a5b · ggml : remove SVE paths · Updated 2025-09-28 17:26:03 +00:00 · sdgoij · 1779 behind, 1 ahead

837b1b4563 · ggml : remove KQ mask padding · Updated 2025-09-28 15:10:17 +00:00 · sdgoij · 1782 behind, 6 ahead

17ca6ed540 · Implement llama-pull tool · Updated 2025-09-20 16:25:21 +00:00 · sdgoij · 1870 behind, 1 ahead

e83ef74733 · one less magic number · Updated 2025-09-20 05:58:36 +00:00 · sdgoij · 1889 behind, 6 ahead

652d303b32 · metal : fuse add + rms · Updated 2025-09-18 13:29:25 +00:00 · sdgoij · 1887 behind, 1 ahead

64c6dcbe6d · metal : make the NSG a function constant in mul_mv kernels · Updated 2025-09-18 08:31:59 +00:00 · sdgoij · 1892 behind, 2 ahead

6045c5a263 · cont : put all buffers in the same virtual address space · Updated 2025-09-14 12:46:57 +00:00 · sdgoij · 1928 behind, 2 ahead

3f62ee8bee · metal : back to a single queue per device · Updated 2025-09-09 14:06:46 +00:00 · sdgoij · 1970 behind, 9 ahead