llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2026-05-06 09:04:07 +00:00

Files

Jeff Bolz c9ced4910b vulkan: preprocess mul_mat_id experts and discard workgroups more quickly (#18352)

Run a preprocess to count how many times each expert is used, and use this to
quickly discard workgroups that aren't needed.
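
The idea in this commit message can be illustrated on the CPU with a minimal sketch. This is not the Vulkan shader code from the commit. The function names, the flat `ids` layout, and the tiling scheme are all assumptions made for illustration. The first function counts how often each expert is selected. The second uses those counts to decide whether a workgroup assigned to a given expert and row tile has any rows to process.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not a llama.cpp function): count how many times each
// expert index appears in the flat list of per-token expert selections.
std::vector<uint32_t> count_expert_usage(const std::vector<int32_t>& ids,
                                         uint32_t n_experts) {
    std::vector<uint32_t> counts(n_experts, 0);
    for (int32_t e : ids) {
        // Ignore out-of-range ids instead of writing out of bounds.
        if (e >= 0 && static_cast<uint32_t>(e) < n_experts) {
            counts[static_cast<uint32_t>(e)]++;
        }
    }
    return counts;
}

// Hypothetical early-out test: a workgroup covers rows
// [tile * tile_rows, (tile + 1) * tile_rows) of the rows routed to `expert`.
// It has work only if its first row index is below that expert's count.
bool workgroup_has_work(const std::vector<uint32_t>& counts,
                        uint32_t expert, uint32_t tile, uint32_t tile_rows) {
    if (expert >= counts.size()) {
        return false;
    }
    return static_cast<uint64_t>(tile) * tile_rows < counts[expert];
}
```

In this sketch, a workgroup for which `workgroup_has_work` returns false can return immediately. That includes every workgroup assigned to an expert that no token selected.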

2025-12-26 16:12:58 -06:00

2025-08-07 13:45:41 +02:00

2025-12-15 09:24:59 +01:00

2025-12-26 16:12:58 -06:00

.gitignore

2024-07-13 18:12:39 +02:00

CMakeLists.txt

2025-12-19 09:42:28 -08:00