Mirror of https://github.com/ggml-org/llama.cpp.git, synced 2026-05-13 20:44:09 +00:00
* ggml-cuda: add split-wise CUDA graph
* add n-cpu-moe to compare_llama_bench.py
* fix HIP/MUSA builds
45 KiB
Executable File