llama.cpp/tools/server/server-task.cpp at 7668999518dee4822e3b31a33c2c8d5982a3c9db

mirror of https://github.com/ggml-org/llama.cpp.git synced 2026-05-13 04:24:17 +00:00

Oliver Simons 7668999518 Merge branch 'master' into gpu-sampling

Let's keep `master`'s cumsum implementation for its likely better AMD
perf and add back the pure-CUB implementation in a follow-up commit

2025-12-05 14:41:08 +01:00