llama.cpp/tests/test-backend-ops.cpp at 1213a035643f30cc6941c463501331243deb4968

mirror of https://github.com/ggml-org/llama.cpp.git synced 2026-05-14 21:14:10 +00:00

Aman Gupta 9f682fb640 ggml-cpu: FA split across kv for faster TG (#19209)

* ggml-cpu: split across kv for faster TG

* simplify sinks application

* add ref impl

2026-02-03 01:19:55 +08:00
