llama.cpp/tests/test-backend-ops.cpp at 4fcd87cf7cbb131b3e28e121b29cc588e460eb40

mirror of https://github.com/ggml-org/llama.cpp.git synced 2026-05-13 04:24:17 +00:00

Jeff Bolz 879d673759 vulkan: Implement top-k (#17418)

* vulkan: Implement top-k

Each pass launches workgroups that each sort 2^N elements (where N is usually 7-10)
and discard all but the top K. This repeats until only K elements are left. There is
also a fast path for K==1 that just finds the max value rather than sorting.

* fix pipeline selection

* vulkan: Add N-ary search algorithm for topk

* microoptimizations
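
The multi-pass scheme described in the first item can be sketched on the CPU as follows. This is only an illustration of the idea, not the Vulkan shader code from the commit: the function name `topk_multipass`, the `block_log2` parameter, and the choice to return values (rather than indices) are all assumptions made for the example. Each loop iteration stands in for one pass, and each block stands in for one workgroup.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical CPU sketch of a multi-pass top-k (not the actual shader code).
// Returns the k largest values in descending order.
std::vector<float> topk_multipass(std::vector<float> data, size_t k, size_t block_log2 = 9) {
    if (k == 0 || data.empty()) {
        return {};
    }
    k = std::min(k, data.size());

    // Fast path: for k == 1 a max reduction is enough, no sorting needed.
    if (k == 1) {
        return { *std::max_element(data.begin(), data.end()) };
    }

    // Each "workgroup" handles one block of 2^N elements.
    // Assumption for this sketch: the block must be larger than k,
    // otherwise a pass would discard nothing and never make progress.
    size_t block = size_t(1) << block_log2;
    while (block <= k) {
        block <<= 1;
    }

    // Repeat passes until only k candidates are left.
    while (data.size() > k) {
        std::vector<float> next;
        for (size_t start = 0; start < data.size(); start += block) {
            const size_t end  = std::min(start + block, data.size());
            const size_t keep = std::min(k, end - start);
            // Sort this block in descending order and keep its top k.
            std::sort(data.begin() + start, data.begin() + end, std::greater<float>());
            next.insert(next.end(), data.begin() + start, data.begin() + start + keep);
        }
        data.swap(next);
    }

    // Covers the case where no pass ran (input already had exactly k elements).
    std::sort(data.begin(), data.end(), std::greater<float>());
    return data;
}
```

For example, `topk_multipass({3, 1, 4, 1, 5, 9, 2, 6, 5, 3}, 3, 2)` sorts blocks of 4 elements and needs several passes to shrink the 10 inputs down to `{9, 6, 5}`. A real GPU implementation would run the blocks of a pass in parallel and would likely track indices alongside the values.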

2025-11-26 16:45:43 +01:00

321 KiB
