llama.cpp/tests/test-backend-ops.cpp at a554bdd70f72d2199428db86d86ee67ea4b8aad2

mirror of https://github.com/ggml-org/llama.cpp.git synced 2026-05-14 13:04:08 +00:00


Jeff Bolz 449ec2ab07 vulkan: Preprocess FA mask to detect all-neg-inf and all-zero. (#19281)

Write out a 2-bit code per block and avoid loading the mask when it
matches these two common cases.

Apply this optimization when the mask is relatively large (i.e. prompt
processing).
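The commit message above can be illustrated with a CPU-side sketch. This is not the Vulkan shader from the commit: the code values, the four-codes-per-byte packing, the row-major float mask layout, and the names `MaskBlockCode`, `classify_block`, `preprocess_mask`, and `mask_block_code` are all assumptions made for illustration. The commit message only states that a 2-bit code is written per block and that the mask load is skipped for the all-neg-inf and all-zero cases.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 2-bit codes; the real encoding is not given in the commit message.
enum MaskBlockCode : uint8_t {
    MASK_BLOCK_MIXED      = 0, // block must be loaded and applied as usual
    MASK_BLOCK_ALL_NEGINF = 1, // every entry is -inf: the block can be skipped
    MASK_BLOCK_ALL_ZERO   = 2, // every entry is 0: adding the mask is a no-op
};

// Classify one tile of a row-major mask (assumed layout), clamped at the edges.
static MaskBlockCode classify_block(const std::vector<float> & mask,
                                    size_t n_rows, size_t n_cols,
                                    size_t r0, size_t c0,
                                    size_t block_rows, size_t block_cols) {
    bool all_neginf = true;
    bool all_zero   = true;
    const size_t r1 = std::min(r0 + block_rows, n_rows);
    const size_t c1 = std::min(c0 + block_cols, n_cols);
    for (size_t r = r0; r < r1; ++r) {
        for (size_t c = c0; c < c1; ++c) {
            const float v = mask[r * n_cols + c];
            all_neginf = all_neginf && std::isinf(v) && v < 0.0f;
            all_zero   = all_zero && v == 0.0f;
        }
    }
    if (all_neginf) return MASK_BLOCK_ALL_NEGINF;
    if (all_zero)   return MASK_BLOCK_ALL_ZERO;
    return MASK_BLOCK_MIXED;
}

// Produce one 2-bit code per block, packed four codes per byte (assumed packing).
static std::vector<uint8_t> preprocess_mask(const std::vector<float> & mask,
                                            size_t n_rows, size_t n_cols,
                                            size_t block_rows, size_t block_cols) {
    const size_t nbr = (n_rows + block_rows - 1) / block_rows;
    const size_t nbc = (n_cols + block_cols - 1) / block_cols;
    std::vector<uint8_t> packed((nbr * nbc + 3) / 4, 0);
    for (size_t br = 0; br < nbr; ++br) {
        for (size_t bc = 0; bc < nbc; ++bc) {
            const size_t i = br * nbc + bc;
            const unsigned code = classify_block(mask, n_rows, n_cols,
                                                 br * block_rows, bc * block_cols,
                                                 block_rows, block_cols);
            packed[i / 4] |= static_cast<uint8_t>(code << (2 * (i % 4)));
        }
    }
    return packed;
}

// Read back the code for block index i.
static MaskBlockCode mask_block_code(const std::vector<uint8_t> & packed, size_t i) {
    return static_cast<MaskBlockCode>((packed[i / 4] >> (2 * (i % 4))) & 3u);
}
```

A consumer would look up `mask_block_code` before touching the mask and only read the mask data for `MASK_BLOCK_MIXED` blocks. The preprocessing pass itself reads the whole mask once, which is presumably why the commit applies it only when the mask is relatively large.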

2026-02-05 09:26:38 -06:00

345 KiB
