llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2026-05-10 02:54:06 +00:00

Files

Neo Zhang 213c4a0b81 [SYCL] support Flash Attention for fp32/fp16/Q4/Q5/Q8 (#20190)

* support flash-attention for fp32/fp16/Q4/Q5/Q8

* rm warning

* update for JIT

2026-03-08 12:00:07 +08:00

2025-08-07 13:45:41 +02:00

2026-03-07 15:41:10 +08:00

2026-03-08 12:00:07 +08:00

.gitignore    2024-07-13 18:12:39 +02:00

CMakeLists.txt    2026-02-15 22:24:29 +02:00