llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2026-05-12 12:04:08 +00:00

Files

Reese Levine c1258830b2 ggml webgpu: ops support for qwen3.5 (SET, TRI_SOLVE, SSM_CONV, GATED_DELTA_NET) + GET_ROWS optimization (#20687)

* Implement l2_norm, set, tri

* Add DIAG/SOLVE_TRI

* Add SSM_CONV

* Better get_rows and gated_delta_net to support qwen3.5

* Clean up, update ops.md

* Fix binding_index type for wasm

* Fix read write annotations

* cleanups

2026-03-19 08:45:28 -07:00

wgsl-shaders

ggml webgpu: ops support for qwen3.5 (SET, TRI_SOLVE, SSM_CONV, GATED_DELTA_NET) + GET_ROWS optimization (#20687)

2026-03-19 08:45:28 -07:00

CMakeLists.txt

ggml webgpu: add support for emscripten builds (#17184)

2025-12-03 10:25:34 +01:00

ggml-webgpu-shader-lib.hpp

ggml webgpu: ops support for qwen3.5 (SET, TRI_SOLVE, SSM_CONV, GATED_DELTA_NET) + GET_ROWS optimization (#20687)

2026-03-19 08:45:28 -07:00

ggml-webgpu.cpp

ggml webgpu: ops support for qwen3.5 (SET, TRI_SOLVE, SSM_CONV, GATED_DELTA_NET) + GET_ROWS optimization (#20687)

2026-03-19 08:45:28 -07:00

pre_wgsl.hpp

ggml webgpu: initial flashattention implementation (#18610)

2026-01-08 08:23:39 -08:00