sdgoij/llama.cpp
mirror of https://github.com/ggml-org/llama.cpp.git, synced 2026-05-08 18:14:07 +00:00
llama.cpp/ggml at b8933
Latest commit 9725a313be by Johannes Gäßler, 2026-04-25 14:15:03 +02:00:
CUDA: reduce MMQ stream-k overhead (#22298)
* CUDA: reduce MMQ stream-k overhead
* use 32 bit integers for kbc
cmake            ggml: backend-agnostic tensor parallelism (experimental) (#19378)   2026-04-09 16:42:19 +02:00
include          CUDA: manage NCCL communicators in context (#21891)                  2026-04-15 15:58:40 +02:00
src              CUDA: reduce MMQ stream-k overhead (#22298)                          2026-04-25 14:15:03 +02:00
.gitignore       vulkan : cmake integration (#8119)                                   2024-07-13 18:12:39 +02:00
CMakeLists.txt   HIP: flip GGML_HIP_GRAPHS to default on (#22254)                     2026-04-23 02:34:31 +02:00