mirror of
https://github.com/ggml-org/llama.cpp.git
synced 2026-03-17 16:44:07 +00:00
Created Feature matrix (markdown)
11
Feature-matrix.md
Normal file
@@ -0,0 +1,11 @@
# llama.cpp feature matrix
| | **CPU (AVX2)** | **CPU (ARM NEON)** | **Metal** | **cuBLAS** | **rocBLAS** | **SYCL** | **CLBlast** | **Vulkan** | **Kompute** |
|:--------------------:|:--------------:|:------------------:|:---------:|:----------:|:----------------:|:--------:|:-----------:|:----------:|:-----------:|
| **K-quants** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 🚫 |
| **I-quants** | ✅ (SLOW) | ✅ (SLOW) | ✅ (SLOW) | ✅ | ✅ | Partial¹ | 🚫 | 🚫 | 🚫 |
| **Multi-GPU** | N/A | N/A | N/A | ✅ | ❓ | 🚫 | ❓ | ✅ | ❓ |
| **K cache quants** | ✅ | ❓ | ❓ | ✅ | Only q8_0 (SLOW) | ❓ | ✅ | 🚫 | 🚫 |
| **MoE architecture** | ✅ | ❓ | ✅ | ✅ | ✅ | ❓ | Only `-ngl 0` | 🚫 | 🚫 |
* ¹: IQ3_S and IQ1_S, see #5886