llama.cpp/docs at fa595462ca2d5ea0feca20c41bd1431b3d4285ef

sdgoij/llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2026-05-06 09:04:07 +00:00

Georgi Gerganov 846262d787 docs : update speculative decoding parameters after refactor (#22397) (#22539)

* docs : update speculative decoding parameters after refactor (#22397)

Update docs/speculative.md to reflect the new parameter naming scheme
introduced in PR #22397:

- Replace --draft-max/--draft-min with --spec-draft-n-max/--spec-draft-n-min
- Replace --spec-ngram-size-n/m with per-implementation variants
- Add documentation for all new --spec-ngram-*- parameters
- Update all example commands

Assisted-by: llama.cpp:local pi

* pi : add rule to use gh CLI for GitHub resources

Assisted-by: llama.cpp:local pi

* docs : run llama-gen-docs

* arg : fix typo

2026-05-04 08:52:07 +03:00

android: fix missing screenshots for Android.md (#18156)

2025-12-19 09:32:04 +02:00

[SYCL] Optimize Q4_0 mul_mat for Arc770, add scripts (#22291)

2026-04-25 09:20:14 +03:00

docs: more extensive RoPE documentation [no ci] (#21953)

2026-04-15 14:45:16 +02:00

chore : correct typos [no ci] (#20041)

2026-03-05 08:50:21 +01:00

ggml-webgpu: support for SSM_SCAN and disable set_rows error checking (#22327)

2026-04-25 09:18:15 +03:00

android.md

android: fix missing screenshots for Android.md (#18156)

2025-12-19 09:32:04 +02:00

autoparser.md

common/parser: add proper reasoning tag prefill reading (#20424)

2026-03-19 16:58:21 +01:00

build-riscv64-spacemit.md

refactor : remove libcurl, use OpenSSL when available (#18828)

2026-01-14 18:02:47 +01:00

build-s390x.md

docs: update s390x build docs (#19643)

2026-02-16 00:33:34 +08:00

build.md

CUDA: require explicit opt-in for P2P access (#21910)

2026-04-15 16:01:46 +02:00

docker.md

CI : Enable CUDA and Vulkan ARM64 runners and fix CI/CD (#21122)

2026-03-30 20:24:37 +02:00

function-calling.md

common : implement new jinja template engine (#18462)

2026-01-16 11:22:06 +01:00

install.md

docs : add "Quick start" section for new users (#13862)

2025-06-03 13:09:36 +02:00

llguidance.md

llguidance build fixes for Windows (#11664)

2025-02-14 12:46:08 -08:00

multimodal.md

docs: listing qwen3-asr and qwen3-omni as supported (#21857)

2026-04-13 22:28:17 +02:00

ops.md

ggml-webgpu: support for SSM_SCAN and disable set_rows error checking (#22327)

2026-04-25 09:18:15 +03:00

preset.md

preset: allow named remote preset (#18728)

2026-01-10 15:12:29 +01:00

speculative.md

docs : update speculative decoding parameters after refactor (#22397) (#22539)

2026-05-04 08:52:07 +03:00