llama.cpp/tests/test-reasoning-budget.cpp at e48034dfc9e5705248fd39dc437ca887dc55a528

mirror of https://github.com/ggml-org/llama.cpp.git synced 2026-05-14 21:14:10 +00:00


Jillis ter Hove 52e5f0a5c1 common : re-arm reasoning budget after DONE on new <think> (#22323)

The DONE state absorbs all tokens, including a new start tag, so any think block after the first runs unbudgeted. Observed on unsloth/Qwen3.6-27B-GGUF, which interleaves multiple <think> blocks per response.

Fixed by advancing start_matcher in the DONE branch and re-arming to COUNTING with a fresh budget on a match. Adds a regression test (test-reasoning-budget: test 6).
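The re-arm behaviour described above can be illustrated with a minimal sketch. This is not the code from the commit: it assumes tokens are plain strings, that each tag is a single token, and that a three-state machine is enough. The names `budget_tracker`, `accept`, `arm`, `forced_ends` and `count_forced_ends` are hypothetical, and the `rearm` flag exists only to compare the old and new behaviour side by side.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified model: tokens are strings and each tag is one token.
enum class budget_state { IDLE, COUNTING, DONE };

struct budget_tracker {
    std::string  start_tag;
    std::string  end_tag;
    int          budget;          // tokens allowed per think block
    bool         rearm;           // true = behaviour described by the fix
    int          remaining   = 0;
    int          forced_ends = 0; // times the budget ran out inside a block
    budget_state state       = budget_state::IDLE;

    budget_tracker(std::string start, std::string end, int budget_tokens, bool rearm_after_done)
        : start_tag(std::move(start)), end_tag(std::move(end)),
          budget(budget_tokens), rearm(rearm_after_done) {
        assert(budget > 0 && !start_tag.empty());
    }

    void accept(const std::string & tok) {
        switch (state) {
            case budget_state::IDLE:
                if (tok == start_tag) { arm(); }
                break;
            case budget_state::COUNTING:
                if (tok == end_tag) {
                    state = budget_state::DONE;   // block closed within budget
                } else if (--remaining <= 0) {
                    state = budget_state::DONE;   // budget spent; a real sampler would force the end tag here
                    ++forced_ends;
                }
                break;
            case budget_state::DONE:
                // Without re-arming, DONE swallows every token, including a new start tag.
                if (rearm && tok == start_tag) { arm(); }
                break;
        }
    }

private:
    // Enter COUNTING with a fresh budget.
    void arm() {
        state     = budget_state::COUNTING;
        remaining = budget;
    }
};

// Feed a token stream through the tracker and report how often the budget was hit.
int count_forced_ends(const std::vector<std::string> & toks, int budget, bool rearm) {
    budget_tracker t("<think>", "</think>", budget, rearm);
    for (const auto & tok : toks) {
        t.accept(tok);
    }
    return t.forced_ends;
}
```

With a stream containing two think blocks that each exceed a budget of 2, `count_forced_ends(toks, 2, true)` reports the budget being hit in both blocks, while `count_forced_ends(toks, 2, false)` reports it only for the first, which is the unbudgeted second block the commit message describes.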

2026-04-28 19:15:36 +02:00

12 KiB
