This commit adds a check in the batch allocator to ensure that when backend sampling is enabled, at most one output token is specified per sequence.
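
The sketch below illustrates the kind of validation described, presumably motivated by the fact that backend sampling produces one sampled token per sequence, so a batch requesting multiple outputs for the same sequence would be ambiguous. This is a minimal standalone sketch, not the actual `llama_batch_allocr` code: the `batch_token` struct, the `validate_outputs_per_seq` function, and the single-`seq_id`-per-token simplification are all assumptions for illustration.

```cpp
// Minimal sketch of the check: reject a batch that requests more than one
// output token for any sequence while backend sampling is enabled.
// All names here are hypothetical, not llama.cpp's API.
#include <cstdint>
#include <cstdio>
#include <unordered_map>
#include <vector>

struct batch_token {
    int32_t seq_id; // sequence this token belongs to (assumed one seq per token)
    bool    output; // whether an output (logits/sampled token) is requested
};

// Returns false if any sequence requests more than one output token while
// backend sampling is enabled.
static bool validate_outputs_per_seq(const std::vector<batch_token> & batch,
                                     bool backend_sampling_enabled) {
    if (!backend_sampling_enabled) {
        return true; // no restriction when backend sampling is off
    }
    std::unordered_map<int32_t, int> n_outputs; // seq_id -> output count
    for (const auto & tok : batch) {
        if (tok.output && ++n_outputs[tok.seq_id] > 1) {
            fprintf(stderr, "sequence %d has more than one output token\n",
                    tok.seq_id);
            return false;
        }
    }
    return true;
}

int main() {
    // Sequence 0 requests two outputs -> rejected with backend sampling on.
    std::vector<batch_token> batch = {
        {0, false}, {0, true}, {0, true}, {1, true},
    };
    const bool ok = validate_outputs_per_seq(batch,
                                             /*backend_sampling_enabled=*/true);
    printf("batch valid: %s\n", ok ? "yes" : "no");
    return 0;
}
```

Running the check at batch-allocation time surfaces the error before any compute is scheduled, which is cheaper than failing mid-decode.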