This commit adds a check in the batch allocator to ensure that when backend sampling is enabled, at most one output token is specified per sequence.
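
The sketch below illustrates the kind of validation described, presumably motivated by the fact that backend sampling produces one sampled token per sequence, so a batch requesting multiple outputs for the same sequence would be ambiguous. This is a minimal standalone sketch, not the actual `llama_batch_allocr` code: the `batch_token` struct, the `validate_outputs_per_seq` function, and the single-`seq_id`-per-token simplification are all assumptions for illustration.

```cpp
// Minimal sketch of the check: reject a batch that requests more than one
// output token for any sequence while backend sampling is enabled.
// All names here are hypothetical, not llama.cpp's API.
#include <cstdint>
#include <cstdio>
#include <unordered_map>
#include <vector>

struct batch_token {
    int32_t seq_id; // sequence this token belongs to (assumed one seq per token)
    bool    output; // whether an output (logits/sampled token) is requested
};

// Returns false if any sequence requests more than one output token while
// backend sampling is enabled.
static bool validate_outputs_per_seq(const std::vector<batch_token> & batch,
                                     bool backend_sampling_enabled) {
    if (!backend_sampling_enabled) {
        return true; // no restriction when backend sampling is off
    }
    std::unordered_map<int32_t, int> n_outputs; // seq_id -> output count
    for (const auto & tok : batch) {
        if (tok.output && ++n_outputs[tok.seq_id] > 1) {
            fprintf(stderr, "sequence %d has more than one output token\n",
                    tok.seq_id);
            return false;
        }
    }
    return true;
}

int main() {
    // Sequence 0 requests two outputs -> rejected with backend sampling on.
    std::vector<batch_token> batch = {
        {0, false}, {0, true}, {0, true}, {1, true},
    };
    const bool ok = validate_outputs_per_seq(batch,
                                             /*backend_sampling_enabled=*/true);
    printf("batch valid: %s\n", ok ? "yes" : "no");
    return 0;
}
```

Running the check at batch-allocation time surfaces the error before any compute is scheduled, which is cheaper than failing mid-decode.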