llama.cpp/tools/server/server-context.cpp at db97837385edfbc772230debbd49e5efae843a71

mirror of https://github.com/ggml-org/llama.cpp.git synced 2026-05-08 01:54:10 +00:00

Xuan-Son Nguyen c42712b056 server: support multiple generations from one prompt (OAI "n" option) (#17775)

* backend support

* server: support multiple generations from one prompt (OAI "n" option)

* fix invalid batch

* format oai

* clean up

* disable ctx shift

* add test

* update comments

* fix style

* add n_cmpl to docs [no ci]

* allow using both n_cmpl and n

2025-12-06 15:54:38 +01:00

149 KiB
