* server : add arg for disabling prompt caching

  Disabling prompt caching is useful for clients that are restricted to
  sending only OpenAI-compat requests and want deterministic responses.

* address review comments
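For context, a minimal sketch of the scenario the commit message describes: llama-server's native `/completion` endpoint accepts a per-request `cache_prompt` field, but the OpenAI-compat `/v1/chat/completions` endpoint exposes no such knob, so a server-side argument is the only way for such clients to opt out of prompt caching. The flag spelling in the comment below is an assumption for illustration; check `llama-server --help` for the actual argument name added by this change.

```python
# Sketch of an OpenAI-compat client talking to llama-server.
# Assumed server invocation (flag name hypothetical):
#   llama-server -m model.gguf --no-cache-prompt
#
# The OpenAI-compat request body has no cache_prompt field, so whether
# the prompt is cached must be decided on the server, not per request.
import json
import urllib.request

payload = {
    "model": "local",
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 0.0,  # deterministic sampling; caching is controlled server-side
}

req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    body = json.load(resp)
    print(body["choices"][0]["message"]["content"])
```

With caching disabled server-side, repeated identical requests from such a client are processed from scratch each time, trading throughput for reproducible behavior.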