mirror of
https://github.com/ggml-org/llama.cpp.git
synced 2026-05-11 03:24:21 +00:00
A typo-like error caused the log message to print the same number twice when a request arrived with n_predict greater than the server-side configuration: the message reported the already-clamped value instead of the requested one.

Before the fix:

```
slot launch_slot_: id 0 | task 0 | n_predict = 4096 exceeds server configuration, setting to 4096
```

After the fix:

```
slot launch_slot_: id 0 | task 0 | n_predict = 8192 exceeds server configuration, setting to 4096
```
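For illustration, the pattern behind this kind of bug can be sketched as follows. This is a minimal, self-contained sketch and not the actual server code: the helper name `clamp_n_predict` and its parameters are hypothetical. It only shows that the warning text has to be built from the requested value before that value is overwritten by the limit.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not the real llama.cpp function): clamps the requested
// n_predict to the server-side limit and returns the warning text, or an
// empty string when no clamping was needed.
static std::string clamp_n_predict(int & requested, int server_limit) {
    if (requested <= server_limit) {
        return "";
    }

    char buf[128];
    // Format the message while `requested` still holds the value from the
    // request. Formatting after the assignment below would print the limit
    // twice, which is the symptom described above.
    std::snprintf(buf, sizeof(buf),
                  "n_predict = %d exceeds server configuration, setting to %d",
                  requested, server_limit);

    requested = server_limit; // clamp only after the message is built
    return buf;
}
```

Under these assumptions, calling `clamp_n_predict` with a requested value of 8192 and a limit of 4096 leaves the value at 4096 and yields a message that names both numbers.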