mirror of
https://github.com/ggml-org/llama.cpp.git
synced 2026-03-17 16:44:07 +00:00
server: reset counter related to kill-switch on client error (#20513)
* server: reset kill-switch on client error

This avoids triggering a server kill switch.

If the client sends a request that exceeds the configured context size,
an appropriate HTTP 400 response is provided and no tokens are generated.

However since no tokens are generated, update_slots() increments
n_empty_consecutive. If the client sends 3 such messages in a row, the
server terminates.

* moved counter reset as per recommendation

* cont : minor

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
@@ -1189,6 +1189,9 @@ private:
             ? SLOT_STATE_WAIT_OTHER // wait for the parent to process prompt
             : SLOT_STATE_STARTED;
 
+        // reset server kill-switch counter
+        n_empty_consecutive = 0;
+
         SLT_INF(slot, "processing task, is_child = %d\n", slot.task->is_child());
         return true;
     }
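
The failure mode described in the commit message can be illustrated with a small standalone model. This is only a sketch and not the server code: `toy_server`, `launch_task`, `max_empty`, `reset_on_launch` and `survives_rejected_requests` are hypothetical names, and the rule that a non-empty batch also clears the counter is an assumption. Only `n_empty_consecutive`, `update_slots()` and the threshold of 3 come from the commit itself.

```cpp
// Toy model of the kill-switch counter. Everything except the name
// n_empty_consecutive is hypothetical and not taken from llama.cpp.
struct toy_server {
    static constexpr int max_empty = 3; // threshold quoted in the commit message
    int  n_empty_consecutive = 0;
    bool reset_on_launch;               // true models the behaviour after this commit

    explicit toy_server(bool reset) : reset_on_launch(reset) {}

    // Called when a new task is assigned to a slot.
    void launch_task() {
        if (reset_on_launch) {
            n_empty_consecutive = 0;    // the line added by the diff
        }
    }

    // One scheduling pass; n_tokens is the number of tokens that were batched.
    // Returns false when the kill switch fires.
    bool update_slots(int n_tokens) {
        if (n_tokens > 0) {
            n_empty_consecutive = 0;    // assumed: real work clears the counter
            return true;
        }
        return ++n_empty_consecutive < max_empty;
    }
};

// Sends n requests that are all rejected before any token is generated.
// Returns true if the modelled server is still running afterwards.
bool survives_rejected_requests(bool reset_on_launch, int n) {
    toy_server srv(reset_on_launch);
    for (int i = 0; i < n; ++i) {
        srv.launch_task();
        if (!srv.update_slots(0)) {
            return false;
        }
    }
    return true;
}
```

In this model, three rejected requests in a row stop the server when the counter is never reset on launch, while the variant that resets it on every launched task keeps running, which is the effect the commit aims for.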