llama.cpp/tools/server/server-task.h
Georgi Gerganov a4854f0349 cont : improve n_cmpl logic
- launch the parent task first so it finds the slot with best cache
- parent task waits for child tasks to be launched
- when a child task finishes - remove its cache
2026-01-09 15:36:58 +02:00

16 KiB
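
The three steps in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code in `server-task.h`: `Slot`, `Plan`, `best_free_slot`, `launch` and `finish_child` are hypothetical names, and it assumes that `n_cmpl` is the number of completions requested for one prompt and that "best cache" means the free slot holding the most cached tokens. The commit message does not say how child slots are chosen, so the sketch reuses the same picker for them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a server slot; not the llama.cpp type.
struct Slot {
    int  cached_tokens = 0;  // assumed measure of how useful the slot's cache is
    bool busy          = false;
};

// Hypothetical result of launching one parent task and its child tasks.
struct Plan {
    int              parent = -1;               // slot index of the parent, -1 if none was free
    std::vector<int> children;                  // slot indices of the launched children
    bool             parent_may_start = false;  // true once every child has been launched
};

// Return the index of the free slot with the most cached tokens, or -1.
int best_free_slot(const std::vector<Slot> & slots) {
    int best = -1;
    for (std::size_t i = 0; i < slots.size(); ++i) {
        if (slots[i].busy) {
            continue;
        }
        if (best < 0 || slots[i].cached_tokens > slots[static_cast<std::size_t>(best)].cached_tokens) {
            best = static_cast<int>(i);
        }
    }
    return best;
}

// Launch the parent first so it gets the slot with the best cache,
// then launch the n_cmpl - 1 children into the remaining free slots.
Plan launch(std::vector<Slot> & slots, int n_cmpl) {
    assert(n_cmpl >= 1);
    Plan plan;

    plan.parent = best_free_slot(slots);
    if (plan.parent < 0) {
        return plan;  // nothing free: no task is launched
    }
    slots[static_cast<std::size_t>(plan.parent)].busy = true;

    for (int i = 1; i < n_cmpl; ++i) {
        const int child = best_free_slot(slots);
        if (child < 0) {
            break;  // not enough free slots yet: the parent keeps waiting
        }
        slots[static_cast<std::size_t>(child)].busy = true;
        plan.children.push_back(child);
    }

    // The parent waits until all of its children have been launched.
    plan.parent_may_start = static_cast<int>(plan.children.size()) == n_cmpl - 1;
    return plan;
}

// When a child task finishes, drop its cache and free the slot.
void finish_child(std::vector<Slot> & slots, int child) {
    Slot & slot = slots[static_cast<std::size_t>(child)];
    slot.cached_tokens = 0;
    slot.busy          = false;
}
```

With three free slots caching 5, 40 and 10 tokens, `launch(slots, 3)` in this sketch puts the parent in the 40-token slot and the two children in the other two. Calling `finish_child` on a child resets that slot's cache to zero, which leaves the parent's slot as the only one still holding cached tokens for the shared prompt.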