willjoha
ef22b3e4ac
docs: fix metrics endpoint description in server README (#22879)
* docs: fix metrics endpoint description in server README
Describe the required model query parameter for router mode.
Removed metrics:
- llamacpp:kv_cache_usage_ratio
- llamacpp:kv_cache_tokens
Added metrics:
- llamacpp:prompt_seconds_total
- llamacpp:tokens_predicted_seconds_total
- llamacpp:n_decode_total
- llamacpp:n_busy_slots_per_decode
* server: fix metrics type for n_busy_slots_per_decode metric
2026-05-11 18:32:26 +02:00