Sascha Rogmann
455d8e4be8
server : speculative checkpointing (#19493)
* server : speculative decoding using checkpoints
* server : fix draft check with checkpoints
* server : rename spec vars
* server : log levels
* server : refactored spec logic to speculative.cpp
* server : renamed spec checkpoints option
* server : fix spec checkpoints, logging
* speculative : checkpoints with draft model, logging
* server : n_tokens_cur and create_checkpoint in draft
* server : fix server_speculative_callback (slot.id)
* spec : fix ngram-map/begin idx_last_check
* spec : init ckpt (begin() wasn't called)
* chore: update webui build output
* server : restore sampler in spec checkpoint and clear mem
* cont : avoid --spec-use-checkpoints argument
* cont : remove server_prompt_checkpoint_with_size
* spec : rename (leave_draft_state)
* cont : clean-up
* cont : do not ignore partial drafts even if they are short
* cont : spec callback owned by session
* cont : simplify
* cont : avoid empty speculative session
* cont : simplify
* cont : simplify
* cont : enable mtmd speculative decoding
* cont : keep the spec sampler alive
* cont : simplify
* cont : fix nullptr deref + draft checkpoints
* cont : remove common_speculative_accept_response
* cont : remove callback
* cont : simplify
* cont : minor
* cont : simplify
* cont : fix accepted number
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2026-04-19 10:24:06 +03:00