llama.cpp/common at 0da7e7dcccfac8a75bf3f65ac54cb4ea6b200c56

sdgoij/llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2026-05-08 10:04:10 +00:00

Daniel Bevenius 82957a90f2 sampling : always expose sampled_ids

This commit precomputes and caches the full-vocab token id list in
llama_context's constructor, so llama_get_backend_sampled_token_ids_ith
always returns a valid pointer.

The motivation is that this allows both common/sampling.cpp and
src/llama-sampling.cpp to simplify their logic.

Not all backend samplers that process logits need to set the
sampled_tokens_id, as they may not change the order of the logits. For
example, the temperature sampler only scales the logits but does not
change their order. Similarly, the logit bias sampler only adds bias to
specific token ids but does not change the order of the logits. In
these cases there is no device-to-host copy of the sampled token ids,
and this is the use case where the precomputed list is useful.

2025-11-18 15:11:59 +01:00
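
The idea in the commit message above can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not the code from this commit: `sketch_context`, `get_sampled_ids`, and `apply_temperature` are hypothetical names, and the real `llama_context` and `llama_get_backend_sampled_token_ids_ith` may differ in signature and behavior. The sketch builds the identity id list once in the constructor and falls back to it whenever no sampler has written its own ids.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

using token_id = std::int32_t;

// Hypothetical stand-in for a context object (NOT the real llama_context).
struct sketch_context {
    std::vector<token_id> full_vocab_ids; // 0, 1, ..., n_vocab - 1, built once
    std::vector<token_id> sampled_ids;    // filled only by samplers that reorder or filter

    explicit sketch_context(std::size_t n_vocab) : full_vocab_ids(n_vocab) {
        // Precompute the identity mapping so a valid list always exists.
        std::iota(full_vocab_ids.begin(), full_vocab_ids.end(), 0);
    }

    // Hypothetical getter: never returns nullptr. If no sampler produced
    // ids, the cached full-vocab list is returned instead.
    const token_id * get_sampled_ids(std::size_t & n_out) const {
        const std::vector<token_id> & src =
            sampled_ids.empty() ? full_vocab_ids : sampled_ids;
        n_out = src.size();
        return src.data();
    }
};

// Temperature scaling divides every logit by a positive constant, so the
// relative order of the logits is unchanged and no id list is written.
inline void apply_temperature(std::vector<float> & logits, float temp) {
    assert(temp > 0.0f);
    for (float & l : logits) {
        l /= temp;
    }
}
```

With this shape, a caller can read the ids without checking for a null pointer: after only order-preserving samplers such as temperature or logit bias have run, position `i` in the returned list is simply token id `i`.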

arg.cpp

sampling : add support for backend sampling

2025-11-17 16:15:58 +01:00

arg.h

common: move download functions to download.(cpp|h) (#17059)

2025-11-07 11:23:34 +01:00

base64.hpp

llava : expose as a shared library for downstream projects (#3613)

2023-11-07 00:36:23 +03:00

build-info.cpp.in

cmake: Add ability to pass in LLAMA_BUILD_NUMBER/COMMIT (#14167)

2025-06-13 10:38:52 +02:00

chat-parser.cpp

common : handle unicode during partial json parsing (#16526)

2025-10-12 16:18:47 +03:00

chat-parser.h

model : Apertus model implementation (#15852)

2025-10-02 20:43:22 +03:00

chat.cpp

common : move gpt-oss reasoning processing to init params (#16937)

2025-11-02 16:56:28 +02:00

chat.h

chat: Add LFM2 tool handling (#16763)

2025-10-27 23:54:01 +01:00

CMakeLists.txt

cmake : cleanup (#17199)

2025-11-12 14:48:30 +02:00

common.cpp

sampling : add support for backend sampling

2025-11-17 16:15:58 +01:00

common.h

graph : do not include llama-model.h

2025-11-18 13:53:25 +02:00

console.cpp

console : utf-8 fix for windows stdin (#9690)

2024-09-30 11:23:42 +03:00

console.h

gguf : new file format with flexible meta data (beta) (#2398)

2023-08-21 23:07:43 +03:00

download.cpp

cmake : move OpenSSL linking to vendor/cpp-httplib (#17177)

2025-11-12 12:32:50 +01:00

download.h

arg: add --cache-list argument to list cached models (#17073)

2025-11-08 21:54:14 +01:00

http.h

common: introduce http.h for httplib-based client (#16373)

2025-10-01 20:22:18 +03:00

json-partial.cpp

common : handle unicode during partial json parsing (#16526)

2025-10-12 16:18:47 +03:00

json-partial.h

sync : vendor (#13901)

2025-05-30 16:25:45 +03:00

json-schema-to-grammar.cpp

grammar : support array references in json schema (#16792)

2025-10-28 09:37:52 +01:00

json-schema-to-grammar.h

sync : vendor (#13901)

2025-05-30 16:25:45 +03:00

llguidance.cpp

sampling : add support for backend sampling

2025-11-17 16:15:58 +01:00

log.cpp

mtmd: add mtmd_log_set (#17268)

2025-11-14 15:56:19 +01:00

log.h

mtmd: add mtmd_log_set (#17268)

2025-11-14 15:56:19 +01:00

ngram-cache.cpp

ggml : portability fixes for VS 2017 (#12150)

2025-03-04 18:53:26 +02:00

ngram-cache.h

llama : use LLAMA_TOKEN_NULL (#11062)

2025-01-06 10:52:15 +02:00

regex-partial.cpp

common: add partial regex support (#12808)

2025-05-14 19:50:57 +01:00

regex-partial.h

common: add partial regex support (#12808)

2025-05-14 19:50:57 +01:00

sampling.cpp

sampling : always expose sampled_ids

2025-11-18 15:11:59 +01:00

sampling.h

sampling : add support for backend sampling

2025-11-17 16:15:58 +01:00

speculative.cpp

sampling : optimize samplers by reusing bucket sort (#15665)

2025-08-31 20:41:02 +03:00

speculative.h

server : implement universal assisted decoding (#12635)

2025-07-31 14:25:23 +02:00