llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2026-05-02 07:04:19 +00:00

Files

Daniel Bevenius 25f40ca65f completion : simplify batch (embd) processing (#19286)

* completion : simplify batch (embd) processing

This commit simplifies the processing of embd by removing the for loop
that steps through it in increments of params.n_batch. It also removes
the clamping of n_eval, since the size of embd is always at most
params.n_batch.

The motivation is to clarify the code: viewed in isolation, the for
loop is a little confusing because it suggests that multiple batches
can be processed.

* add an assert to verify n_eval is not greater than n_batch

2026-02-04 05:43:28 +01:00
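The change described in the commit message above can be illustrated with a minimal, self-contained sketch. It is not the code from the completion tool: `eval_chunked` and `eval_single` are hypothetical names, and each function only records the chunk sizes it would evaluate instead of calling into the library. It assumes, as the commit message states, that embd never holds more than n_batch tokens.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the old shape: walk embd in steps of n_batch
// and clamp n_eval for each step. Returns the size of every chunk that
// would have been evaluated.
std::vector<int> eval_chunked(const std::vector<int> & embd, int n_batch) {
    std::vector<int> sizes;
    for (int i = 0; i < (int) embd.size(); i += n_batch) {
        int n_eval = (int) embd.size() - i;
        if (n_eval > n_batch) {
            n_eval = n_batch; // clamping, redundant when embd.size() <= n_batch
        }
        sizes.push_back(n_eval);
    }
    return sizes;
}

// Hypothetical stand-in for the simplified shape: a single evaluation,
// with an assert that documents the invariant n_eval <= n_batch.
std::vector<int> eval_single(const std::vector<int> & embd, int n_batch) {
    std::vector<int> sizes;
    if (!embd.empty()) {
        const int n_eval = (int) embd.size();
        assert(n_eval <= n_batch);
        sizes.push_back(n_eval);
    }
    return sizes;
}
```

Under the stated invariant the two functions record the same chunk sizes, which is why the loop and the clamp can be dropped without changing behavior.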

batched-bench

tool/ex/tests: consistently free ctx, then model (#18168)

2025-12-22 11:00:37 +01:00

cli

common : use two decimal places for float arg help messages (#19048)

2026-01-25 07:31:42 +01:00

completion

completion : simplify batch (embd) processing (#19286)

2026-02-04 05:43:28 +01:00

cvector-generator

docs : Minor cleanups (#19252)