Mirror of https://github.com/ggml-org/llama.cpp.git (synced 2026-05-10 11:04:06 +00:00)
* vulkan: remove the need for the dryrun

  Allocate pipelines and descriptor sets when requested.

  Reallocate the prealloc buffers when needed, and flush any pending work before reallocating.

  For rms_partials and total_mul_mat_bytes, use the sizes computed the last time the graph was executed.

* remove dryrun parameters
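The allocate-on-demand pattern described above could be sketched roughly as follows. This is a minimal illustration, not the code from this change: `SketchContext`, `ensure_prealloc`, `run_graph`, and every field name are hypothetical, and a `std::vector` stands in for a device buffer. It assumes that work recorded but not yet submitted may still reference the old buffer, which is why the flush comes before the resize, and it assumes the size carried over from the previous graph execution is used as the starting estimate.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a backend context; none of these names come from llama.cpp.
struct SketchContext {
    std::vector<unsigned char> prealloc;  // stands in for a preallocated device buffer
    int pending_submissions = 0;          // work recorded but not yet flushed
    int flush_count = 0;                  // how many times pending work was flushed
    int realloc_count = 0;                // how many times the buffer was grown
    std::size_t last_graph_bytes = 0;     // size computed during the previous graph execution

    // Submit any pending work so nothing still refers to the current buffer.
    void flush() {
        if (pending_submissions > 0) {
            pending_submissions = 0;
            ++flush_count;
        }
    }

    // Grow the buffer only when the request exceeds its current size.
    // Pending work is flushed first because it may reference the old buffer.
    void ensure_prealloc(std::size_t needed) {
        if (needed <= prealloc.size()) {
            return;
        }
        flush();
        prealloc.resize(needed);
        ++realloc_count;
    }

    // Execute one "graph" without a separate sizing pass.
    void run_graph(std::size_t bytes_this_run) {
        // Assumption: start from the size computed the last time the graph ran.
        ensure_prealloc(last_graph_bytes);
        // Grow on demand if this run needs more than the carried-over estimate.
        ensure_prealloc(bytes_this_run);
        assert(prealloc.size() >= bytes_this_run);
        ++pending_submissions;              // record work that uses the buffer
        last_graph_bytes = bytes_this_run;  // remember the size for the next execution
    }
};
```

In this sketch the first execution pays for one reallocation, later executions of the same or a smaller graph reuse the buffer, and a larger graph triggers exactly one flush followed by one reallocation.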