Compare commits

...

4 Commits

Author SHA1 Message Date
M. Yusuf Sarıgöz
ff137fbbed Bump patch version for release 2024-07-10 12:39:50 +03:00
M. Yusuf Sarıgöz
f6a3321701 Upd gguf-py/readme 2024-07-10 12:38:35 +03:00
Clint Herron
a59f8fdc85 Server: Enable setting default sampling parameters via command-line (#8402)
* Load server sampling parameters from the server context by default.

* Wordsmithing comment
2024-07-09 18:26:40 -04:00
Andy Salerno
fd560fe680 Update README.md to fix broken link to docs (#8399)
Update the "Performance troubleshooting" doc link to be correct - the file was moved into a dir called 'development'
2024-07-09 14:58:44 -04:00
4 changed files with 4 additions and 4 deletions

README.md

@@ -453,7 +453,7 @@ To learn more how to measure perplexity using llama.cpp, [read this documentatio
 - [How to build](./docs/build.md)
 - [Running on Docker](./docs/docker.md)
 - [Build on Android](./docs/android.md)
-- [Performance troubleshooting](./docs/token_generation_performance_tips.md)
+- [Performance troubleshooting](./docs/development/token_generation_performance_tips.md)
 - [GGML tips & tricks](https://github.com/ggerganov/llama.cpp/wiki/GGML-Tips-&-Tricks)
 **Seminal papers and background on the models**

examples/server/server.cpp

@@ -884,7 +884,8 @@ struct server_context {
     bool launch_slot_with_task(server_slot & slot, const server_task & task) {
         slot_params default_params;
-        llama_sampling_params default_sparams;
+        // Sampling parameter defaults are loaded from the global server context (but individual requests can still override them)
+        llama_sampling_params default_sparams = params.sparams;
         auto & data = task.data;
         if (data.count("__oaicompat") != 0) {
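The one-line change above is the whole mechanism: `default_sparams` now starts from the server-wide `params.sparams` (populated from the command line at startup) instead of the struct's built-in defaults, and the per-request JSON in `task.data` can still override individual fields afterwards. A minimal standalone sketch of that resolution order follows; the struct names and the `resolve_sampling` helper are simplified stand-ins for illustration, not llama.cpp's actual API:

```cpp
// Standalone sketch (not llama.cpp's real types): a tiny sampling_params struct
// and a string->float map stand in for llama_sampling_params and task.data.
#include <cstdio>
#include <map>
#include <string>

struct sampling_params {
    float temp  = 0.8f;   // built-in library default
    int   top_k = 40;
};

// Server-wide defaults, as parsed from the command line at startup.
struct server_params {
    sampling_params sparams;
};

// Per-request resolution: start from the server-wide defaults (the pattern the
// diff introduces), then let fields present in the request body win.
sampling_params resolve_sampling(const server_params & params,
                                 const std::map<std::string, float> & request) {
    sampling_params sparams = params.sparams;  // server defaults, not struct defaults
    if (auto it = request.find("temperature"); it != request.end()) {
        sparams.temp = it->second;             // request-level override wins
    }
    if (auto it = request.find("top_k"); it != request.end()) {
        sparams.top_k = static_cast<int>(it->second);
    }
    return sparams;
}

int main() {
    server_params params;
    params.sparams.temp = 0.2f;  // e.g. set via a CLI flag at server startup

    sampling_params a = resolve_sampling(params, {});                      // uses 0.2
    sampling_params b = resolve_sampling(params, {{"temperature", 0.9f}}); // uses 0.9
    std::printf("default temp = %.1f, overridden temp = %.1f\n", a.temp, b.temp);
}
```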

gguf-py/README.md

@@ -79,5 +79,4 @@ python -m twine upload dist/*
 ```
 ## TODO
 - [ ] Add tests
-- [ ] Include conversion scripts as command line entry points in this package.

gguf-py/pyproject.toml

@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "gguf"
-version = "0.9.0"
+version = "0.9.1"
 description = "Read and write ML models in GGUF for GGML"
 authors = ["GGML <ggml@ggml.ai>"]
 packages = [