mirror of https://github.com/ggml-org/llama.cpp.git (synced 2026-05-14 13:04:08 +00:00)

commit c7f3ce25f5 (parent 4db4497ca7), committed by Georgi Gerganov

Add readme
20  examples/llama-eval/README.md  (new file)
@@ -0,0 +1,20 @@
# llama.cpp/example/llama-eval

The purpose of this example is to run evaluation metrics against an OpenAI API-compatible LLM over HTTP (llama-server).

Start the server:

```bash
./llama-server -m model.gguf --port 8033
```

Then run the evaluation against it:

```bash
python examples/llama-eval/llama-eval.py --path_server http://localhost:8033 --n_prompts 100 --prompt_source arc
```
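Under the hood, an eval client like this talks to llama-server's OpenAI-compatible HTTP endpoint. A minimal sketch of building one chat-completion request is below; the `/v1/chat/completions` path and payload shape follow the OpenAI API, while the `build_chat_request` helper is hypothetical, not part of the actual script:

```python
import json
import urllib.request


def build_chat_request(server: str, prompt: str,
                       temperature: float = 0.0) -> urllib.request.Request:
    """Build (but do not send) a chat-completion request for an
    OpenAI-compatible server such as llama-server."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,  # deterministic-ish decoding for evals
    }
    return urllib.request.Request(
        f"{server}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )


req = build_chat_request("http://localhost:8033", "What is 2 + 2?")
print(req.full_url)  # http://localhost:8033/v1/chat/completions
```

Sending the request with `urllib.request.urlopen(req)` requires a running server; building it does not, which keeps the request-construction logic easy to test offline.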
## Supported tasks (MVP)

- **GSM8K** — grade-school math (final-answer only)
- **AIME** — competition math (final-answer only)
- **MMLU** — multi-domain knowledge (multiple choice)
- **HellaSwag** — commonsense reasoning (multiple choice)
- **ARC** — grade-school science reasoning (multiple choice)
- **WinoGrande** — commonsense coreference resolution (multiple choice)
@@ -576,7 +576,7 @@ if __name__ == "__main__":
        "--prompt_source",
        type=str,
        default="mmlu",
-        help=f"Eval types supported: all,{TASK_DICT.keys()}",
+        help=f"Eval types supported: all,{list(TASK_DICT.keys())}",
    )
    parser.add_argument(
        "--n_prompts", type=int, default=None, help="Number of prompts to evaluate"
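The change above matters because formatting `dict.keys()` directly into an f-string yields the `dict_keys(...)` view repr, while wrapping it in `list()` gives the cleaner list form. A quick illustration, using a stand-in `TASK_DICT` rather than the script's actual table:

```python
# Stand-in for the script's TASK_DICT; keys only, values elided.
TASK_DICT = {"gsm8k": None, "mmlu": None, "arc": None}

before = f"Eval types supported: all,{TASK_DICT.keys()}"
after = f"Eval types supported: all,{list(TASK_DICT.keys())}"

print(before)  # Eval types supported: all,dict_keys(['gsm8k', 'mmlu', 'arc'])
print(after)   # Eval types supported: all,['gsm8k', 'mmlu', 'arc']
```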