llama.cpp/examples
Georgi Gerganov 0ca458d892 examples: implement flexible grader system for answer validation
- Add Grader class supporting regex and CLI-based grading
- Implement built-in regex patterns for AIME, GSM8K, MMLU, HellaSwag, ARC, WinoGrande
- Add CLI grader interface: python script.py --answer <pred> --expected <gold>
- Add HF telemetry disable to avoid warnings
- Support exact match requirement for regex patterns
- Add 30-second timeout for CLI grader
- Handle both boxed and plain text formats for AIME answers
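
The grading flow described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the names `PATTERNS`, `extract_answer`, `grade_regex`, and `grade_cli` are hypothetical, the regex patterns are stand-ins for the built-in ones, and treating exit code 0 from the CLI grader as "correct" is an assumption. Only the `--answer`/`--expected` arguments and the 30-second timeout come from the commit message.

```python
import re
import subprocess
import sys

# Illustrative patterns only; the patterns shipped in the commit are not shown here.
# For AIME, a \boxed{...} answer is tried first, then a plain 1-3 digit integer.
PATTERNS = {
    "aime": [r"\\boxed\{\s*(\d{1,3})\s*\}", r"\b(\d{1,3})\b"],
    "mmlu": [r"\b([A-D])\b"],
}


def extract_answer(text, patterns):
    """Return the last match of the first pattern that matches, or None."""
    for pattern in patterns:
        matches = re.findall(pattern, text)
        if matches:
            return matches[-1]
    return None


def grade_regex(pred, gold, task):
    """Extract an answer from pred and require an exact match with gold."""
    answer = extract_answer(pred, PATTERNS[task])
    return answer is not None and answer.strip() == gold.strip()


def grade_cli(script, pred, gold, timeout=30):
    """Run an external grader script with a timeout.

    Assumption: the script signals a correct answer with exit code 0.
    """
    try:
        result = subprocess.run(
            [sys.executable, script, "--answer", pred, "--expected", gold],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # A grader that hangs is counted as a failed grade.
        return False
    return result.returncode == 0
```

With these definitions, `grade_regex(r"Thus \boxed{204}", "204", "aime")` and `grade_regex("The final answer is 17", "17", "aime")` both return `True`, covering the boxed and plain text cases.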
2026-05-10 18:13:45 +03:00