llama.cpp/examples
Georgi Gerganov a2b96e0444 examples: add simplified llama-eval-new.py for AIME evaluation
- Create new simplified evaluation script focused only on AIME
- Implement EvalState and Processor dataclasses for structured state management
- Add real-time feedback showing correct/incorrect status per case
- Abstract grading interface for external grader support
- Use structured JSON output for eval state
- Apply HuggingFace dataset caching to avoid repeated downloads
- Remove Levenshtein matching; the eval script only sends requests and validates answers
2026-05-10 18:13:44 +03:00
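
The commit message mentions an EvalState dataclass, an abstract grading interface, per-case correct/incorrect feedback, and JSON output for the eval state. A minimal sketch of how those pieces might fit together is shown here. It is not the code from llama-eval-new.py: the field names, the `Grader` and `ExactMatchGrader` classes, and the `record` and `to_json` methods are all hypothetical and only illustrate the pattern.

```python
import json
from abc import ABC, abstractmethod
from dataclasses import asdict, dataclass, field


class Grader(ABC):
    """Hypothetical grading interface; an external grader would subclass this."""

    @abstractmethod
    def grade(self, expected: str, answer: str) -> bool:
        ...


class ExactMatchGrader(Grader):
    """Compares answers as integers when both parse, else as stripped strings."""

    def grade(self, expected: str, answer: str) -> bool:
        try:
            return int(expected) == int(answer)
        except ValueError:
            return expected.strip() == answer.strip()


@dataclass
class EvalState:
    """Assumed layout; the real EvalState in the script may differ."""

    total: int = 0
    correct: int = 0
    cases: list = field(default_factory=list)

    def record(self, case_id: str, expected: str, answer: str, ok: bool) -> None:
        # Update the counters and keep a per-case record for the JSON dump.
        self.total += 1
        self.correct += int(ok)
        self.cases.append(
            {"id": case_id, "expected": expected, "answer": answer, "correct": ok}
        )
        # Real-time feedback for the case that was just graded.
        status = "correct" if ok else "incorrect"
        print(f"[{self.total}] {case_id}: {status}")

    def to_json(self) -> str:
        # asdict() converts the dataclass (including the list of case dicts)
        # into plain Python containers that json.dumps can serialize.
        return json.dumps(asdict(self), indent=2)
```

In this sketch a caller would grade each model answer with `grader.grade(expected, answer)`, pass the result to `state.record(...)`, and write `state.to_json()` to disk at the end of the run. Keeping the grader behind an abstract class means the evaluation loop does not need to change if a different grader is swapped in.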