Files
Georgi Gerganov 7bed317f53 models : fix the attn_factor for mistral3 graphs + improve consistency (#17945)
* models : fix the attn_factor for mistral3 graphs

* cont : rework attn_factor correction logic

* cont : make deepseek2 consistent

* cont : add TODO

* cont : special-case DSv2

* cont : revert Mistral 3 Large changes

* cont : fix DS2 to use the original attn_factor

* cont : minor comments
2025-12-12 17:12:40 +02:00
..
2025-09-05 17:32:39 -06:00
2025-09-05 17:32:39 -06:00
2025-11-28 12:02:56 +01:00