Xuan-Son Nguyen
8f22dc0a53
model : add hunyuan moe (#14425)
* model : add hunyuan moe
* tokenizer ok
* fix tensor name
* cgraph init
* chat template
* wip
* almost working
* skip embed, fix bos
* cleanup
* yarn scaling
* cleanup
* correct rope type
* failed token fix
* ntk alpha freq_base
* tokenization working
* cleanup and pr changes
* vocab_size sanity check
* ntk alpha generic
* Update convert_hf_to_gguf.py
* Apply suggestions from code review
* fix regression
* fix style
---------
Co-authored-by: kooshi <1934337+kooshi@users.noreply.github.com>
2025-07-08 11:24:06 +03:00