Sigbjørn Skjæret
4f02d47339
model : refactor bias tensor variable names (#22079)
* refactor bias tensor variable names
* use create_tensor_qkv for jina-bert-v2
2026-04-18 20:12:00 +02:00
PikaPikachu
9db77a020c
model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)
* model : refactor QKV into common build_qkv and create_tensor_qkv helpers
* model : extend build_qkv to bert/mpt/dbrx/olmo/lfm2/nemotron-h/granite-hybrid/gemma3n-iswa/t5-dec and fix wqkv_s
2026-04-16 17:41:34 +02:00
Sigbjørn Skjæret
f772f6e434
model : support NVFP4 tensors for Gemma4 (#21971)
* support nvfp4 tensors for Gemma4
* add wo_s to build_attn
* add wo_s to build_attn
* fix glm4
2026-04-16 16:51:47 +02:00
Georgi Gerganov
9f102a1407
models : move the token embedding norms to the first layer (#20943)
* models : move the token embedding norms to the first layer
* cont : fix LLM_TENSOR_CONV1D + fix il indexing
2026-03-24 17:00:30 +02:00
Xuan-Son Nguyen
59db9a357d
llama: dynamic head_dim and n_rot for SWA (#20301)
* llama: dynamic head_dim and n_rot for SWA
* also add gguf_writer wrappers
* fix build
* build_rope_shift arg reorder
2026-03-09 22:22:39 +01:00
Sigbjørn Skjæret
35bee031e1
graph : remove redundant scale_w parameter (#20235)
2026-03-08 18:58:28 +01:00
o7si
d0a6a31470
model : add support for JinaBertModel with non-gated ffn (#18475)
* WIP: Initial commit for fixing JinaBert original FF type support
* convert: add jina-v2-de tokenizer variant for German_Semantic_V3
* convert: fix token collision in BERT phantom vocab conversion
* convert: add feed_forward_type metadata
* model: add feed_forward_type metadata for jina-bert-v2
* model: jina-bert-v2 support standard GELU FFN variant
* model: remove ffn_type, detect FFN variant from tensor dimensions
* Update src/llama-model.cpp
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update src/llama-model.cpp
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update src/models/bert.cpp
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update src/models/bert.cpp
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* revert collision fix to be handled in separate PR
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-01-01 18:38:51 +01:00
Piotr Wilkin (ilintar)
bea04522ff
refactor : llama-model.cpp (#16252)
* Squashed: llama-model.cpp refactoring
* Fix formatting of attn / ffn / ffn_moe calls
* Fix import regression / unify spacing in models.h
* totally DID NOT miss those!
* Add missing qwen3vl(moe) models
* Add missing new .cpp files to build
* Remove extra semicolons
* Editor checker
* Update src/models/models.h
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-10-31 23:40:23 +01:00