llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2026-05-15 13:34:06 +00:00

Author	SHA1	Message	Date
Katostrofik	9ed6e19b9d	SYCL: fix multi-GPU system RAM exhaustion by using Level Zero allocations (#21597 ) * SYCL: fix multi-GPU system RAM exhaustion by using Level Zero allocations Replace sycl::malloc_device with zeMemAllocDevice for GPU memory allocation in the SYCL backend. sycl::malloc_device triggers the xe kernel driver's DMA-buf/TTM path which mirrors every VRAM allocation 1:1 in system RAM. zeMemAllocDevice uses the SVM/P2P path with no host staging. On a dual Intel Arc Pro B70 system (64GB VRAM, 64GB RAM), a 15.6 GiB model consumed 60 GiB of system RAM via sycl::malloc_device, causing OOM crashes. With zeMemAllocDevice, the same workload uses ~6.7 GiB of system RAM with no performance regression. All Level Zero calls include automatic fallback to the original SYCL allocation path if Level Zero interop is unavailable. * SYCL: address review feedback - remove try/catch, check device types, deduplicate - Remove try/catch from malloc/free/memcpy helpers, check backend and device type upfront instead (ggml_sycl_is_level_zero, ggml_sycl_is_dgpu) - Move shared helpers (is_level_zero, is_dgpu, free_device) to common.cpp and declare in common.hpp to eliminate code duplication - Use SYCL_CHECK(CHECK_TRY_ERROR()) for fallback sycl::free calls - Guard dev2dev_memcpy L0 path to dGPU-to-dGPU only, preserving the host-staged path for iGPU-to-dGPU transfers - Add Windows Level Zero SDK path detection (LEVEL_ZERO_V1_SDK_PATH) in CMakeLists.txt (co-authored with @arthw) * SYCL: add build/runtime flags for Level Zero, address review feedback Implements the architecture suggested by @arthw: compile-time and runtime flags to cleanly separate Level Zero and SYCL memory API paths. - Add GGML_SYCL_SUPPORT_LEVEL_ZERO cmake option (default ON). All Level Zero code is wrapped in #ifdef so the build works on systems without the Level Zero SDK installed (e.g. CPU-only CI servers). Both the loader library and headers are checked before enabling. 
- Add GGML_SYCL_ENABLE_LEVEL_ZERO runtime env var (default 1). Controls whether Level Zero or SYCL memory APIs are used. Only one API style is used per session, no mixing. If Level Zero is enabled but the devices don't support the Level Zero backend, it auto-disables with a warning. - Remove Level Zero code from dpct_malloc. It was unused (dpct::device_memory is not called anywhere in the backend) and used try/catch for flow control. - Update SYCL.md with documentation for both new parameters. Tested on Intel Arc Pro B70 (32GB), single-GPU and dual-GPU, with both GGML_SYCL_SUPPORT_LEVEL_ZERO=ON and OFF builds. AI-assisted development (Claude). Code reviewed and tested on my hardware. * SYCL: unify Level Zero malloc/free call sites, address review feedback Move ggml_sycl_malloc_device to common.cpp alongside ggml_sycl_free_device. Both functions are now unconditionally available — Level Zero code is #ifdef'd inside the functions, not at call sites. All call sites use uniform SYCL_CHECK(CHECK_TRY_ERROR()) wrapping with no #ifdef blocks. Addresses arthw's review: wrap all malloc/free in SYCL_CHECK for stack traces on failure, eliminate duplicated #ifdef/else patterns at 6 call sites (-29 lines net). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * SYCL: add Level Zero SDK to CI, fix device check and missed alloc paths Add Level Zero SDK installation to Ubuntu and Windows SYCL CI jobs so the Level Zero code path is compiled and tested in CI. Fix two bugs found during extended dual-GPU testing (no ONEAPI_DEVICE_SELECTOR set): - The Level Zero backend check was iterating all SYCL devices including CPU. The OpenCL CPU device caused Level Zero to be disabled for the GPUs, defeating the fix on multi-GPU systems. Added is_gpu() filter so only GPU devices are checked. - sycl_ext_malloc_device/sycl_ext_free (tensor reorder temp buffers) were still calling sycl::malloc/sycl::free directly, bypassing the Level Zero path. 
Routed through ggml_sycl_malloc_device/free_device for consistency with the other device memory call sites. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * SYCL: address arthw review feedback on Level Zero memory API structure - Move ggml_sycl_malloc_device to static function in ggml-sycl.cpp; only ggml_sycl_free_device (used by common.cpp) stays in common.cpp - Switch both helpers to use g_ggml_sycl_enable_level_zero global instead of per-call queue backend checks - Remove #ifdef wrapper from global definition; always declare at 0, add #else branch in init block so it stays 0 when L0 not compiled in - Update init loop comment to explain GPU-only device check - CMakeLists: message(STATUS) before the if block; align option wording AI-assisted implementation. Reviewed and tested on dual Intel Arc Pro B70 (32 GB each): test-backend-ops OK on both GPUs, single/dual-GPU Q4_K_M and Q8_0 bench correct, zeMemAllocDevice GTT delta confirmed <5 MiB per 4 GiB allocation (vs ~4 GiB shadow with sycl::malloc_device). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * SYCL: remove unused cstdio/cstdlib includes from common.cpp Leftover from the deleted ggml_sycl_queue_supports_level_zero helper. Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * Apply suggestions from code review Co-authored-by: Neo Zhang <zhang.jianyu@outlook.com> * SYCL: preserve Level Zero allocation path during early malloc * ci: fix Level Zero package conflict in Intel Docker build * ci: find Level Zero loader in oneAPI package step * ci: allow Windows SYCL package without Level Zero DLL --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Neo Zhang <zhang.jianyu@outlook.com>	2026-05-14 13:39:14 +08:00
Xuan-Son Nguyen	3796c94bad	ci: validate model naming convention (#22680 ) * ci: validate model naming convention * bring back dedicated ec workflow * add missing jobs --------- Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>	2026-05-13 10:59:37 +02:00
Masashi Yoshimura	927dada6c9	ggml-webgpu: Enables running gpt-oss-20b (#22906 ) * Enable to run gpt-oss-20b and refactor mulmat-q * disable test-backend-ops in ubuntu-24-webgpu	2026-05-12 07:27:40 -07:00
Sigbjørn Skjæret	fa62042af9	ci : bump ty to 0.0.35 (#22961 )	2026-05-12 11:34:10 +02:00
Kevin Pouget	928b486b0c	ggml-virtgpu: Add a GHA build check (#22943 ) * [ggml-virtgpu] Add a GHA build check * Apply suggestions from code review Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> --------- Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>	2026-05-11 21:38:22 +08:00
Georgi Gerganov	a290ce6266	gguf-py : bump version to 0.19.0 (#22664 ) * gguf-py : bump version to 0.19.0 * bump poetry --------- Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>	2026-05-06 14:46:14 +02:00
Sigbjørn Skjæret	6118c043b1	ci : bump ty to 0.0.33 (#22535 ) * bump ty to 0.0.33 * update typings	2026-04-30 16:15:54 +03:00
Shreya Jain	a702f39597	CI Snapdragon: Switch ubuntu-latest to ubuntu-slim runner (#22303 ) * switch ubuntu-latest to ubuntu-slim * Fix the path for upload so CI doesn't fail * Update .github/workflows/build-and-test-snapdragon.yml Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * Use -slim image for key check and consistent naming for artifact dir Signed-off-by: Max Krasnyansky <maxk@qti.qualcomm.com> * Remove check-secret extra job * move QDC key check for Run QDC jobs step specifically * add a step before to check the secret for qdc jobs --------- Signed-off-by: Max Krasnyansky <maxk@qti.qualcomm.com> Co-authored-by: Max Krasnyansky <maxk@qti.qualcomm.com> Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>	2026-04-24 21:21:36 +02:00
Shreya Jain	187a456370	Enable testing on Snapdragon devices (#21051 ) * Add the tests that we want to run on external CI * remove extra files * Fixes python issues, remove the deadlock on CI * remove unnecessary changes * use override to ty.toml * fix pre-commit and try tests with secret in external repo not upstream * skip if key is unavailable * Fix feedback * switch hexagon to snapdragon * cleanup * fix secrets * remove the copyrights at the top of the files	2026-04-23 13:08:10 -07:00
Sigbjørn Skjæret	0949beb5a3	fix build number for sycl release (#22283 )	2026-04-23 21:38:58 +08:00
Neo Zhang Jianyu	4ead6fd957	[SYCL] Update oneapi 2025.3.3, Separate SYCL build, release Ubuntu 24 package. (#22078 ) * upgrade oneAPI to 2025.3.3 * update * separate SYCL CI and support release binary package for ubuntu 24 * add dependence * remove wrong copy lines * add missed line * remove other task to test the release for SYCL * rm more for test release * fix file name * correct the error in running * support build for fp32/fp16 * rm ubuntu-24-sycl-fp16 for duplicated * refactor build setting * update guide for ubuntu 24 release package, restore the release.yml for other backend * user docker replace to install oneAPI * use download installation package to replace docker * use wget to download and install oneapi, replace the apt cmd * enable ccache for oneAPI installation * fix format error * enable cache for oneAPI installation * update guide * Update .github/workflows/release.yml Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * Update .github/workflows/release.yml Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * Update .github/workflows/build-sycl.yml Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * Update .github/workflows/release.yml Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> --------- Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>	2026-04-23 08:21:36 +03:00
Zijun Yu	52f1096f21	openvino: driver setup, CI split, thread safety, and NPU optimizations (#21944 ) * Thread safety per request only * Fix ROPE yarn case * Fix sticky stateful config * Use i4/i8 directly for symmetric quant * Use weightless caching * Add WeightlessCacheAttribute to reduce NPU memory usage * Gelu tanh support (#125) * Imrope support (#126) * fix(openvino): explicit ov::Tensor frees in ggml_backend_openvino_free * add GPU,NPU support in OV Dockerfile * add build-openvino.yml ci * Fix sticky stateful config * add concurrency to ov-gpu ci runs. Move OV CI to build-openvino.yml * fix thread-safety of shared runtime context * rope type abstraction for frontend translations * fix editorconfig --------- Co-authored-by: Mustafa Cavus <mustafa.cavus@intel.com> Co-authored-by: Dan Hoffman <dhoff749@gmail.com> Co-authored-by: Ravi Panchumarthy <ravi.panchumarthy@intel.com>	2026-04-21 18:58:34 +03:00
Sigbjørn Skjæret	037bfe38d0	ci : install spirv-headers for vulkan-cross (#22109 )	2026-04-19 10:32:08 +03:00
Sigbjørn Skjæret	83d58e02fc	ci : free disk space for rocm release (#22012 )	2026-04-18 09:37:30 +02:00
Reese Levine	45cac7ca70	ggml-webgpu: fix compiler warnings and refactor FlashAttention encoding (#21052 ) * Update workflows to remove dependence on llvmpipe * Try setting Dawn_DIR * remove c++20 initializers * Move to proper guid * Try avoiding segfaults on vulkan backend process exit * Remove compiler warnings on parameter casting * Fix soft_max and update reg_tile accumulation to f32 for better precision * Refactor flash_attn a bit * remove c++20 initializers and format * Increase div precision for NVIDIA * revert div precision and comment out ggml-ci node for now * Formatting * Try debugging on a failing CI node * Revert "Try debugging on a failing CI node" This reverts commit `1971e33cba`.	2026-04-17 09:17:11 -07:00
Yuri Khrustalev	a279d0f0f4	ci : add android arm64 build and release (#21647 ) * server: respect the ignore eos flag * ci: add android arm64 build and release * patch * pin android-setup actions to v4 * Apply suggestions from code review Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * lf in the suggestion --------- Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>	2026-04-17 11:32:24 +02:00
Ludovic Henry	8612ed18b7	ci : Use ggml-org/ccache-action on RISC-V as well (#21632 )	2026-04-16 11:11:25 +03:00
Ruben Ortlam	8dc530b86d	ci: disable test-backend-ops on Vulkan llvmpipe run and restore default timeout (#21901 )	2026-04-15 10:55:21 +02:00
Jeff Bolz	1f30ac0cea	vulkan: Programmatically add RoundingModeRTE to all shaders when the device supports it (#21572 ) * vulkan: Programmatically add RoundingModeRTE to all shaders when the device supports it * use FetchContent to get SPIRV-Headers * Fetch spirv-headers unconditionally * remove fetchcontent, rely on installed headers * fix ubuntu job * Update docs/build.md	2026-04-14 15:17:45 +02:00
Georgi Gerganov	f4b5bf2f32	ci : re-enable mac workflows (#21894 ) * ci : re-enable mac workflows * vulkan : fix compile warning	2026-04-14 15:58:09 +03:00
Christian Kastner	a8bad3842e	ci: Also exempt 'security' tag from auto-close (#21844 )	2026-04-14 01:18:44 +08:00
Martin Klacer	5c4aae66e1	devops: kleidiai: provide KleidiAI-Enabled ARM Release Artifact (#21259 ) * Unified macOS release setup with strategy-matrix block * Added KleidiAI arm64 macOS release definition Change-Id: I05520889ffc646488a178d06817a17f29274465a Signed-off-by: Martin Klacer <martin.klacer@arm.com>	2026-04-08 13:06:12 +08:00
Ludovic Henry	761797ffdf	ci : use default RISE RISC-V Runners (#21263 )	2026-04-05 20:29:48 +02:00
M1DNYT3	c08d28d088	ci: lower cuda12 floor to 12.8.1 for broader host compatibility (#21438 ) Co-authored-by: M1DNYT3 <m1dnyt3@MacBookPro.lan>	2026-04-05 09:04:00 +08:00
Nicholas Sparks	661e9acb36	ci: fix vulkan workflow referencing non-existent action (#21442 )	2026-04-05 08:59:51 +08:00
Masato Nakasaka	e439700992	ci: Add Windows Vulkan backend testing on Intel (#21292 ) * experimenting CI * Experimenting CI fix for MinGW * experimenting CI on Windows * modified script for integration with VisualStudio * added proxy handling * adding python version for Windows execution * fix iterator::end() dereference * fixed proxy handling * Fix errors occurring on Windows * fixed ci script * Reverted to master * Stripping test items to simplify Windows test * adjusting script for windows testing * Changed shell * Fixed shell * Fixed shell * Fix CI setting * Fix CI setting * Fix CI setting * Experimenting ci fix * Experimenting ci fix * Experimenting ci fix * Experimenting ci fix * experimenting fix for unit test error * Changed to use BUILD_LOW_PERF to skip python tests * Fix CI * Added option to specify Ninja generator * Reverted proxy related changes	2026-04-03 20:16:44 +03:00
M1DNYT3	277ff5fff7	docker : bump cuda12 to 12.9.1 (#20920 ) Co-authored-by: M1DNYT3 <m1dnyt3@MacBookPro.lan> Co-authored-by: CISC <CISC@users.noreply.github.com>	2026-04-03 15:06:45 +02:00
uvos	43a4ee4a2c	HIP: build each ci build test for a different architecture (#21337 ) This helps improve our chances of finding build failures before the release workflow builds for all architectures.	2026-04-03 11:38:22 +02:00
Slobodan Josic	7c7d6ce5c7	[HIP] Bump ROCm version to 7.2.1 (#21066 ) Bump ROCm version on Linux from 7.2 to 7.2.1 Add gfx1102 target Delete LLVM workaround since ROCm 7.2.1 has fix for ROCm 7.2 perf regression https://github.com/ROCm/rocm-systems/issues/2865 --------- Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>	2026-04-03 00:59:20 +02:00
Nikhil Jain	5a0ed5150a	Update Dawn version in WebGPU CI (#20784 ) * Pin Dawn version * Update docs with new Dawn commit hash	2026-04-01 09:53:05 -07:00
Seungmin Kim	eec6f85d7b	CI: Enable CPU and Vulkan ARM64 Release (#21207 )	2026-03-31 19:02:56 +08:00
Seungmin Kim	84ae8434d0	CI : Enable CUDA and Vulkan ARM64 runners and fix CI/CD (#21122 ) * CI: Enable CUDA and Vulkan ARM64 runners and fix CI/CD Co-authored-by: Ts-sound <44093942+Ts-sound@users.noreply.github.com> * Obtain source tag name from git tag Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> --------- Co-authored-by: Ts-sound <44093942+Ts-sound@users.noreply.github.com> Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>	2026-03-30 20:24:37 +02:00
Sigbjørn Skjæret	e2eb39e81c	ci : bump ty to 0.0.26 (#21156 ) * fix incorrect type ignore comments * bump ty to 0.0.26	2026-03-30 09:29:15 +02:00
Ts-sound	bf934f28db	docker : fix and enable ARM64 image build (#20929 ) * CI: fix ARM64 image build error & enable compilation * Update .github/workflows/docker.yml Co-authored-by: Aaron Teo <taronaeo@gmail.com> * CI: revert ggml/src/ggml-cpu/CMakeLists.txt * Update .github/workflows/docker.yml Co-authored-by: Aaron Teo <taronaeo@gmail.com> * CI: update runs-on to ubuntu24.04, and update ARM64 build image ( ubuntu_version: "24.04") * CI: change cpu.Dockerfile gcc to 14; * CI : cpu.Dockerfile , update pip install . * Update .github/workflows/docker.yml Co-authored-by: Aaron Teo <taronaeo@gmail.com> --------- Co-authored-by: Aaron Teo <taronaeo@gmail.com>	2026-03-28 01:45:09 +01:00
KokerZhou	6861f6509a	CANN: update docker images to 8.5.0 and improve CANN.md (#20801 ) * cann: update docker images to 8.5.0 - bump CANN base image from 8.3.rc2 to 8.5.0 - bump ASCEND_VERSION from 8.1.RC1.alpha001 to 8.5.0 Move to newer stable releases. * cann: update CANN.md * Update CANN.md to include BF16 support Added BF16 support information to the CANN documentation and corrected formatting for the installation instructions. * Fix formatting issues in CANN.md Fix 234: Trailing whitespace	2026-03-27 08:53:00 +08:00
Xuan-Son Nguyen	8c60b8a2be	ci: pin external actions to exact commit SHA (#21033 )	2026-03-26 20:44:00 +01:00
uvos	ec54ac13a8	ci : fix parsing of vgpr counts in hip-quality-check (#20987 ) * scripts: hip: gcn-cdna-vgpr-check: fix parsing of vgpr counts when an amdclang Remark block is interleaved with another from a different process * Return warning ignore * obey pep8 inline double space before inline comments * add # noqa: NP100 for other prints too * Add script changes to cause autotrigger	2026-03-25 19:00:37 +01:00
Shreya Jain	345de3cd87	Use docker in build-android.yml (#20928 ) * use docker instead of SDK separately * fix whitespaces * Update .github/workflows/build-android.yml Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> --------- Co-authored-by: Max Krasnyansky <maxk@qti.qualcomm.com> Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>	2026-03-25 09:36:27 -07:00
Masato Nakasaka	b2704f9028	ci: Allow ninja to be used during unit test (#20742 ) * Remove make dependency * Added option to specify Ninja generator * use ninja-build as default for several CI * Revert "use ninja-build as default for several CI" This reverts commit `f552c4559b`. * changed use plain string rather than arrays * Enabled ninja build by default for experimentation * ci: add run.sh to test conditions to trigger GitHub CI and self-hosted runners Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * Enabled ninja build by default on self-hosted envs for experimentation * ci: revert generator to ninja instead of ninja multi-config Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * ci: install ninja-build for self-hosted workflows Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * ci: revert ninja from self-hosted runners Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * ci: missed one self-hosted step Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * ci: fix windows ci errors from an erroneous revert Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * Added explicit build types for Ninja Also reverted some needless change * ci: use ninja multi-config for vulkan-x64 build Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> * added time command to measure build time * Keeping some configs to use Ninja which show improvement * minor fix based on review Co-authored-by: Aaron Teo <taronaeo@gmail.com> * ci: rm `time` from custom containers Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> --------- Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> Co-authored-by: Aaron Teo <aaron.teo1@ibm.com> Co-authored-by: Aaron Teo <taronaeo@gmail.com>	2026-03-25 21:00:49 +08:00
Georgi Gerganov	3fab96cd04	ci : disable self-hosted mac jobs (#20985 )	2026-03-25 14:46:40 +02:00
Sigbjørn Skjæret	403c9c9cef	ci : bump gguf publish python version (#20982 )	2026-03-25 11:04:59 +02:00
Sigbjørn Skjæret	8fc85db9d2	ci : limit requirements versions (#20980 ) * set requests version * limit versions outside requirements	2026-03-25 10:55:37 +02:00
Georgi Gerganov	e32d243849	ai : update gh permissions (#20895 )	2026-03-23 13:21:41 +02:00
Sigbjørn Skjæret	29b28a9824	ci : switch from pyright to ty (#20826 ) * type fixes * switch to ty * tweak rules * tweak more rules * more tweaks * final tweak * use common import-not-found rule	2026-03-21 08:54:34 +01:00
Georgi Gerganov	4cb7e0bd61	ai : limit runtime of the agent (#20816 )	2026-03-20 20:31:25 +02:00
Georgi Gerganov	b31b30f31d	ai : do not run bash commands in the prompt (#20810 )	2026-03-20 19:06:33 +02:00
Georgi Gerganov	464fd0e71f	ai : update find-related action (#20790 ) * ai : update "related issues" prompt * cont * cont * cont	2026-03-20 10:28:14 +02:00
Georgi Gerganov	6c72646a61	ci : improve action for duplicate issue (#20772 ) * ci : show thinking traces of the agent * cont : increase thinking * cont : remove agent files * cont : move the model selection to the provider	2026-03-19 21:11:53 +02:00
Georgi Gerganov	900efd531d	ci : clarify gh command for viewing issues (#20766 )	2026-03-19 18:43:54 +02:00
uvos	b49d8b8757	ci : add hip quality check (#20430 ) * CI: add hip quality check * Update scripts/hip/gcn-cdna-vgpr-check.py Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * Update .github/workflows/hip-quality-check.yml Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * Update .github/workflows/hip-quality-check.yml Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * Update .github/workflows/hip-quality-check.yml Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * Update scripts/hip/gcn-cdna-vgpr-check.py Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * Update scripts/hip/gcn-cdna-vgpr-check.py Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * Update scripts/hip/gcn-cdna-vgpr-check.py Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * Update scripts/hip/gcn-cdna-vgpr-check.py Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * Revert "Update .github/workflows/hip-quality-check.yml" This reverts commit efa0bfcdb01dfac0feee674987a0482d50f46145. * scripts: gcn-cdna-vgpr-check.py: enforce int type for total_vgprs * scripts: gcn-cdna-vgpr-check.py: add flash attention instances to ignore list * Bump ccache version * Add missing separators to list --------- Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>	2026-03-19 17:05:44 +01:00

497 Commits