Skip to content

CI: add hip quality check#20430

Merged
CISC merged 14 commits intoggml-org:masterfrom
IMbackK:hipcheck
Mar 19, 2026
Merged

CI: add hip quality check#20430
CISC merged 14 commits intoggml-org:masterfrom
IMbackK:hipcheck

Conversation

@IMbackK
Copy link
Copy Markdown
Collaborator

@IMbackK IMbackK commented Mar 11, 2026

Add "hip-check" ci workflow, this workflow is intended to solve the following problems for us:

  1. The hip backend has had repeated problems with people adding #pragma unroll loops that cant be unrolled see ssm-conv, softmax and CUDA: use shared mem for ssm_conv #20128 (comment)
    • this workflow builds the hip backend with Werror to make it obvious when these occure
    • i did not make the hip builds Werror generally as we can not guarentee that there will be no warnings when built against every rocm version we support
  2. Currently we want to support rocm >=6.1 so build for 6.1 in the build workflow to check that no functions from newer rocm sneak in, however we build the release against the latest version of rocm, this causes the issue that we dont notice if a pr dosent build against the latest version until the release fails
    • This workflow builds against the same version of rocm as the release to check for this
  3. GCN/CDNA have a rather small 64 KiB vector register file and are wave 64, which means that only 256 registers are available. Quite often kernels get added that include significant register spill.
    • This pr adds a small script that is used in the workflow to check for kernels that spill significant registers
    • The script ignores kernels with significant spills currently in the code base (altho several of these should be investigated at some point)

The idea of this workflow that its failure NOT necessarily block a pr, but that its should prompt investigation / possible adding to the workflows witelist.

@github-actions github-actions Bot added script Script related python python script changes devops improvements to build systems and github actions ggml changes relating to the ggml tensor library for machine learning labels Mar 11, 2026
@IMbackK
Copy link
Copy Markdown
Collaborator Author

IMbackK commented Mar 11, 2026

theres a run here https://github.com/IMbackK/llama.cpp/actions/runs/22975551980/job/66703103556 to show how this works (the failure in the VGPR check is on purpose for illustration)

Comment thread .github/workflows/hip-quality-check.yml Outdated
Comment thread .github/workflows/hip-quality-check.yml Outdated
Comment thread .github/workflows/hip-quality-check.yml
Comment thread scripts/hip/gcn-cdna-vgpr-check.py
Comment thread scripts/hip/gcn-cdna-vgpr-check.py
Comment thread scripts/hip/gcn-cdna-vgpr-check.py Outdated
Comment thread scripts/hip/gcn-cdna-vgpr-check.py Outdated
Comment thread scripts/hip/gcn-cdna-vgpr-check.py Outdated
@IMbackK
Copy link
Copy Markdown
Collaborator Author

IMbackK commented Mar 11, 2026

@CISC the hip compiler will ignore #pragma clang diagnostic push when --save-temps is active. For this reason a build with -DGGML_HIP_EXPORT_METRICS=On and -Werror can not succeed as warning will be emmited here:

#pragma clang diagnostic push
among other places.

@CISC
Copy link
Copy Markdown
Member

CISC commented Mar 11, 2026

Better split the jobs again then.

@IMbackK
Copy link
Copy Markdown
Collaborator Author

IMbackK commented Mar 11, 2026

Right

@CISC
Copy link
Copy Markdown
Member

CISC commented Mar 12, 2026

Check CI failures.

@IMbackK
Copy link
Copy Markdown
Collaborator Author

IMbackK commented Mar 17, 2026

this pr should be in principal ready now, but i think this needs some buy in from the maintainers of the cuda backend this also affects.

@JohannesGaessler @am17an @slaren thoughts on principal?

@IMbackK IMbackK marked this pull request as ready for review March 17, 2026 18:32
@IMbackK IMbackK requested a review from a team as a code owner March 17, 2026 18:32
@CISC
Copy link
Copy Markdown
Member

CISC commented Mar 17, 2026

Comment thread .github/workflows/hip-quality-check.yml Outdated
Comment thread scripts/hip/gcn-cdna-vgpr-check.py
Comment thread scripts/hip/gcn-cdna-vgpr-check.py
@IMbackK
Copy link
Copy Markdown
Collaborator Author

IMbackK commented Mar 17, 2026

Guess we got one: https://github.com/ggml-org/llama.cpp/actions/runs/23210660849/job/67458238374?pr=20430#step:6:155

yeah i opened a pr for that: #20696

IMbackK and others added 11 commits March 18, 2026 19:53
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
@CISC CISC requested a review from a team March 19, 2026 10:47
'_ZL18flash_attn_ext_f16ILi128ELi128ELi32ELi1ELb1ELb0EEvPKcS1_S1_S1_S1_PKiPfP15HIP_vector_typeIfLj2EEffffjfiS5_IjLj3EEiiiiiiiiiiiliiliiiiil',
'_ZL18flash_attn_ext_f16ILi128ELi128ELi32ELi2ELb1ELb0EEvPKcS1_S1_S1_S1_PKiPfP15HIP_vector_typeIfLj2EEffffjfiS5_IjLj3EEiiiiiiiiiiiliiliiiiil',
'_ZL18flash_attn_ext_f16ILi128ELi128ELi4ELi8ELb1ELb0EEvPKcS1_S1_S1_S1_PKiPfP15HIP_vector_typeIfLj2EEffffjfiS5_IjLj3EEiiiiiiiiiiiliiliiiiil',
'_ZL18flash_attn_ext_f16ILi96ELi96ELi4ELi8ELb0ELb0EEvPKcS1_S1_S1_S1_PKiPfP15HIP_vector_typeIfLj2EEffffjfiS5_IjLj3EEiiiiiiiiiiiliiliiiiil',
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we store the unmangled names? Are these not compiler dependent

Copy link
Copy Markdown
Collaborator Author

@IMbackK IMbackK Mar 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Name mangling is in theory compiler dependent, however on linux all compilers follow the itanium abi spec for mangling so this is not an issue for this workflow. Additionally, HIP code can only be compiled by clang anyhow and this script depends on a specific feature of the amdclang fork (the metrics output).

The compiler dose not output demangled names in its hip metrics output so the only way to have unmangled names here would be for the script to unmangle the names itself in accordance to itanium spec im not sure this is worth it.

Copy link
Copy Markdown
Contributor

@JohannesGaessler JohannesGaessler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on static code analysis this seems correct to me.

@CISC CISC merged commit b49d8b8 into ggml-org:master Mar 19, 2026
51 checks passed
Ethan-a2 pushed a commit to Ethan-a2/llama.cpp that referenced this pull request Mar 20, 2026
* CI: add hip quality check

* Update scripts/hip/gcn-cdna-vgpr-check.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update .github/workflows/hip-quality-check.yml

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update .github/workflows/hip-quality-check.yml

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update .github/workflows/hip-quality-check.yml

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update scripts/hip/gcn-cdna-vgpr-check.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update scripts/hip/gcn-cdna-vgpr-check.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update scripts/hip/gcn-cdna-vgpr-check.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update scripts/hip/gcn-cdna-vgpr-check.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Revert "Update .github/workflows/hip-quality-check.yml"

This reverts commit efa0bfc.

* scripts: gcn-cdna-vgpr-check.py: enforce int type for total_vgprs

* scripts: gcn-cdna-vgpr-check.py: add flash attention instances to ignore list

* Bump ccache version

* Add mssing seperators to list

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
* CI: add hip quality check

* Update scripts/hip/gcn-cdna-vgpr-check.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update .github/workflows/hip-quality-check.yml

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update .github/workflows/hip-quality-check.yml

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update .github/workflows/hip-quality-check.yml

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update scripts/hip/gcn-cdna-vgpr-check.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update scripts/hip/gcn-cdna-vgpr-check.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update scripts/hip/gcn-cdna-vgpr-check.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update scripts/hip/gcn-cdna-vgpr-check.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Revert "Update .github/workflows/hip-quality-check.yml"

This reverts commit efa0bfc.

* scripts: gcn-cdna-vgpr-check.py: enforce int type for total_vgprs

* scripts: gcn-cdna-vgpr-check.py: add flash attention instances to ignore list

* Bump ccache version

* Add mssing seperators to list

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 1, 2026
* CI: add hip quality check

* Update scripts/hip/gcn-cdna-vgpr-check.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update .github/workflows/hip-quality-check.yml

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update .github/workflows/hip-quality-check.yml

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update .github/workflows/hip-quality-check.yml

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update scripts/hip/gcn-cdna-vgpr-check.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update scripts/hip/gcn-cdna-vgpr-check.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update scripts/hip/gcn-cdna-vgpr-check.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update scripts/hip/gcn-cdna-vgpr-check.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Revert "Update .github/workflows/hip-quality-check.yml"

This reverts commit efa0bfc.

* scripts: gcn-cdna-vgpr-check.py: enforce int type for total_vgprs

* scripts: gcn-cdna-vgpr-check.py: add flash attention instances to ignore list

* Bump ccache version

* Add mssing seperators to list

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

devops improvements to build systems and github actions ggml changes relating to the ggml tensor library for machine learning python python script changes script Script related

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants