
Improve evaluate-pr-tests workflow: pull_request_target, access gating, security docs #34678

Open
PureWeen wants to merge 22 commits into main from fix/evaluate-tests-fork-support

@PureWeen PureWeen commented Mar 26, 2026

Description

Overhauls the copilot-evaluate-tests gh-aw workflow for better security, fork support, and Copilot bot compatibility.

Changes

  1. Switch from pull_request to pull_request_target — runs workflow YAML from base branch (trusted), eliminates "Approve & Run" friction for collaborators
  2. Add copilot[bot] to bots: allowlist — Copilot-authored PRs auto-trigger evaluation without needing write collaborator status
  3. Add dry-run mode (suppress_comment input) — evaluate without posting comments, useful for testing
  4. Add noop tool guidance — agent calls noop when no action is needed instead of silently exiting
  5. Harden Checkout-GhAwPr.ps1 — null guard on $PrInfo, allow fork PRs from write-access authors
  6. Update security documentation — accurate credential model, defense layers table, workflow author rules

Trigger Behavior

| Trigger | When it fires | Who can trigger |
|---|---|---|
| pull_request_target | PR opened/updated/reopened touching src/**/tests/** | Auto for write-access authors + copilot[bot]; blocked for external contributors via pre_activation role check |
| issue_comment | /evaluate-tests comment on a PR | Write-access collaborators (admin/maintain/write) |
| workflow_dispatch | Manual trigger from Actions tab | Write-access collaborators; fork PRs allowed if author has write access |

Access Matrix

| Permission Level | pull_request_target | /evaluate-tests comment | workflow_dispatch |
|---|---|---|---|
| admin / maintain / write | ✅ Auto-runs | ✅ Can trigger | ✅ Can trigger |
| copilot[bot] | ✅ Auto-runs (via bots: allowlist) | N/A | N/A |
| triage / read | pre_activation blocks | pre_activation blocks | ❌ Script skips (exit 0) |
| external / fork (no write) | pre_activation blocks | pre_activation blocks | ❌ Script skips (exit 0) |
| fork (with write access) | ✅ Auto-runs | ✅ Can trigger | ✅ Script proceeds |
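
The pull_request_target column of the matrix can be read as a small decision function. The sketch below is only a model of that column: `can_auto_run` is a hypothetical helper, not code from this PR, and the real check is performed by the gh-aw pre_activation job. The bot login is the one listed in the table.

```shell
#!/bin/sh
# Hypothetical model of the pull_request_target column of the access matrix.
# $1 = actor login, $2 = repository permission of the PR author.
can_auto_run() {
  actor="$1"
  permission="$2"

  # An actor on the bots: allowlist passes without a collaborator role.
  if [ "$actor" = "copilot[bot]" ]; then
    return 0
  fi

  case "$permission" in
    admin|maintain|write) return 0 ;;  # write-level roles auto-run
    *)                    return 1 ;;  # triage, read, none: blocked
  esac
}

if can_auto_run "alice" "write"; then echo "alice (write): auto-runs"; fi
if ! can_auto_run "bob" "read"; then echo "bob (read): blocked"; fi
```

Note that a fork PR from an author with write access takes the same path as any other write-level author, which matches the last row of the matrix.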

Security Model

Based on GitHub Security Lab guidance:

  • PR contents treated as passive data (read/analyze, never built or executed) ✅
  • Agent job has read-only permissions (contents: read, issues: read, pull-requests: read) ✅
  • Write operations in separate safe_outputs job (not the agent) ✅
  • permissions: {} at workflow level — no ambient write access ✅
  • Agent blast radius limited to max: 1 comment via safe-outputs ✅
  • Checkout-GhAwPr.ps1 restores .github/skills/ and .github/instructions/ from base branch for workflow_dispatch

Security Docs Update

Updated .github/instructions/gh-aw-workflows.instructions.md with:

  • Accurate credential model (COPILOT_TOKEN present in agent env via --env-all, defended by firewall + redaction + threat detection)
  • Defense layers table documenting what each layer does and does not protect against
  • Explicit DO/DON'T rules for workflow authors
  • Key principles from GitHub Security Lab's pwn-request guidance

Known Limitations

  • ready_for_review event type not supported by gh-aw compiler for pull_request_target — draft→ready transitions without a new push won't trigger the workflow
  • checkout_pr_branch.cjs overwrites .github/skills/ for pull_request_target/issue_comment triggers — accepted residual risk (agent sandboxed, output limited)

- Change trigger from pull_request to pull_request_target so fork PRs
  have access to secrets (COPILOT_GITHUB_TOKEN)
- Add roles: all to allow fork contributors (who have read permission)
  to trigger the workflow
- Remove forks: ["*"] (not needed with pull_request_target)
- Remove ready_for_review type (not supported by gh-aw for
  pull_request_target)
- Update if condition and gate step to reference pull_request_target

Validated on PureWeen/maui:
- Same-repo PR: all green (run 23603776593)
- Fork PR via workflow_dispatch: all green (run 23605610535)
- Fork PR via pull_request_target: all green (run 23606033617)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 26, 2026 16:46
github-actions bot commented Mar 26, 2026

🚀 Dogfood this PR with:

⚠️ WARNING: Do not do this without first carefully reviewing the code of this PR to satisfy yourself it is safe.

curl -fsSL https://github.com/dotnet/maui/main/eng/scripts/get-maui-pr.sh | bash -s -- 34678

Or

  • Run remotely in PowerShell:
iex "& { $(irm https://github.com/dotnet/maui/main/eng/scripts/get-maui-pr.ps1) } 34678"

Contributor

Copilot AI left a comment


Pull request overview

Updates the gh-aw Evaluate PR Tests workflow triggers so fork PRs can be evaluated (by switching from pull_request to pull_request_target) while keeping the workflow’s gating/conditions aligned with the new event.

Changes:

  • Switched workflow trigger from pull_request to pull_request_target and updated related if: conditions.
  • Updated the gate step condition to run under pull_request_target.
  • Updated the compiled .lock.yml to reflect the new trigger configuration.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
|---|---|
| .github/workflows/copilot-evaluate-tests.md | Moves evaluation to pull_request_target and updates workflow conditions/gate accordingly. |
| .github/workflows/copilot-evaluate-tests.lock.yml | Regenerated compiled workflow reflecting the trigger/condition changes. |

Comment on lines +32 to 41
pull_request_target:
  paths:
  - src/**/tests/**
  - src/**/test/**
  types:
  - opened
  - synchronize
  - reopened
  - ready_for_review
  # roles: all # Roles processed as role check in pre-activation job
workflow_dispatch:
Copilot AI Mar 26, 2026


The compiled lock file shows roles: all commented out under the on: block, which indicates the roles setting was not applied. This also coincides with the removal of the pre_activation/membership check job, so the workflow no longer gates who can run it. Fix by moving roles: to the correct top-level location in the .md frontmatter and re-compiling so the lock file includes the intended role check behavior.

needs: pre_activation
if: >
(needs.pre_activation.outputs.activated == 'true') && ((github.event_name == 'pull_request' && github.event.pull_request.draft == false) || github.event_name == 'workflow_dispatch' || (github.event_name == 'issue_comment' &&
(github.event_name == 'pull_request_target' && github.event.pull_request.draft == false) || github.event_name == 'workflow_dispatch' || (github.event_name == 'issue_comment' &&
Copilot AI Mar 26, 2026


jobs.activation no longer depends on a role/membership gate (needs: pre_activation and the needs.pre_activation.outputs.activated check are gone). With pull_request_target, this means the workflow can run with secrets for any matching PR/comment, which is a significant security/cost exposure. After fixing the roles placement in the .md, ensure the compiled lock restores the gating (or add an explicit guard) before allowing activation to proceed.

Suggested change
(github.event_name == 'pull_request_target' && github.event.pull_request.draft == false) || github.event_name == 'workflow_dispatch' || (github.event_name == 'issue_comment' &&
(github.event_name == 'pull_request_target' && github.event.pull_request.draft == false && github.event.pull_request.head.repo.fork == false) || github.event_name == 'workflow_dispatch' || (github.event_name == 'issue_comment' &&

@PureWeen PureWeen marked this pull request as draft March 26, 2026 17:07
github-actions bot and others added 5 commits March 26, 2026 12:59
The workflow_dispatch step runs with GITHUB_TOKEN and checks out PR code.
Restrict it to only process PRs from authors with write/maintain/admin
access, preventing checkout of untrusted fork code in a privileged context.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move the PR author permission check from inline workflow bash into the
shared Checkout-GhAwPr.ps1 script. Any gh-aw workflow using this script
now automatically gates on the PR author having write/maintain/admin
access before checking out code.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Fork PRs are handled by pull_request_target (platform checkout in
sandboxed container). The workflow_dispatch path should only process
same-repo PRs from authors with write access.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Restoring only skills/, instructions/, and copilot-instructions.md left
other .github/ subdirs (pr-review/, scripts/, workflows/) from the PR
branch. Restore the entire .github/ directory for complete coverage.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Instead of deleting .github/ and restoring from main, merge the base
branch into the PR branch after checkout. This produces the same state
as a pull_request merge commit: PR changes + latest main. If the PR
modifies a skill, the PR version wins; otherwise main's version is used.

This lets contributors iterate on skills via workflow_dispatch while
keeping everything else current. On merge conflict, falls back to the
PR branch as-is with a warning.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen PureWeen marked this pull request as ready for review March 26, 2026 20:29
@dotnet dotnet deleted a comment from github-actions bot Mar 26, 2026
github-actions bot and others added 3 commits March 27, 2026 09:26
- pull_request_target: only auto-runs for OWNER/MEMBER/COLLABORATOR
- issue_comment: /evaluate-tests only accepted from OWNER/MEMBER/COLLABORATOR
- workflow_dispatch: unchanged
- External PRs require maintainer /evaluate-tests comment to trigger

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Revert merge strategy to targeted git checkout (works in shallow clones)
- Remove roles:all, restore gh-aw pre_activation with write-level checks
- Remove author_association from if: (gh-aw handles access gating)
- Update fork fallback message to remove stale workflow_dispatch advice

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add suppress_comment input for workflow_dispatch dry-run (evaluate without posting comment)
- Add explicit noop guidance so the agent uses it instead of silently exiting
- Update posting results section to respect dry-run mode

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen PureWeen changed the title Switch evaluate-pr-tests to pull_request_target for fork PR support Improve evaluate-pr-tests workflow: fork support, access gating, dry-run Mar 30, 2026
kubaflo previously approved these changes Mar 30, 2026
@PureWeen PureWeen marked this pull request as draft March 30, 2026 23:15
@PureWeen PureWeen marked this pull request as ready for review April 2, 2026 13:58
Prevents silent fork check bypass when gh returns empty/malformed
JSON — $null.isFork evaluates to $false in PowerShell, which would
let the fork check pass incorrectly.
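
The same fail-open pattern exists outside PowerShell. As a rough illustration (a shell analog, not the code in Checkout-GhAwPr.ps1; both function names are hypothetical), an empty value compared against "true" is simply "not true", so a missing field reads as "not a fork" unless it is rejected explicitly:

```shell
#!/bin/sh
# Illustrative shell analog of the fail-open bug (the real script is PowerShell).
# $1 = value of the isFork field; may be empty when the lookup failed or
# returned malformed JSON.

# Fail-open: an empty value is not "true", so it is treated as "not a fork".
is_fork_unguarded() {
  [ "$1" = "true" ]
}

# Fail-closed: refuse to answer when the value is missing or unexpected.
# Returns 0 = fork, 1 = not a fork, 2 = unknown (caller should abort).
is_fork_guarded() {
  case "$1" in
    true)  return 0 ;;
    false) return 1 ;;
    *)     echo "PR info missing or malformed; aborting" >&2
           return 2 ;;
  esac
}
```

The guarded variant gives the caller a third outcome to act on, which is the role the null guard on $PrInfo plays in the script.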

Note: ready_for_review cannot be added to pull_request_target types
yet — gh-aw compiler doesn't include it in the allowed type list.
Filed as a known gap.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
github-actions bot and others added 3 commits April 2, 2026 15:37
- Fix inaccurate claim that agent has 'no ability to access secrets'
  — COPILOT_TOKEN is present via --env-all, defended by firewall +
  redaction + threat detection
- Add Security Boundaries section with principles from GitHub
  Security Lab's pwn-request guidance
- Add defense layers table documenting what each layer does/doesn't do
- Add explicit rules for workflow authors (DO/DON'T)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot-authored PRs are created by copilot[bot] which doesn't have
write collaborator access. The bots: allowlist lets the pre_activation
membership check pass for this known bot actor.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move permission check before fork check — fork PRs from collaborators
with write access should be checked out and evaluated. Only block PRs
from authors without write access (exit 0, not exit 1 — it's a skip,
not an error).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen PureWeen changed the title Improve evaluate-pr-tests workflow: fork support, access gating, dry-run Improve evaluate-pr-tests workflow: pull_request_target, access gating, security docs Apr 2, 2026
If there is nothing to evaluate (PR has no test files, PR is a docs-only change, etc.), you **must** call the `noop` tool with a message explaining why:

```json
{"noop": {"message": "No action needed: [brief explanation, e.g. 'PR contains no test files']"}}
```

Member

You might want to configure to not generate a no-op run report issue (within the frontmatter).

https://github.github.com/gh-aw/patterns/monitoring/#no-op-run-reports

Comment on lines +55 to +65
### Key Principles (from [GitHub Security Lab](https://securitylab.github.com/resources/github-actions-preventing-pwn-requests/))

1. **Never execute untrusted PR code with elevated credentials.** The classic "pwn-request" attack is `pull_request_target` + checkout PR + run build scripts with `GITHUB_TOKEN`. The attack surface includes build scripts (`make`, `build.ps1`), package manager hooks (`npm postinstall`, MSBuild targets), and test runners.

2. **Treating PR contents as passive data is safe.** Reading, analyzing, or diffing PR code is fine — the danger is *executing* it. Our gh-aw workflows read code for evaluation; they never build or run it.

3. **`pull_request_target` grants write permissions and secrets access.** This is by design — the workflow YAML comes from the base branch (trusted). But any step that checks out and runs fork code in this context creates a vulnerability.

4. **`pull_request` from forks has no secrets access.** GitHub withholds secrets because the workflow YAML comes from the fork (untrusted). This is the safe default for CI builds on fork PRs.

5. **The `workflow_run` pattern separates privilege from code execution.** Build in an unprivileged `pull_request` job → pass artifacts → process in a privileged `workflow_run` job. This is architecturally what gh-aw does: agent runs read-only, `safe_outputs` job has write permissions.
Member

💯

Comment on lines +4 to +5
pull_request_target:
types: [opened, synchronize, reopened]
Member

I do think this will still lead to the 'Approve and run workflows' button showing up for PRs from untrusted forks. We need to solidify the guidance we give for when to hit that button. I really wish that button navigated into a list of workflows needing approval for the PR with boxes to select which to approve.

Comment on lines +17 to +18
suppress_comment:
description: 'Dry-run — evaluate but do not post a comment on the PR'
Member

Future-proofing: I suggest renaming to suppress_output in case the output changes (to a PR review for example).

Suggested change
suppress_comment:
description: 'Dry-run evaluate but do not post a comment on the PR'
suppress_output:
description: 'Dry-run - evaluate but do not post output on the PR'

type: boolean
default: false
bots:
- "copilot[bot]"
Member

I am not certain what identity ends up getting used here; I experimented and I think it's one of these but not sure which. 😄

  • copilot
  • copilot[bot]
  • app/copilot-swe-agent
  • copilot-swe-agent
  • copilot-swe-agent[bot]

Comment on lines 25 to 29
if: >-
(github.event_name == 'pull_request' && github.event.pull_request.draft == false) ||
(github.event_name == 'pull_request_target' && github.event.pull_request.draft == false) ||
github.event_name == 'workflow_dispatch' ||
(github.event_name == 'issue_comment' &&
github.event.issue.pull_request &&
Member

I always guard against forks as well, preventing the workflow from running on forks except for the workflow_dispatch event. Otherwise, PRs within a fork will result in failing workflow runs (vs. starting the workflow and skipping all jobs).

Simple case that needs adapting to your scenario: if: (!github.event.repository.fork) || github.event_name == 'workflow_dispatch'.

Comment on lines +127 to +129
When triggered via `workflow_dispatch` with `suppress_comment` = `${{ inputs.suppress_comment }}`:
- If **true**, perform the full evaluation but **do not** post a comment on the PR. Write the evaluation to the workflow log only. This is useful for testing the skill without spamming the PR.
- If **false** (default), post the comment as normal.
Member

These expressions get replaced on their way to the model so this would end up embedding the true or false value into the opening statement. I think you need something more like this (but I recommend validating my understanding here). Note I also reflected my suggested input rename from above.

Suggested change
When triggered via `workflow_dispatch` with `suppress_comment` = `${{ inputs.suppress_comment }}`:
- If **true**, perform the full evaluation but **do not** post a comment on the PR. Write the evaluation to the workflow log only. This is useful for testing the skill without spamming the PR.
- If **false** (default), post the comment as normal.
When triggered via `workflow_dispatch`, the `suppress_output` input controls behavior.
- If `${{ inputs.suppress_output }}` == **true**, perform the full evaluation but **do not** post a comment on the PR. Write the evaluation to the workflow log only. This is useful for testing the skill without spamming the PR.
- If `${{ inputs.suppress_output }}` == **false** (default), post the comment as normal.
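
To make the reviewer's point concrete, the sketch below simulates the substitution with `sed`. It is an assumption-laden illustration: `render_prompt` is a hypothetical helper, and the actual replacement is done by the workflow tooling, whose exact behavior the reviewer recommends validating.

```shell
#!/bin/sh
# Simulates (does not reproduce) how an expression in the prompt is replaced
# with its value before the model sees the text.
# render_prompt is a hypothetical helper: $1 = template line, $2 = input value.
render_prompt() {
  printf '%s\n' "$1" | sed 's/[$]{{ inputs[.]suppress_output }}/'"$2"'/g'
}

# The model receives the literal value, not the expression.
render_prompt 'If `${{ inputs.suppress_output }}` == **true**, do not post a comment.' true
```

Because the model only ever sees the substituted value, the prompt has to phrase each branch as a comparison against that value, as the suggested wording does.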

## Posting Results

Call `add_comment` with `item_number` set to the PR number. Wrap the report in a collapsible `<details>` block:
If dry-run mode is active (`suppress_comment` is true), log the evaluation report to stdout and stop — do **not** call `add_comment`.
Member

Suggested change
If dry-run mode is active (`suppress_comment` is true), log the evaluation report to stdout and stop — do **not** call `add_comment`.
If dry-run mode is active (`suppress_output` is true), log the evaluation report to stdout and stop — do **not** call `add_comment`.

github-actions bot and others added 3 commits April 7, 2026 15:31
…ports, dry-run wording

- Rename suppress_comment to suppress_output (future-proofs output type changes)
- Disable no-op run report issues (report-as-issue: false)
- Improve dry-run mode wording to clarify expression replacement behavior

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The correct identity for Copilot-authored PRs is copilot-swe-agent[bot]
(882 commits in this repo), not copilot[bot] (0 commits).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
gh pr diff fails with HTTP 406 for PRs with 300+ changed files.
Fall back to paginated REST API (pulls/files) when diff is too large.

Fixes the failure seen on PR #34617 (inflight candidate with 300+ files).
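
A minimal sketch of this kind of fallback, assuming the gh CLI is installed and authenticated. `list_changed_files` is a hypothetical helper written for illustration; the gate step in the PR may be structured differently.

```shell
#!/bin/sh
# Sketch of the fallback, assuming the gh CLI is installed and authenticated.
# list_changed_files is a hypothetical helper: $1 = PR number, $2 = owner/repo.
list_changed_files() {
  pr="$1"
  repo="$2"

  # Fast path: ask gh for the names of the changed files.
  if files=$(gh pr diff "$pr" --repo "$repo" --name-only 2>/dev/null); then
    printf '%s\n' "$files"
    return 0
  fi

  # Fallback for very large PRs: page through the REST "pulls/files" endpoint.
  gh api --paginate "repos/$repo/pulls/$pr/files" --jq '.[].filename'
}
```

A call such as `list_changed_files 34617 dotnet/maui` would then print one file path per line regardless of which path produced it, so the test-file check downstream does not need to know about the fallback.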

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen commented Apr 8, 2026

Addressing Jeff's Review Feedback

Thanks for the thorough review @jeffhandley! Here's what we applied:

✅ Applied

  1. Renamed suppress_comment → suppress_output — Future-proofs if output changes to a PR review or other format. (commit f3dc6a9)

  2. Disabled no-op run report issues — Added report-as-issue: false under noop: in safe-outputs. Note: the docs suggest this goes under safe-outputs: noop:, not as a top-level property. (commit f3dc6a9)

  3. Improved dry-run mode wording — Clarified how ${{ inputs.suppress_output }} expression replacement works in the agent prompt. (commit f3dc6a9)

  4. Fixed bot identity — Changed bots: from copilot[bot] to copilot-swe-agent[bot]. Git log shows 882 commits from copilot-swe-agent[bot] and 0 from copilot[bot] in this repo. (commit 7bcc980)

🐛 Additional fix discovered

  1. Fixed gate step for large PRs (300+ files): gh pr diff fails with HTTP 406 for PRs with 300+ changed files (e.g., PR #34617, "March 30th, Inflight Candidate"). Now falls back to the paginated pulls/files REST API. (commit 04bf387)

📝 Acknowledged (no change needed)

  1. "Approve and run" concern — You're right this shows up for fork contributors with pull_request_target. However, dotnet/maui already has two pull_request_target workflows (dogfood-comment.yml and bump-global-json.yml) — both self-gate to avoid fork triggers. Our workflow is different in that we want it to run on fork PRs, which is why the approval button appears. Fork contributors can alternatively use /evaluate-tests comment or maintainers can use workflow_dispatch to bypass this. We understand your broader direction toward issue_comment-only for org-wide patterns and will track that.

All changes compiled clean with gh aw compile. Running validation now via workflow_dispatch against PR #34882 (small, with tests) and PR #34617 (300+ files, gate fallback test).

Adds hide-older-comments: true to add-comment safe-output.
Previous evaluation comments are automatically minimized (collapsed
as 'outdated') when a new evaluation is posted, reducing PR noise.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen PureWeen requested review from jeffhandley and kubaflo April 9, 2026 17:38
kubaflo previously approved these changes Apr 9, 2026
github.event_name == 'workflow_dispatch' ||
(github.event_name == 'issue_comment' &&
github.event.issue.pull_request &&
startsWith(github.event.comment.body, '/evaluate-tests'))
Member

Today I learned about the slash_command trigger at Command Triggers | GitHub Agentic Workflows. I suggest trying that out here. This could then change the if condition to use needs.activation.outputs.slash_command instead of the comment body.

Member Author

Great catch -- implemented in e7fdf22. Replaced issue_comment + startsWith with slash_command: evaluate-tests (scoped to events: [pull_request_comment, issue_comment]). Simplified the if: condition since the platform handles command matching now. Also added an anti-patterns table to the gh-aw instructions file so we don't miss built-in features like this in the future.

- Replace manual issue_comment + startsWith with slash_command: trigger
  (auto emoji reactions, sanitized input, eliminates skipped runs)
- Add 'Before You Build' anti-patterns table to gh-aw instructions
  listing 13 manual patterns that have built-in gh-aw equivalents
- Simplify if: condition (platform handles command matching)

Addresses review feedback from @jeffhandley (slash_command suggestion).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add labels: ['pr-review', 'testing'] for gh aw status filtering
- Update Fork PR Behavior table in instructions to document that
  slash_command compiles to issue_comment with platform-managed matching

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen commented Apr 13, 2026

🔍 Multi-Model Code Review — PR #34678

Reviewed by: 3 independent reviewers
Files: 4 changed (workflow source, compiled lock, PowerShell script, instruction docs)
CI Status: ⚠️ maui-pr skipping (expected — no runtime code changes), license/cla ✅, dogfood-comment ✅

Prior Reviews: jeffhandley approved. Jeff's feedback (rename suppress_comment → suppress_output, fix bot identity, disable noop report issues) was addressed per PureWeen's comment.


Findings

🟡 MODERATE — exit 0 on permission denial causes evaluation of wrong code

File: .github/scripts/Checkout-GhAwPr.ps1 (~line 67)
Flagged by: All 3 reviewers (1 initial + 2 confirmed in adversarial round)

if ($Permission -notin $AllowedRoles) {
    Write-Host "⏭️ PR author '...' has '$Permission' access..."
    exit 0   # ← step succeeds, workflow continues
}

When workflow_dispatch targets a PR whose author lacks write access, the script exits 0 (success) without checking out the PR. The workflow continues: the agent runs against the base branch code, but inputs.pr_number is still in context, so it posts an incorrect evaluation comment on the actual PR.

Why it matters: A maintainer triggering workflow_dispatch for a community PR gets a plausible-looking but wrong evaluation posted publicly. No security risk, but a data-integrity bug.

Fix: Change exit 0 to exit 1 so the workflow fails visibly when checkout is skipped.
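
The difference between the two exit codes can be modelled in a few lines of shell. This is a sketch under stated assumptions: both functions are hypothetical stand-ins for workflow steps, and `&&` stands in for the default GitHub Actions behavior of running a step only when the previous ones succeeded.

```shell
#!/bin/sh
# Hypothetical stand-ins for two consecutive workflow steps.
# checkout_step: $1 = author permission, $2 = exit code to use on denial.
checkout_step() {
  case "$1" in
    admin|maintain|write) echo "checked out PR branch"; return 0 ;;
    *) echo "author lacks write access; PR not checked out"; return "$2" ;;
  esac
}

agent_step() {
  echo "agent evaluates whatever is in the working tree"
}

# A later step runs only if the earlier one succeeded, modelled here with &&.
run_workflow() {
  checkout_step "$1" "$2" && agent_step
}

run_workflow read 0   # denial reported as success: the agent still runs
run_workflow read 1 || echo "workflow failed visibly; agent never ran"
```

With exit 0 the agent step runs against a tree that never received the PR branch, which is the wrong-evaluation scenario described in this finding.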


🟡 MODERATE — github.event.repository.fork guard is a no-op

File: .github/workflows/copilot-evaluate-tests.md (~line 28) and compiled into pre_activation/activation conditions in .lock.yml
Flagged by: 2/3 reviewers

if: >-
  ((!github.event.repository.fork) || github.event_name == 'workflow_dispatch') && ...

github.event.repository refers to the base repo (dotnet/maui), not the PR source. For dotnet/maui, github.event.repository.fork is always false, making !github.event.repository.fork always true — this clause is a no-op.

Why it matters: The guard looks like it's checking whether the PR comes from a fork, but it isn't. Future contributors may either remove a guard they think is redundant or add duplicates. The real fork safety comes from the sandboxed execution model and pre_activation role checks.

Suggestion: Either remove the misleading guard or add a comment clarifying it only matters if someone installs the workflow in a fork of dotnet/maui.


🟢 MINOR — Compiler-generated if: conditions reference dead pull_request event

File: .github/workflows/copilot-evaluate-tests.lock.yml (reaction step ~line 122, status-comment step ~line 170)
Flagged by: All 3 reviewers

if: ... || (github.event_name == 'pull_request') && (...)

The trigger changed from pull_request to pull_request_target, but github.event_name for the new trigger is 'pull_request_target'. These conditions will never match for PR events, so the 👀 reaction and "🔬 Evaluating tests…" status comment will not fire for pull_request_target triggers (only for issue_comment/slash-command).

Why it matters: PR authors get no immediate feedback that evaluation has started. This appears to be a gh-aw compiler limitation — it doesn't yet map pull_request_target to these auto-generated conditions. Consider filing upstream.


🟢 MINOR — isFork fetched but never used as a gate; docs contradict code

File: .github/scripts/Checkout-GhAwPr.ps1 (~line 46, ~line 72)
Flagged by: 2/3 reviewers

The script fetches isCrossRepository into $PrInfo.isFork and logs it, but never uses it to gate checkout. The .DESCRIPTION docstring says "Fork PRs are evaluated via pull_request_target instead" — but the code will check out fork PRs from write-access authors via workflow_dispatch.

Why it matters: Dead code + misleading docs = maintenance confusion. Either enforce the fork check or update the docs to accurately describe behavior ("Fork PRs from write-access authors are checked out here; external contributors are handled via pull_request_target").


🟢 MINOR — Community PRs have limited evaluation paths

File: .github/scripts/Checkout-GhAwPr.ps1 (permission gate)
Flagged by: 2/3 reviewers (1 disagreed, noting /evaluate-tests slash command still works)

The workflow_dispatch path now blocks PRs from authors without write access. Combined with pre_activation role checks on pull_request_target, the auto-trigger path may not work for community PRs.

Mitigated: A maintainer can still trigger evaluation via the /evaluate-tests slash command. But this is a behavior change vs. the previous unconditional workflow_dispatch path — worth documenting.


Discarded Findings (1/3 agreement, adversarial rejected)

| Finding | Initial | Adversarial | Verdict |
|---|---|---|---|
| hide-older-comments: true suppresses evaluations | 1/3 | 0/2 agreed | Discarded — intentional noise reduction |
| pre_activation triggers on all issue comments | 1/3 | N/A (expected with slash_command:) | Discarded |

Known Limitations (documented in PR)

  • ready_for_review event not supported by gh-aw compiler for pull_request_target — draft→ready transitions without a new push won't trigger. Documented in PR description ✅

Non-Issues Verified ✅

  • No pwn-request vulnerability: pull_request_target migration is architecturally sound. User steps never execute fork code; agent runs in sandboxed container with scrubbed credentials.
  • $PrInfo null guard: Complete and correct — catches empty output and null author.
  • PR_NUMBER injection: Input typed as number — GitHub enforces numeric, no injection risk.
  • copilot-swe-agent[bot] allowlist: Correctly wired via GH_AW_ALLOWED_BOTS env var.
  • COPILOT_TOKEN exposure: Accurately documented with defense layers and limitations.
  • noop report-as-issue disabled: Correctly set to false in both source and compiled output.

Test Coverage

This PR modifies workflow infrastructure (YAML, PowerShell, Markdown). There are no automated tests for gh-aw workflows — validation is done via manual workflow_dispatch runs. PureWeen's comment indicates validation was run against PR #34882 (small) and PR #34617 (300+ files, gate fallback test). This is appropriate for this type of change.


🔄 Re-Review (after commit d142b43)

Commit: d142b430 — "Fix gate step to run for all triggers, bump timeout to 20min"

Changes in latest commit (verified from diff):

  • ✅ Gate step if: condition removed — gate now runs for all triggers (pull_request_target, issue_comment, workflow_dispatch)
  • ✅ Gate PR_NUMBER broadened to ${{ github.event.pull_request.number || github.event.issue.number || inputs.pr_number }}
  • ✅ Timeout bumped from 15 → 20 minutes
  • Gather-TestContext.ps1 instruction updated with -PrNumber <number>

Previous Finding Status

| # | Finding | Status | Notes |
|---|---|---|---|
| 1 | exit 0 on permission denial | STILL PRESENT | One-line fix needed: exit 0 → exit 1 |
| 2 | repository.fork guard no-op | STILL PRESENT | Cosmetic/dead code |
| 3 | Dead pull_request in compiler conditions | STILL PRESENT | gh-aw compiler limitation |
| 4 | isFork not enforced / docs mismatch | STILL PRESENT | Low risk |
| 5 | Community PR evaluation paths | MITIGATED | /evaluate-tests slash command + pull_request_target auto-trigger |

All 3 reviewers confirmed Finding 1 remains the only actionable blocker.

New Observations from d142b43

The gate step improvement is well-designed:

  • PR_NUMBER resolution chain (pull_request.number || issue.number || inputs.pr_number) correctly handles all three trigger types. inputs.pr_number is required: true for workflow_dispatch, so null risk is minimal.
  • Gate for /evaluate-tests on non-test PRs: If someone comments /evaluate-tests on a PR with no test files, the gate will exit 1 with a clear message. This is correct behavior — it prevents wasting 20 minutes of agent compute on a PR that has nothing to evaluate.
  • Timeout bump to 20 min: Reasonable for complex PRs with many test files.
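
The resolution chain amounts to "take the first non-empty source". A shell analog of that expression, written only as an illustration (`resolve_pr_number` is a hypothetical helper; the workflow itself uses the `||` expression shown above):

```shell
#!/bin/sh
# Shell analog of the `a || b || c` expression: print the first non-empty
# argument. resolve_pr_number is a hypothetical helper, not part of the PR.
# $1 = pull_request.number, $2 = issue.number, $3 = inputs.pr_number
resolve_pr_number() {
  for candidate in "$@"; do
    if [ -n "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  echo "no PR number available for this event" >&2
  return 1
}

resolve_pr_number "34678" "" ""   # pull_request_target
resolve_pr_number "" "34678" ""   # issue_comment / slash command
resolve_pr_number "" "" "34678"   # workflow_dispatch
```

Each trigger populates exactly one of the three sources, so the order of the chain only matters if an event ever carries more than one.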

No new issues introduced by this commit. ✅

Recommendation

⚠️ Request changes — Finding 1 (exit 0 → exit 1 in Checkout-GhAwPr.ps1 line 67) is the only remaining blocker. All other findings are informational (compiler limitations, dead code, documentation improvements) that can be addressed in follow-up.

- Remove if: restriction on gate step — now runs for all events
  (pull_request_target, workflow_dispatch, slash_command)
- Unify PR_NUMBER from all event sources in gate step
- Bump timeout-minutes from 15 to 20 for complex evaluations
- Pass -PrNumber to Gather-TestContext.ps1 in prompt

Fixes: agent evaluated wrong files for workflow_dispatch on
no-test PRs because gate was skipped, causing 15min timeout.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
github-actions bot and others added 2 commits April 13, 2026 19:05
…it code

Gate step: exit 0 instead of exit 1 when no test files found.
Sets HAS_TEST_FILES=false env var for the agent to noop quickly.
PRs without tests now show clean ✅ in GitHub checks instead of ❌.

Checkout script: exit 1 instead of exit 0 on permission denial.
Prevents evaluating wrong code when author lacks write access.
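
A rough sketch of a gate that records its result instead of failing the step. The helper names are hypothetical, the changed file names are assumed to arrive on stdin, and the `case` patterns only approximate the workflow's src/**/tests/** and src/**/test/** path filters.

```shell
#!/bin/sh
# Sketch of a gate that records its result instead of failing the step.
# Assumptions: changed file names arrive on stdin, one per line, and the
# case patterns only approximate the workflow's path filters.
has_test_files() {
  while IFS= read -r path; do
    case "$path" in
      src/*/tests/*|src/*/test/*) return 0 ;;
    esac
  done
  return 1
}

# gate: $1 = file that receives NAME=value lines (GITHUB_ENV in Actions).
gate() {
  if has_test_files; then
    echo "HAS_TEST_FILES=true" >> "$1"
  else
    echo "HAS_TEST_FILES=false" >> "$1"
  fi
  return 0   # always succeed so the check shows green
}
```

In a workflow step the call would be along the lines of `... | gate "$GITHUB_ENV"`, which makes HAS_TEST_FILES visible to later steps in the same job.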

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…n check

- Add HAS_TEST_FILES env var check to Checkout step condition
  so workflow_dispatch skips checkout when gate found no tests
- Prevents Checkout-GhAwPr.ps1 from failing on bot-authored PRs
  (app/copilot-swe-agent can't be looked up via collaborator API)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
5 participants