Improve evaluate-pr-tests workflow: pull_request_target, access gating, security docs#34678
Improve evaluate-pr-tests workflow: pull_request_target, access gating, security docs#34678
Conversation
- Change trigger from pull_request to pull_request_target so fork PRs have access to secrets (COPILOT_GITHUB_TOKEN) - Add roles: all to allow fork contributors (who have read permission) to trigger the workflow - Remove forks: ["*"] (not needed with pull_request_target) - Remove ready_for_review type (not supported by gh-aw for pull_request_target) - Update if condition and gate step to reference pull_request_target Validated on PureWeen/maui: - Same-repo PR: all green (run 23603776593) - Fork PR via workflow_dispatch: all green (run 23605610535) - Fork PR via pull_request_target: all green (run 23606033617) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
🚀 Dogfood this PR with:
curl -fsSL https://github.com/dotnet/maui/main/eng/scripts/get-maui-pr.sh | bash -s -- 34678Or
iex "& { $(irm https://github.com/dotnet/maui/main/eng/scripts/get-maui-pr.ps1) } 34678" |
There was a problem hiding this comment.
Pull request overview
Updates the gh-aw Evaluate PR Tests workflow triggers so fork PRs can be evaluated (by switching from pull_request to pull_request_target) while keeping the workflow’s gating/conditions aligned with the new event.
Changes:
- Switched workflow trigger from
pull_requesttopull_request_targetand updated relatedif:conditions. - Updated the gate step condition to run under
pull_request_target. - Updated the compiled
.lock.ymlto reflect the new trigger configuration.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
.github/workflows/copilot-evaluate-tests.md |
Moves evaluation to pull_request_target and updates workflow conditions/gate accordingly. |
.github/workflows/copilot-evaluate-tests.lock.yml |
Regenerated compiled workflow reflecting the trigger/condition changes. |
| pull_request_target: | ||
| paths: | ||
| - src/**/tests/** | ||
| - src/**/test/** | ||
| types: | ||
| - opened | ||
| - synchronize | ||
| - reopened | ||
| - ready_for_review | ||
| # roles: all # Roles processed as role check in pre-activation job | ||
| workflow_dispatch: |
There was a problem hiding this comment.
The compiled lock file shows roles: all commented out under the on: block, which indicates the roles setting was not applied. This also coincides with the removal of the pre_activation/membership check job, so the workflow no longer gates who can run it. Fix by moving roles: to the correct top-level location in the .md frontmatter and re-compiling so the lock file includes the intended role check behavior.
| needs: pre_activation | ||
| if: > | ||
| (needs.pre_activation.outputs.activated == 'true') && ((github.event_name == 'pull_request' && github.event.pull_request.draft == false) || github.event_name == 'workflow_dispatch' || (github.event_name == 'issue_comment' && | ||
| (github.event_name == 'pull_request_target' && github.event.pull_request.draft == false) || github.event_name == 'workflow_dispatch' || (github.event_name == 'issue_comment' && |
There was a problem hiding this comment.
jobs.activation no longer depends on a role/membership gate (needs: pre_activation and the needs.pre_activation.outputs.activated check are gone). With pull_request_target, this means the workflow can run with secrets for any matching PR/comment, which is a significant security/cost exposure. After fixing the roles placement in the .md, ensure the compiled lock restores the gating (or add an explicit guard) before allowing activation to proceed.
| (github.event_name == 'pull_request_target' && github.event.pull_request.draft == false) || github.event_name == 'workflow_dispatch' || (github.event_name == 'issue_comment' && | |
| (github.event_name == 'pull_request_target' && github.event.pull_request.draft == false && github.event.pull_request.head.repo.fork == false) || github.event_name == 'workflow_dispatch' || (github.event_name == 'issue_comment' && |
The workflow_dispatch step runs with GITHUB_TOKEN and checks out PR code. Restrict it to only process PRs from authors with write/maintain/admin access, preventing checkout of untrusted fork code in a privileged context. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move the PR author permission check from inline workflow bash into the shared Checkout-GhAwPr.ps1 script. Any gh-aw workflow using this script now automatically gates on the PR author having write/maintain/admin access before checking out code. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Fork PRs are handled by pull_request_target (platform checkout in sandboxed container). The workflow_dispatch path should only process same-repo PRs from authors with write access. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Restoring only skills/, instructions/, and copilot-instructions.md left other .github/ subdirs (pr-review/, scripts/, workflows/) from the PR branch. Restore the entire .github/ directory for complete coverage. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Instead of deleting .github/ and restoring from main, merge the base branch into the PR branch after checkout. This produces the same state as a pull_request merge commit: PR changes + latest main. If the PR modifies a skill, the PR version wins; otherwise main's version is used. This lets contributors iterate on skills via workflow_dispatch while keeping everything else current. On merge conflict, falls back to the PR branch as-is with a warning. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- pull_request_target: only auto-runs for OWNER/MEMBER/COLLABORATOR - issue_comment: /evaluate-tests only accepted from OWNER/MEMBER/COLLABORATOR - workflow_dispatch: unchanged - External PRs require maintainer /evaluate-tests comment to trigger Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Revert merge strategy to targeted git checkout (works in shallow clones) - Remove roles:all, restore gh-aw pre_activation with write-level checks - Remove author_association from if: (gh-aw handles access gating) - Update fork fallback message to remove stale workflow_dispatch advice Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add suppress_comment input for workflow_dispatch dry-run (evaluate without posting comment) - Add explicit noop guidance so the agent uses it instead of silently exiting - Update posting results section to respect dry-run mode Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Prevents silent fork check bypass when gh returns empty/malformed JSON — $null.isFork evaluates to $false in PowerShell, which would let the fork check pass incorrectly. Note: ready_for_review cannot be added to pull_request_target types yet — gh-aw compiler doesn't include it in the allowed type list. Filed as a known gap. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Fix inaccurate claim that agent has 'no ability to access secrets' — COPILOT_TOKEN is present via --env-all, defended by firewall + redaction + threat detection - Add Security Boundaries section with principles from GitHub Security Lab's pwn-request guidance - Add defense layers table documenting what each layer does/doesn't do - Add explicit rules for workflow authors (DO/DON'T) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot-authored PRs are created by copilot[bot] which doesn't have write collaborator access. The bots: allowlist lets the pre_activation membership check pass for this known bot actor. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move permission check before fork check — fork PRs from collaborators with write access should be checked out and evaluated. Only block PRs from authors without write access (exit 0, not exit 1 — it's a skip, not an error). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
| If there is nothing to evaluate (PR has no test files, PR is a docs-only change, etc.), you **must** call the `noop` tool with a message explaining why: | ||
|
|
||
| ```json | ||
| {"noop": {"message": "No action needed: [brief explanation, e.g. 'PR contains no test files']"}} |
There was a problem hiding this comment.
You might want to configure to not generate a no-op run report issue (within the frontmatter).
https://github.github.com/gh-aw/patterns/monitoring/#no-op-run-reports
| ### Key Principles (from [GitHub Security Lab](https://securitylab.github.com/resources/github-actions-preventing-pwn-requests/)) | ||
|
|
||
| 1. **Never execute untrusted PR code with elevated credentials.** The classic "pwn-request" attack is `pull_request_target` + checkout PR + run build scripts with `GITHUB_TOKEN`. The attack surface includes build scripts (`make`, `build.ps1`), package manager hooks (`npm postinstall`, MSBuild targets), and test runners. | ||
|
|
||
| 2. **Treating PR contents as passive data is safe.** Reading, analyzing, or diffing PR code is fine — the danger is *executing* it. Our gh-aw workflows read code for evaluation; they never build or run it. | ||
|
|
||
| 3. **`pull_request_target` grants write permissions and secrets access.** This is by design — the workflow YAML comes from the base branch (trusted). But any step that checks out and runs fork code in this context creates a vulnerability. | ||
|
|
||
| 4. **`pull_request` from forks has no secrets access.** GitHub withholds secrets because the workflow YAML comes from the fork (untrusted). This is the safe default for CI builds on fork PRs. | ||
|
|
||
| 5. **The `workflow_run` pattern separates privilege from code execution.** Build in an unprivileged `pull_request` job → pass artifacts → process in a privileged `workflow_run` job. This is architecturally what gh-aw does: agent runs read-only, `safe_outputs` job has write permissions. |
| pull_request_target: | ||
| types: [opened, synchronize, reopened] |
There was a problem hiding this comment.
I do think this will still lead to the 'Approve and run workflows' button showing up for PRs from untrusted forks. We need to solidify the guidance we give for when to hit that button. I really wish that button navigated into a list of workflows needing approval for the PR with boxes to select which to approve.
| suppress_comment: | ||
| description: 'Dry-run — evaluate but do not post a comment on the PR' |
There was a problem hiding this comment.
Future-proofing: I suggest renaming to suppress_output in case the output changes (to a PR review for example).
| suppress_comment: | |
| description: 'Dry-run — evaluate but do not post a comment on the PR' | |
| suppress_output: | |
| description: 'Dry-run - evaluate but do not post output on the PR' |
| type: boolean | ||
| default: false | ||
| bots: | ||
| - "copilot[bot]" |
There was a problem hiding this comment.
I am not certain what identity ends up getting used here; I experimented and I think it's one of these but not sure which. 😄
- copilot
- copilot[bot]
- app/copilot-swe-agent
- copilot-swe-agent
- copilot-swe-agent[bot]
| if: >- | ||
| (github.event_name == 'pull_request' && github.event.pull_request.draft == false) || | ||
| (github.event_name == 'pull_request_target' && github.event.pull_request.draft == false) || | ||
| github.event_name == 'workflow_dispatch' || | ||
| (github.event_name == 'issue_comment' && | ||
| github.event.issue.pull_request && |
There was a problem hiding this comment.
I always guard against forks as well, preventing the workflow from running on forks except for the workflow_dispatch event. Otherwise, PRs within a fork will result in failing workflow runs (vs. starting the workflow and skipping all jobs).
Simple case that needs adapting to your scenario: if: (!github.event.repository.fork) || github.event_name == 'workflow_dispatch'.
| When triggered via `workflow_dispatch` with `suppress_comment` = `${{ inputs.suppress_comment }}`: | ||
| - If **true**, perform the full evaluation but **do not** post a comment on the PR. Write the evaluation to the workflow log only. This is useful for testing the skill without spamming the PR. | ||
| - If **false** (default), post the comment as normal. |
There was a problem hiding this comment.
These expressions get replaced on their way to the model so this would end up embedding the true or false value into the opening statement. I think you need something more like this (but I recommend validating my understanding here). Note I also reflected my suggested input rename from above.
| When triggered via `workflow_dispatch` with `suppress_comment` = `${{ inputs.suppress_comment }}`: | |
| - If **true**, perform the full evaluation but **do not** post a comment on the PR. Write the evaluation to the workflow log only. This is useful for testing the skill without spamming the PR. | |
| - If **false** (default), post the comment as normal. | |
| When triggered via `workflow_dispatch`, the `suppress_output` input controls behavior. | |
| - If `${{ inputs.suppress_output }}` == **true**, perform the full evaluation but **do not** post a comment on the PR. Write the evaluation to the workflow log only. This is useful for testing the skill without spamming the PR. | |
| - If `${{ inputs.suppress_output }}` == **false** (default), post the comment as normal. |
| ## Posting Results | ||
|
|
||
| Call `add_comment` with `item_number` set to the PR number. Wrap the report in a collapsible `<details>` block: | ||
| If dry-run mode is active (`suppress_comment` is true), log the evaluation report to stdout and stop — do **not** call `add_comment`. |
There was a problem hiding this comment.
| If dry-run mode is active (`suppress_comment` is true), log the evaluation report to stdout and stop — do **not** call `add_comment`. | |
| If dry-run mode is active (`suppress_output` is true), log the evaluation report to stdout and stop — do **not** call `add_comment`. |
…ports, dry-run wording - Rename suppress_comment to suppress_output (future-proofs output type changes) - Disable no-op run report issues (report-as-issue: false) - Improve dry-run mode wording to clarify expression replacement behavior Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The correct identity for Copilot-authored PRs is copilot-swe-agent[bot] (882 commits in this repo), not copilot[bot] (0 commits). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
gh pr diff fails with HTTP 406 for PRs with 300+ changed files. Fall back to paginated REST API (pulls/files) when diff is too large. Fixes the failure seen on PR #34617 (inflight candidate with 300+ files). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Addressing Jeff's Review FeedbackThanks for the thorough review @jeffhandley! Here's what we applied: ✅ Applied
🐛 Additional fix discovered
📝 Acknowledged (no change needed)
All changes compiled clean with |
Adds hide-older-comments: true to add-comment safe-output. Previous evaluation comments are automatically minimized (collapsed as 'outdated') when a new evaluation is posted, reducing PR noise. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
| github.event_name == 'workflow_dispatch' || | ||
| (github.event_name == 'issue_comment' && | ||
| github.event.issue.pull_request && | ||
| startsWith(github.event.comment.body, '/evaluate-tests')) |
There was a problem hiding this comment.
Today I learned about the slash_command trigger at Command Triggers | GitHub Agentic Workflows. I suggest trying that out here. This could then change the if condition to use needs.activation.outputs.slash_command instead of the comment body.
There was a problem hiding this comment.
Great catch -- implemented in e7fdf22. Replaced issue_comment + startsWith with slash_command: evaluate-tests (scoped to events: [pull_request_comment, issue_comment]). Simplified the if: condition since the platform handles command matching now. Also added an anti-patterns table to the gh-aw instructions file so we don't miss built-in features like this in the future.
- Replace manual issue_comment + startsWith with slash_command: trigger (auto emoji reactions, sanitized input, eliminates skipped runs) - Add 'Before You Build' anti-patterns table to gh-aw instructions listing 13 manual patterns that have built-in gh-aw equivalents - Simplify if: condition (platform handles command matching) Addresses review feedback from @jeffhandley (slash_command suggestion). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add labels: ['pr-review', 'testing'] for gh aw status filtering - Update Fork PR Behavior table in instructions to document that slash_command compiles to issue_comment with platform-managed matching Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🔍 Multi-Model Code Review — PR #34678Reviewed by: 3 independent reviewers Prior Reviews: jeffhandley approved. Jeff's feedback (rename Findings🟡 MODERATE —
|
| Finding | Initial | Adversarial | Verdict |
|---|---|---|---|
hide-older-comments: true suppresses evaluations |
1/3 | 0/2 agreed | Discarded — intentional noise reduction |
pre_activation triggers on all issue comments |
1/3 | N/A (expected with slash_command:) |
Discarded |
Known Limitations (documented in PR)
ready_for_reviewevent not supported by gh-aw compiler forpull_request_target— draft→ready transitions without a new push won't trigger. Documented in PR description ✅
Non-Issues Verified ✅
- No pwn-request vulnerability:
pull_request_targetmigration is architecturally sound. User steps never execute fork code; agent runs in sandboxed container with scrubbed credentials. $PrInfonull guard: Complete and correct — catches empty output and null author.PR_NUMBERinjection: Input typed asnumber— GitHub enforces numeric, no injection risk.copilot-swe-agent[bot]allowlist: Correctly wired viaGH_AW_ALLOWED_BOTSenv var.COPILOT_TOKENexposure: Accurately documented with defense layers and limitations.noopreport-as-issue disabled: Correctly set tofalsein both source and compiled output.
Test Coverage
This PR modifies workflow infrastructure (YAML, PowerShell, Markdown). There are no automated tests for gh-aw workflows — validation is done via manual workflow_dispatch runs. PureWeen's comment indicates validation was run against PR #34882 (small) and PR #34617 (300+ files, gate fallback test). This is appropriate for this type of change.
🔄 Re-Review (after commit d142b43)
Commit: d142b430 — "Fix gate step to run for all triggers, bump timeout to 20min"
Changes in latest commit (verified from diff):
- ✅ Gate step
if:condition removed — gate now runs for all triggers (pull_request_target, issue_comment, workflow_dispatch) - ✅ Gate
PR_NUMBERbroadened to${{ github.event.pull_request.number || github.event.issue.number || inputs.pr_number }} - ✅ Timeout bumped from 15 → 20 minutes
- ✅
Gather-TestContext.ps1instruction updated with-PrNumber <number>
Previous Finding Status
| # | Finding | Status | Notes |
|---|---|---|---|
| 1 | exit 0 on permission denial |
❌ STILL PRESENT | One-line fix needed: exit 0 → exit 1 |
| 2 | repository.fork guard no-op |
❌ STILL PRESENT | Cosmetic/dead code |
| 3 | Dead pull_request in compiler conditions |
❌ STILL PRESENT | gh-aw compiler limitation |
| 4 | isFork not enforced / docs mismatch |
❌ STILL PRESENT | Low risk |
| 5 | Community PR evaluation paths | ✅ MITIGATED | /evaluate-tests slash command + pull_request_target auto-trigger |
All 3 reviewers confirmed Finding 1 remains the only actionable blocker.
New Observations from d142b43
The gate step improvement is well-designed:
PR_NUMBERresolution chain (pull_request.number || issue.number || inputs.pr_number) correctly handles all three trigger types.inputs.pr_numberisrequired: trueforworkflow_dispatch, so null risk is minimal.- Gate for
/evaluate-testson non-test PRs: If someone comments/evaluate-testson a PR with no test files, the gate willexit 1with a clear message. This is correct behavior — it prevents wasting 20 minutes of agent compute on a PR that has nothing to evaluate. - Timeout bump to 20 min: Reasonable for complex PRs with many test files.
No new issues introduced by this commit. ✅
Recommendation
exit 0 → exit 1 in Checkout-GhAwPr.ps1 line 67) is the only remaining blocker. All other findings are informational (compiler limitations, dead code, documentation improvements) that can be addressed in follow-up.
- Remove if: restriction on gate step — now runs for all events (pull_request_target, workflow_dispatch, slash_command) - Unify PR_NUMBER from all event sources in gate step - Bump timeout-minutes from 15 to 20 for complex evaluations - Pass -PrNumber to Gather-TestContext.ps1 in prompt Fixes: agent evaluated wrong files for workflow_dispatch on no-test PRs because gate was skipped, causing 15min timeout. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…it code Gate step: exit 0 instead of exit 1 when no test files found. Sets HAS_TEST_FILES=false env var for the agent to noop quickly. PRs without tests now show clean ✅ in GitHub checks instead of ❌. Checkout script: exit 1 instead of exit 0 on permission denial. Prevents evaluating wrong code when author lacks write access. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…n check - Add HAS_TEST_FILES env var check to Checkout step condition so workflow_dispatch skips checkout when gate found no tests - Prevents Checkout-GhAwPr.ps1 from failing on bot-authored PRs (app/copilot-swe-agent can't be looked up via collaborator API) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Description
Overhauls the
copilot-evaluate-testsgh-aw workflow for better security, fork support, and Copilot bot compatibility.Changes
pull_requesttopull_request_target— runs workflow YAML from base branch (trusted), eliminates "Approve & Run" friction for collaboratorscopilot[bot]tobots:allowlist — Copilot-authored PRs auto-trigger evaluation without needing write collaborator statussuppress_commentinput) — evaluate without posting comments, useful for testingnooptool guidance — agent callsnoopwhen no action is needed instead of silently exitingCheckout-GhAwPr.ps1— null guard on$PrInfo, allow fork PRs from write-access authorsTrigger Behavior
pull_request_targetsrc/**/tests/**copilot[bot]; blocked for external contributors viapre_activationrole checkissue_comment/evaluate-testscomment on a PRworkflow_dispatchAccess Matrix
pull_request_target/evaluate-testscommentworkflow_dispatchbots:allowlist)pre_activationblockspre_activationblockspre_activationblockspre_activationblocksSecurity Model
Based on GitHub Security Lab guidance:
contents: read, issues: read, pull-requests: read) ✅safe_outputsjob (not the agent) ✅permissions: {}at workflow level — no ambient write access ✅max: 1comment via safe-outputs ✅Checkout-GhAwPr.ps1restores.github/skills/and.github/instructions/from base branch forworkflow_dispatch✅Security Docs Update
Updated
.github/instructions/gh-aw-workflows.instructions.mdwith:COPILOT_TOKENpresent in agent env via--env-all, defended by firewall + redaction + threat detection)Known Limitations
ready_for_reviewevent type not supported by gh-aw compiler forpull_request_target— draft→ready transitions without a new push won't trigger the workflowcheckout_pr_branch.cjsoverwrites.github/skills/forpull_request_target/issue_commenttriggers — accepted residual risk (agent sandboxed, output limited)