Skip to content

fix: improve expert review workflow with adversarial consensus fixes#35220

Closed
PureWeen wants to merge 13 commits intomainfrom
fix/improve-expert-review-workflow
Closed

fix: improve expert review workflow with adversarial consensus fixes#35220
PureWeen wants to merge 13 commits intomainfrom
fix/improve-expert-review-workflow

Conversation

@PureWeen
Copy link
Copy Markdown
Member

@PureWeen PureWeen commented Apr 29, 2026

Note

Are you waiting for the changes in this PR to be merged?
It would be very helpful if you could test the resulting artifacts from this PR and let us know in a comment if this change resolves your issue. Thank you!

Description

Ports validated improvements from dotnet/maui-labs PR #167 to the expert review workflow, and removes unnecessary PowerShell scripts.

Changes

Safe-output hardening:

  • create-pull-request-review-comment: max: 30max: 50, add target: "*" for workflow_dispatch
  • submit-pull-request-review: add target: "*"
  • add-comment: max: 5max: 2 (blast-radius cap — only need 1 summary call)

Adversarial consensus improvements:

  • 2-reviewer mode gate: graceful degradation when a model fails (2/2 include, 1/2 discard)
  • Severity rules fixed: 3/3 uses highest severity (was unspecified), 2/3 same-severity keeps original (prevents lenient reviewer burying critical findings)
  • Post-consensus-zero path: proper handling when findings are raised but all discarded
  • Submit failure fallback: if inline review fails, findings go into summary comment
  • add_comment budget: explicit ONE call limit to prevent duplicate summaries
  • Follow-up data warning: AGREE/DISAGREE responses are internal, never posted

Structural improvements:

  • Step 4 restructured into Part A (inline comments) + Part B (lean summary)
  • Emoji severity labels (🔴🟡🟢) for consistency
  • Batch-split findings explicitly excluded from 1/3 discard rules
  • pull_request_number always passed for target: "*" outputs

Script removal:

  • Removed Checkout-GhAwPr.ps1 (116 lines) — fork/permission checks were redundant for workflow_dispatch (already write-gated by GitHub + roles: config). Essential checkout + restore inlined as ~5 lines of bash in both review-shared.md and copilot-evaluate-tests.md
  • Updated gh-aw-workflows.instructions.md to reflect inline pattern

Fixes:

Testing

Port improvements validated on dotnet/maui-labs PR #167:

- Fix safe-outputs: max:50 inline (was 30), max:2 comments (was 5),
  add target:"*" on all outputs for workflow_dispatch support
- Add 2-reviewer mode gate for graceful degradation when a model fails
- Fix severity rules: 3/3 uses highest severity, 2/3 same-severity
  keeps original (prevents lenient reviewer burying critical findings)
- Add post-consensus-zero exit path for discarded-findings summary
- Add submit failure fallback (includes findings in summary comment)
- Add add_comment budget warning (exactly ONE call per review)
- Add pull_request_number requirement for target:"*" safe-outputs
- Add batch-split exception for large-diff single-reviewer findings
- Add follow-up response warning (internal data, never post)
- Restructure Step 4 into Part A (inline) + Part B (lean summary)
- Fix role name: maintainer → maintain (correct GitHub role)
- Add TODO for gh-aw upgrade path once #28767 is fixed
- Use emoji severity labels (consistent with agent output)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 29, 2026 12:12
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 29, 2026

🚀 Dogfood this PR with:

⚠️ WARNING: Do not do this without first carefully reviewing the code of this PR to satisfy yourself it is safe.

curl -fsSL https://github.com/dotnet/maui/main/eng/scripts/get-maui-pr.sh | bash -s -- 35220

Or

  • Run remotely in PowerShell:
iex "& { $(irm https://github.com/dotnet/maui/main/eng/scripts/get-maui-pr.ps1) } 35220"

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Ports workflow hardening and adversarial-consensus logic improvements into the Expert Code Review gh-aw workflow, primarily by updating the shared orchestration prompt/config and recompiling the lock file.

Changes:

  • Harden safe-outputs configuration (limits/targets) and restructure Step 4 into inline comments + lean summary guidance.
  • Refine consensus rules (2-reviewer fallback behavior, severity handling, post-consensus-zero handling, and submit failure fallback).
  • Fix the configured GitHub role name (maintainermaintain) and update the compiled .lock.yml accordingly.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
.github/workflows/shared/review-shared.md Updates shared gh-aw config + reviewer orchestration instructions (safe-outputs, consensus rules, posting flow).
.github/workflows/review.agent.md Fixes the role name used for workflow gating (maintain).
.github/workflows/review.agent.lock.yml Regenerated lock file reflecting the shared/workflow changes (safe-outputs config, roles, hashes).

target: "*"
add-comment:
max: 5
max: 2
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The safe-output config allows add-comment up to 2 times (max: 2), but the workflow instructions below explicitly require exactly ONE add_comment call per run. To better enforce the intended blast-radius cap and avoid duplicate summaries if the agent makes a mistake, consider setting add-comment.max to 1.

Suggested change
max: 2
max: 1

Copilot uses AI. Check for mistakes.
PureWeen and others added 12 commits April 29, 2026 07:21
The 116-line PowerShell script was redundant for workflow_dispatch:
- Fork checks: workflow_dispatch already requires write access
- Permission checks: roles: config already gates access
- The essential checkout + restore is ~5 lines of bash

Replace with inline bash in review-shared.md and copilot-evaluate-tests.md.
Update gh-aw-workflows.instructions.md to reflect the change.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- add-comment.max: 2 → 1 to match exactly-ONE instruction (3/3 consensus)
- Make .github/ restore fatal in copilot-evaluate-tests.md (2/3 consensus)
- Add fork guard before skill re-overlay in review-shared.md (2/3 after follow-up)
- Add re-overlay documentation note in gh-aw-workflows.instructions.md (2/3 consensus)
- Clarify 2-reviewer discards must appear in summary (2/3 after follow-up)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Make .github/ restore fatal in review-shared.md (was non-fatal, 3/3 🔴)
- Restore .agents/ alongside .github/ in all checkout steps + docs (3/3 🟡)
- Combine two gh pr view calls into one atomic API call (3/3 🟢)
- Remove dead github.event.pull_request.number expression (3/3 🟢)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…e skill)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
.agents/ doesn't exist in this repo yet, so combining both in one
git checkout fails with 'pathspec did not match'. Split into:
- .github/ restore: fatal (must succeed)
- .agents/ restore: soft fallback (may not exist)

Matches the gh-aw guide's recommended pattern.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The review agent reads code via MCP tools (GitHub API), not from
the filesystem. No checkout needed — matches maui-labs approach.
Keep only the workflow-start-time step for the time budget feature.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove checkout step (agent reads PR via MCP tools, not filesystem)
- Add roles: [admin, maintain, write] (was missing)
- Fix cancel-in-progress: true → false (slash_command best practice)
- Update prompt to reference MCP tools instead of local files

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add resolve-pull-request-review-thread safe-output (max: 50) and
instruct the agent to resolve all github-actions[bot] review threads
before posting new findings. Prevents stale findings from previous
runs cluttering the PR.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The agent container has no gh CLI credentials, so it can't query
GraphQL for thread node IDs. Move the query to a pre-agent step
that writes .prior-review-thread-ids, then instruct the agent to
read that file and resolve each thread via safe-output tool.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
resolve_pull_request_review_thread needs pull-requests: write on the
safe_outputs job, but the compiler doesn't grant it. Added to the
TODO list for when gh-aw#28767 is fixed and we can upgrade.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen PureWeen closed this Apr 30, 2026
@PureWeen PureWeen deleted the fix/improve-expert-review-workflow branch April 30, 2026 13:41
PureWeen added a commit that referenced this pull request Apr 30, 2026
<!-- Please let the below note in for people that find this PR -->
> [!NOTE]
> Are you waiting for the changes in this PR to be merged?
> It would be very helpful if you could [test the resulting
artifacts](https://github.com/dotnet/maui/wiki/Testing-PR-Builds) from
this PR and let us know in a comment if this change resolves your issue.
Thank you!

## Description

Removes the Expert Code Review (`/review`) gh-aw workflow. Will be
re-added with improvements from PR #35220.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants