
PR Agent

Shane Neuville edited this page Apr 20, 2026 · 10 revisions

The PR Agent validates fixes through automated testing, explores alternatives using multiple AI models, and synthesizes everything into actionable recommendations.

```mermaid
%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1e1e2e', 'primaryTextColor': '#cdd6f4', 'primaryBorderColor': '#45475a', 'lineColor': '#6c7086', 'secondaryColor': '#313244', 'tertiaryColor': '#181825'}}}%%
flowchart LR
    subgraph gate ["🧪 GATE"]
        direction TB
        G1[Detect tests in PR]
        G2[Verify tests fail without fix]
        G3[Verify tests pass with fix]
        G1 --> G2 --> G3
    end

    subgraph review ["🤖 PR REVIEW"]
        direction TB
        R1[Pre-Flight: gather context]
        R2[Try-Fix: 4 models sequentially]
        R3[Report: write recommendation]
        R1 --> R2 --> R3
    end

    subgraph post ["📊 POST"]
        direction TB
        P1[Post AI summary comment]
        P2[Apply agent labels]
        P1 --> P2
    end

    gate --> review --> post

    style gate fill:#1e1e2e,stroke:#89b4fa,stroke-width:2px,color:#cdd6f4
    style review fill:#1e1e2e,stroke:#cba6f7,stroke-width:2px,color:#cdd6f4
    style post fill:#1e1e2e,stroke:#a6e3a1,stroke-width:2px,color:#cdd6f4
```

Every fix is tested. The agent doesn't theorize—it implements each approach, runs tests, and reports what works.


Quick Start

Recommended: Use Plan Mode First

For the best results, start in plan mode to create and review a detailed plan before execution:

  1. Enter plan mode: press `Shift+Tab` or use `/plan`
  2. Request a review plan: `/plan review PR #12345 - create a detailed plan for the review`
  3. Review the plan: Copilot creates a structured plan. Review the steps and make adjustments.
  4. Exit plan mode: press `Shift+Tab` to switch back to execution mode
  5. Execute the plan: `proceed with the plan`

Direct Invocation (Alternative)

```
copilot

# Ask it to review a PR
please review PR #12345
```

Trigger Phrases

| Phrase | Description |
| --- | --- |
| "Review PR #XXXXX" | Review an existing PR with independent analysis |
| "Work on PR #XXXXX" | Investigate and implement a fix |
| "Fix issue #XXXXX" | Works whether or not a PR exists |

How It Works

The pipeline is orchestrated by Review-PR.ps1:

```powershell
.\Review-PR.ps1 -PRNumber 33687
.\Review-PR.ps1 -PRNumber 33687 -Platform ios
```

Step 0: Branch Setup

Creates a review branch from main and squash-merges the PR onto it. If there are merge conflicts, the script posts a comment on the PR and exits.
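The branch setup boils down to a short git sequence. A minimal sketch of the commands involved (the real logic lives in Review-PR.ps1; the function and branch-naming scheme below are hypothetical, not the script's actual code):

```python
# Hypothetical sketch of Step 0's git sequence. The branch name
# "review/pr-{N}" is an illustrative convention, not the script's.

def branch_setup_commands(pr_number: int, base: str = "main") -> list[str]:
    """Return the git commands that create a review branch and
    squash-merge the PR onto it."""
    review_branch = f"review/pr-{pr_number}"
    return [
        # Fetch the PR head into a local branch
        f"git fetch origin pull/{pr_number}/head:pr-{pr_number}",
        # Branch the review off the base branch
        f"git checkout -b {review_branch} {base}",
        # Squash-merge the PR; a conflict here aborts the pipeline
        # with a PR comment instead of continuing to the gate
        f"git merge --squash pr-{pr_number}",
    ]

print(branch_setup_commands(12345)[1])  # git checkout -b review/pr-12345 main
```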

Step 1: Gate

Runs verify-tests-fail.ps1 directly (no Copilot agent — pure script):

  1. Detects tests in the PR diff via Detect-TestsInDiff.ps1
  2. Verifies tests fail without the fix (baseline)
  3. Verifies tests pass with the fix applied

Results:

  • PASSED — tests catch the bug ✅
  • SKIPPED — no tests detected in PR (recommends @copilot write tests for this PR)
  • FAILED — tests didn't behave as expected ❌

The gate result is posted as a PR comment and passed as context to Step 2.
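The gate's three outcomes follow directly from the three checks above. A minimal sketch of that decision logic, assuming three boolean signals from the scripts (the function name and signature are hypothetical):

```python
# Hedged sketch of the gate classification; the real checks run in
# verify-tests-fail.ps1 and Detect-TestsInDiff.ps1.

def gate_result(tests_detected: bool,
                fail_without_fix: bool,
                pass_with_fix: bool) -> str:
    """Classify the gate outcome for a PR."""
    if not tests_detected:
        return "SKIPPED"   # no tests in the diff; recommend writing some
    if fail_without_fix and pass_with_fix:
        return "PASSED"    # tests demonstrably catch the bug
    return "FAILED"        # tests didn't behave as expected

print(gate_result(True, True, True))  # PASSED
```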

Step 2: PR Review

Invokes Copilot CLI with the prompt "Use a skill to review PR #XXXXX", which triggers the pr-review skill. This runs three phases:

Pre-Flight — Reads the linked issue, PR description, and comments. Classifies changed files. No code analysis — just context gathering. Output: pre-flight/content.md

Try-Fix (mandatory) — Four models explore independent fix ideas sequentially:

| Order | Model |
| --- | --- |
| 1 | Claude Opus 4.6 |
| 2 | Claude Sonnet 4.6 |
| 3 | GPT-5.3-Codex |
| 4 | Gemini-3-Pro-Preview |

Each model generates an independent fix (without seeing the PR's solution first), implements it, and runs tests. Between attempts, the baseline is restored via EstablishBrokenBaseline.ps1 -Restore.

After all 4 attempts, cross-pollination rounds ask each model for new ideas based on all results. This repeats until every model says "NO NEW IDEAS" (max 3 rounds).

The best passing fix is selected by comparing simplicity, robustness, and codebase consistency. Output: try-fix/content.md
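The Try-Fix control flow described above can be sketched as follows. This is an illustrative outline only: the helper callables stand in for the skill's real steps (implementing a fix, running tests, restoring the baseline), and none of these names come from the actual scripts:

```python
# Hypothetical sketch of the Try-Fix loop: four sequential attempts,
# then cross-pollination rounds until all models say "NO NEW IDEAS".

MODELS = ["Claude Opus 4.6", "Claude Sonnet 4.6",
          "GPT-5.3-Codex", "Gemini-3-Pro-Preview"]
MAX_ROUNDS = 3  # cross-pollination cap

def run_try_fix(attempt_fix, cross_pollinate):
    results = []
    for model in MODELS:                    # independent attempts, in order
        results.append(attempt_fix(model))  # implement fix + run tests
        # (EstablishBrokenBaseline.ps1 -Restore runs between attempts)
    for _ in range(MAX_ROUNDS):             # cross-pollination rounds
        new_ideas = [cross_pollinate(m, results) for m in MODELS]
        if all(idea == "NO NEW IDEAS" for idea in new_ideas):
            break                           # everyone is out of ideas
        results.extend(i for i in new_ideas if i != "NO NEW IDEAS")
    return results
```

The best passing result is then selected by comparing simplicity, robustness, and codebase consistency, as described above.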

Report — Writes the final recommendation (APPROVE or REQUEST CHANGES) with a comparison table of all fix candidates. Output: report/content.md

Step 3: Post AI Summary

Runs post-ai-summary-comment.ps1 to post the review as a PR comment combining gate result, try-fix comparison, and recommendation.
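A minimal sketch of how such a comment might be assembled from the three phase outputs. The section headings and function name here are illustrative assumptions; the actual formatting in post-ai-summary-comment.ps1 may differ:

```python
# Hypothetical composition of the summary comment from the three inputs.

def build_summary_comment(gate_result: str,
                          tryfix_comparison: str,
                          recommendation: str) -> str:
    """Combine the three phase outputs into one PR comment body."""
    return "\n\n".join([
        f"## Gate\n{gate_result}",
        f"## Try-Fix Comparison\n{tryfix_comparison}",
        f"## Recommendation\n{recommendation}",
    ])
```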

Step 4: Apply Labels

Runs Update-AgentLabels.ps1 to parse the phase output files and apply labels:

| Label | Meaning |
| --- | --- |
| s/agent-reviewed | PR was reviewed (always applied) |
| s/agent-approved | Agent recommends approval |
| s/agent-changes-requested | Agent recommends changes |
| s/agent-review-incomplete | Agent couldn't complete all phases |
| s/agent-gate-passed | Tests catch the bug |
| s/agent-gate-failed | Could not verify tests catch the bug |
| s/agent-fix-win | Agent found a better fix |
| s/agent-fix-pr-picked | PR's fix was best |
| s/agent-fix-implemented | (Manual) Author adopted agent's suggestion |
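The label rules above can be summarized as a small decision function. This is an illustrative sketch, not Update-AgentLabels.ps1's real parsing; the boolean inputs are hypothetical stand-ins for what the script extracts from the phase output files (s/agent-fix-implemented is applied manually, so it is omitted here):

```python
# Hedged sketch of label selection from parsed phase results.

def select_labels(phases_complete: bool, approved: bool,
                  gate_passed: bool, agent_fix_won: bool) -> list[str]:
    labels = ["s/agent-reviewed"]                 # always applied
    if not phases_complete:
        labels.append("s/agent-review-incomplete")
    else:
        labels.append("s/agent-approved" if approved
                      else "s/agent-changes-requested")
    labels.append("s/agent-gate-passed" if gate_passed
                  else "s/agent-gate-failed")
    labels.append("s/agent-fix-win" if agent_fix_won
                  else "s/agent-fix-pr-picked")
    return labels
```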

Output Structure

All phase output is written to CustomAgentLogsTmp/PRState/{PRNumber}/PRAgent/:

```
gate/content.md           ← Gate result
pre-flight/content.md     ← PR context and file classification
try-fix/content.md        ← Fix comparison table
  attempt-{N}/            ← Per-model attempt details
report/content.md         ← Final recommendation
```

When NOT to Use

| Task | Use Instead |
| --- | --- |
| Just run tests manually | Sandbox Agent |
| Only write tests | Write Tests Agent |
| Extract lessons from a completed PR | Learn From PR Agent |
