Your team has 15 engineers. Three of them use Claude Code daily. The other twelve are on Windows laptops locked down by corporate IT, or they’re frontend folks who never touched a terminal config in their lives. When a PR lands, you want every change reviewed — not just when a Claude user happens to be around.
This is exactly what headless mode is for.
claude -p (the --print flag) runs Claude Code non-interactively, takes your prompt and context, produces output, and exits. No TTY required. No approval prompts. No human in the loop. You wire it into GitHub Actions, pass it the PR diff, and get review comments posted automatically — whether or not a single person on your team has Claude Code installed.
Here’s how to build that pipeline.
The Core Idea: Passing a Diff to Claude
The fundamental pattern is simple:
- Fetch the PR diff in your GitHub Actions job
- Pass it to
claude -palongside a review prompt - Capture the output
- Post it as a PR comment
The diff contains everything Claude needs to understand what changed. You don’t need Claude to check out the branch or run the code — just the diff, the prompt, and a clear set of review criteria.
# The simplest version
git diff origin/main...HEAD | claude -p "Review this diff for bugs and security issues" \
--output-format text \
--max-turns 3
That’s the seed. Everything below builds on this.
Setting Up the GitHub Actions Workflow
Here’s a complete workflow that triggers on PR open and synchronize events, runs Claude review, and posts the result as a comment:
# .github/workflows/claude-pr-review.yml
name: Claude PR Review
on:
pull_request:
types: [opened, synchronize, reopened]
permissions:
contents: read
pull-requests: write
jobs:
review:
runs-on: ubuntu-latest
timeout-minutes: 15
steps:
- name: Checkout code
uses: actions/checkout@v4
with:
fetch-depth: 0 # Full history needed for accurate diffs
- name: Install Claude Code
run: npm install -g @anthropic-ai/claude-code
- name: Get PR diff
id: diff
run: |
git fetch origin ${{ github.base_ref }}
DIFF=$(git diff origin/${{ github.base_ref }}...HEAD -- . \
':(exclude)*.lock' \
':(exclude)*.min.js' \
':(exclude)dist/**' \
':(exclude)*.snap')
echo "DIFF<<EOF" >> $GITHUB_OUTPUT
echo "$DIFF" >> $GITHUB_OUTPUT
echo "EOF" >> $GITHUB_OUTPUT
- name: Run Claude review
id: review
env:
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
run: |
REVIEW=$(echo "${{ steps.diff.outputs.DIFF }}" | claude -p \
"You are a code reviewer. Review the following git diff carefully.
Focus on:
1. Correctness bugs — logic errors, off-by-one errors, null pointer risks
2. Security issues — injection risks, auth bypasses, unsafe data handling
3. Performance problems — N+1 queries, unnecessary allocations, blocking calls
4. Coding standards — naming, error handling, missing tests for new behavior
Be direct. Skip praise. For each issue, state:
- The file and approximate line
- What the problem is
- Why it matters
- A concrete fix suggestion
If the diff looks clean, say so briefly and mention one thing that's done particularly well.
Diff to review:
$(cat)" \
--output-format text \
--max-turns 5)
echo "REVIEW<<EOF" >> $GITHUB_OUTPUT
echo "$REVIEW" >> $GITHUB_OUTPUT
echo "EOF" >> $GITHUB_OUTPUT
- name: Post review comment
uses: actions/github-script@v7
with:
script: |
const review = `## Claude Code Review\n\n${{ steps.review.outputs.REVIEW }}\n\n---\n*Automated review by Claude Code headless mode*`;
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.issue.number,
body: review
});
A few things worth noting here:
fetch-depth: 0 is required. Without full history, git diff origin/main...HEAD will fail or produce wrong output. Shallow clones (the default) only fetch the commit you’re on.
The diff excludes lockfiles and build artifacts. Reviewing lockfile changes is noise. The :(exclude) pathspecs filter them out before Claude sees anything.
--output-format text disables interactive mode. Without this flag, Claude Code detects there’s no TTY and may error. This is mandatory for any CI environment.
Handling Large Diffs
PRs that touch hundreds of files will hit token limits. Claude has a context window, and a 10,000-line diff will exceed it. A few strategies for dealing with this:
Strategy 1: Chunk by file type
Review different parts of the codebase separately:
# Review only backend changes
git diff origin/main...HEAD -- 'src/api/**' 'src/services/**' | \
claude -p "Review this backend diff for security and correctness" \
--output-format text --max-turns 3
# Review only frontend changes
git diff origin/main...HEAD -- 'src/components/**' 'src/pages/**' | \
claude -p "Review this frontend diff for accessibility and UX issues" \
--output-format text --max-turns 3
Strategy 2: Stat-first triage
Get the diff stat first, then selectively expand files that look risky:
# Get stat summary
STAT=$(git diff origin/main...HEAD --stat)
# Get full diff for files with the most changes (likely highest risk)
TOP_FILES=$(git diff origin/main...HEAD --stat | \
grep -v "changed" | sort -t'|' -k2 -rn | head -5 | \
awk '{print $1}')
for file in $TOP_FILES; do
git diff origin/main...HEAD -- "$file" | \
claude -p "Review this file's changes: $file" \
--output-format text --max-turns 2
done
Strategy 3: Set a line limit and warn
If the diff exceeds a threshold, skip the full review and post a note instead:
DIFF_LINES=$(git diff origin/main...HEAD | wc -l)
if [ "$DIFF_LINES" -gt 2000 ]; then
echo "PR diff is too large for automated review ($DIFF_LINES lines). Manual review required."
else
# Run Claude review
fi
Large PRs are usually a code smell anyway. A comment from the bot saying “this PR is too big to review automatically” is actually useful signal.
What to Ask Claude to Review
The quality of the review depends almost entirely on your prompt. Vague prompts produce vague reviews. Specific criteria produce specific findings.
Security-focused review
Review this diff for security vulnerabilities.
Check specifically:
- SQL injection: any string interpolation in database queries
- XSS: user input rendered without sanitization
- Authentication: endpoints missing auth middleware
- Secrets: any hardcoded credentials, API keys, or tokens
- SSRF: any URL construction from user input
- Path traversal: file operations using user-supplied paths
- Dependency risks: new npm/pip packages with known CVEs
For each finding, rate severity: Critical / High / Medium / Low.
Skip Low severity unless there are no higher ones.
Performance-focused review
Review this diff for performance issues.
Check specifically:
- N+1 queries: loops that trigger database calls
- Missing indexes: new queries filtering on non-indexed columns
- Synchronous blocking: I/O operations on the main thread (Node.js) or event loop
- Memory leaks: event listeners or intervals not cleaned up
- Unnecessary re-renders: React state changes that cause full subtree renders
- Large bundle additions: any new imports that significantly increase bundle size
If you can't tell without runtime data, say so — don't guess.
Coding standards review
Review this diff against our team's standards.
Our conventions:
- Functions over 30 lines should be broken up
- All async functions must have try-catch or propagate errors explicitly
- No console.log in production code (use our logger)
- React components must have prop types or TypeScript interfaces
- New utility functions need unit tests in the adjacent __tests__ folder
- Database queries must go through the repository layer, not called directly from controllers
Flag violations with the specific rule they break.
You can also combine these. The tradeoff is cost — more criteria means more tokens and more turns.
Cost Management with --max-turns
Each review run costs API tokens. On a large team with frequent PRs, this adds up fast. The most important control is --max-turns.
Claude Code is designed for iterative, multi-step workflows. For a PR review, it doesn’t need many turns — it reads the diff, forms a response, done. In practice, 3-5 turns is sufficient for almost all review tasks.
# Tight budget — 3 turns max
run: |
claude -p "$REVIEW_PROMPT" \
--output-format text \
--max-turns 3
# Medium budget — useful for complex diffs with follow-up analysis
run: |
claude -p "$REVIEW_PROMPT" \
--output-format text \
--max-turns 8
Other cost controls:
Run only on specific labels. Don’t review every trivial PR. Require a label to trigger Claude:
on:
pull_request:
types: [labeled]
jobs:
review:
if: github.event.label.name == 'needs-review'
Use concurrency limits. If five PRs open simultaneously, you probably don’t want all five running Claude in parallel:
concurrency:
group: claude-review-${{ github.repository }}
cancel-in-progress: false
# Note: don't cancel-in-progress for reviews — you want all PRs reviewed,
# just not simultaneously
Add a monthly spend alert in the Anthropic console. Set a budget limit so a runaway workflow can’t drain your account overnight.
Avoiding Review Fatigue: Filtering Noise
Claude will find issues in every diff if you let it. On a codebase with existing technical debt, every PR might get 20 comments. That’s noise, and engineers will start ignoring the bot.
Two approaches that work:
Limit scope to changed lines only. Your prompt should reinforce this:
Only flag issues in the lines that were changed in this diff (marked with +).
Do not comment on pre-existing code that wasn't modified.
Require actionable findings only:
Only include findings where you can suggest a specific fix.
If you see a potential issue but aren't sure, skip it rather than adding speculation.
Set severity thresholds:
Report Critical and High severity issues only.
Skip Medium and Low unless the PR has no higher-severity findings.
The goal is a review that engineers actually read. Five sharp findings beat twenty vague ones.
Posting Structured Comments
Instead of one big comment, you can post structured feedback using GitHub’s PR review API:
- name: Post structured review
uses: actions/github-script@v7
with:
script: |
// Create a review with a summary
await github.rest.pulls.createReview({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: context.issue.number,
body: `## Automated Review Summary\n\n${reviewSummary}`,
event: 'COMMENT' // Use 'REQUEST_CHANGES' to block merge on critical issues
});
The event parameter is worth thinking about:
'COMMENT'— posts informational feedback, doesn’t block merge'APPROVE'— approves the PR (probably don’t do this automatically)'REQUEST_CHANGES'— blocks merge until dismissed
For most teams, 'COMMENT' is the right default. Blocking merges automatically based on Claude’s output requires high confidence in your prompts and a low false-positive rate — which takes iteration to get right.
Related Reading
Once you have automated PR review running, the next step is usually tightening what Claude can do inside your CI environment:
- Claude Code in GitHub Actions: How to Fix Permission Errors (3 Battle-Tested Patterns) — the
.claude/settings.jsonallowlist approach is directly applicable here for locking down what Claude can do in your CI jobs - Claude Code Hooks: 12 Real-World Automation Patterns 2026 — if you want Claude running hooks during the review (linting, security scanning) rather than just analyzing the diff
The headless review pattern scales well. Once it’s running, you can layer in more sophisticated analysis — architecture feedback, test coverage gaps, API contract drift — without adding any burden on your reviewers.