Agent Build Guide

Code Review Agent

Automated PR review that flags bugs, security gaps, and convention violations before human reviewers

Intermediate ⏱ 20 minutes

By The Codegen Team · Updated June 2026

The problem

Pull requests stack up faster than your team can review them. Each PR sits in the queue for hours while reviewers context-switch from their own work, spending 20 to 30 minutes on issues a machine could flag in seconds. The bottleneck creates merge conflicts, stale branches, and a team that ships slower every sprint.

Build guide

Step 1: Install the Code Review Skill

The /code-review command ships with Claude Code and works immediately. For the enhanced community version with language-specific review guides for 20+ frameworks, install it to your personal skills directory:

git clone https://github.com/awesome-skills/code-review-skill.git ~/.claude/skills/code-review

Start a new Claude Code session and verify the install:

/skills

You should see code-review in the list. If you skipped the community version, the built-in /code-review still appears as an available command.
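If the skill does not show up in the list, a quick filesystem check can confirm whether the clone landed where you expected. The following is a minimal sketch; `skill_installed` is a hypothetical helper name, and the default path is simply the clone target used above.

```shell
# Hypothetical helper: succeed only if the skill directory exists and is
# non-empty. The default path is the clone target from the install step.
skill_installed() {
  dir="${1:-$HOME/.claude/skills/code-review}"
  [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Usage:
#   skill_installed && echo "code-review skill found" || echo "not found"
```

This only confirms that files are present on disk. The /skills listing remains the check that Claude Code actually picked the skill up.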

Step 2: Configure Your CLAUDE.md

Your CLAUDE.md file tells the review agent what conventions matter in your codebase. If you do not have one yet, start with the config in the CLAUDE.md configuration guide and customize it for your stack.

Add a Review Standards section that defines the checks specific to your team:

## Review Standards

- Flag any function longer than 50 lines
- Flag any file with more than 3 levels of nesting
- Flag missing error handling on async operations
- Flag hardcoded values that should be environment variables
- Check that new API routes have input validation

The /code-review skill reads these rules and includes them in its review criteria alongside its default bug and security checks.
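If you would rather script this step, for example to roll the same rules out across several repositories, the section can be appended from the shell. This is a sketch under assumptions: `add_review_standards` is a hypothetical function name, and the file is assumed to be a CLAUDE.md at the path you pass in.

```shell
# Sketch: append the Review Standards section to a CLAUDE.md file, but only
# if the heading is not already present. The function name is hypothetical.
add_review_standards() {
  file="${1:-CLAUDE.md}"
  touch "$file"
  if grep -q '^## Review Standards' "$file"; then
    return 0  # section already present; leave the file untouched
  fi
  cat >> "$file" <<'EOF'

## Review Standards

- Flag any function longer than 50 lines
- Flag any file with more than 3 levels of nesting
- Flag missing error handling on async operations
- Flag hardcoded values that should be environment variables
- Check that new API routes have input validation
EOF
}

# Usage, from the repository root:
#   add_review_standards CLAUDE.md
```

Because the function checks for the heading first, running it twice does not duplicate the section.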

Step 3: Run Your First Review

Create a feature branch, make some changes, and run the review:

/code-review

The skill analyzes your diff against the base branch and outputs findings grouped by severity. Critical issues appear first, followed by warnings and suggestions. Each finding includes the file path, line number, and a description of the problem.

To have the agent fix the mechanical issues automatically:

/code-review --fix

Review the applied changes with git diff before committing. The --fix flag handles straightforward problems like unused imports and missing error handling. It leaves complex issues for you to address manually.
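Reviewing with git diff is easiest when the working tree was clean before the fix ran, because the diff then contains only the agent's edits. One way to check this beforehand is sketched below; `tree_is_clean` is a hypothetical helper built on standard git commands.

```shell
# Sketch: succeed only when the repository has no uncommitted or untracked
# changes. Returns 2 if the path is not inside a git work tree.
tree_is_clean() {
  repo="${1:-.}"
  git -C "$repo" rev-parse --is-inside-work-tree >/dev/null 2>&1 || return 2
  [ -z "$(git -C "$repo" status --porcelain)" ]
}

# Usage, before running the fix:
#   tree_is_clean || echo "commit or stash your changes first"
```

Committing or stashing first means anything that appears in git diff afterwards came from the fix pass, so it can be reverted cleanly if you disagree with it.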

Step 4: Post Findings to GitHub (Optional)

To post review findings as inline comments on a GitHub pull request, install the gh CLI and authenticate:

brew install gh
gh auth login

Then run the review with the --comment flag:

/code-review --comment

Each finding appears as an inline comment on the exact line where the issue was found. The PR also gets a summary comment listing all findings ranked by severity.
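Posting comments depends on gh being installed and logged in, so a preflight check can save a failed run. The sketch below uses `gh auth status`, which exits non-zero when no account is authenticated; the function name `gh_ready` and its return codes are illustrative choices, not part of any tool.

```shell
# Sketch: verify that the GitHub CLI is on PATH and authenticated.
# Returns 1 if gh is missing, 2 if it is installed but not logged in.
gh_ready() {
  command -v gh >/dev/null 2>&1 || {
    echo "gh CLI not found on PATH" >&2
    return 1
  }
  gh auth status >/dev/null 2>&1 || {
    echo "gh is installed but not authenticated; run: gh auth login" >&2
    return 2
  }
}

# Usage:
#   gh_ready && echo "ready to post review comments"
```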

For teams that want automated reviews on every PR without manual invocation, Anthropic offers a Code Review product for Team and Enterprise plans that triggers automatically when a PR is opened.

Step 5: Tune the Review After 10 PRs

After running the agent on about 10 pull requests, you will know where it helps and where it generates noise. Tune it in two ways.

First, adjust the confidence threshold. The default is 80 out of 100. If you get too many low-confidence findings, raise it to 90. If you want to catch more edge cases, lower it to 70.

Second, update your CLAUDE.md. Every false positive the agent produces is a signal that your conventions are missing a rule. If the agent keeps flagging your intentional use of default exports in the /pages/ directory, add that exception to your CLAUDE.md conventions.

What to expect

This agent catches style violations, unused variables, missing error handling, security anti-patterns, and CLAUDE.md convention violations reliably. It handles the mechanical first pass so human reviewers can focus on design and logic decisions.

It does not catch architectural problems, business logic errors, or performance bottlenecks that require understanding the broader system. It also produces lower-quality results on diffs over 500 lines, where context limits reduce accuracy. Treat it as a filter, not a replacement for human review.
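To see whether a branch is past that 500-line mark before you run a review, you can total the added and deleted lines that git reports. This is a rough sketch: `changed_lines` is a hypothetical helper, and the usage line assumes your base branch is named main.

```shell
# Sketch: sum added and deleted lines from `git diff --numstat` output read
# on stdin. Binary files show "-" in both columns and are counted as 0.
changed_lines() {
  awk '{ if ($1 != "-") total += $1 + $2 } END { print total + 0 }'
}

# Usage (assumes the base branch is called main):
#   git diff --numstat main...HEAD | changed_lines
```

If the total is well over 500, splitting the work into smaller pull requests is likely to give the agent, and your human reviewers, an easier job.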
