Custom AI Code Review Agents: Automating the Checks Linters Can't
Custom AI code review agents run on every pull request and enforce the team-specific standards that linters and scripts can't — coding principles, architecture diagrams, migration safety, test coverage. A practical guide with real configurations.
Custom AI code review agents are AI-powered checks you define in plain English that run automatically on every pull request, enforcing the team-specific standards that linters, scripts, and generic review tools can't evaluate. In Macroscope these are called Check Run Agents: you write what to check in a markdown file, give the agent access to your codebase and connected tools, and it runs as a proper GitHub check run alongside your tests and built-in AI review.
This is a practical, use-case-driven guide. Catching bugs and security flaws manually on every PR is nearly impossible, and generic automated tools only cover a narrow slice — bugs, vulnerabilities, runtime errors. Modern teams care about much more: coding conventions, architectural consistency, test coverage, migration safety. The challenge is that every team's priorities differ. What matters to a fintech startup is not what matters to a healthcare platform or an e-commerce company. One-size-fits-all tooling can't address that. Custom AI code review agents can.
TL;DR — Custom AI code review agents
- What: AI checks defined in plain English markdown, one file per check, that run on every PR
- Beyond bugs: Enforce coding principles, generate architecture diagrams, gate risky migrations, verify test coverage
- Tool access: Browse code and git history, plus Slack, Sentry, PostHog, LaunchDarkly, BigQuery, Jira, Linear, and any MCP server
- Enforcement levels: Block the merge, warn, or stay neutral (informational) — per agent
- Selective: Scope each agent to the files or paths it should watch, so it stays quiet everywhere else
- Cost: Usage-based. New workspaces start with $100 in free usage, and every workspace gets 1,000 free agent credits each month
Beyond Basic Bug Detection
Traditional code review tools focus on a narrow set of problems: bugs, security vulnerabilities, and runtime errors. Those correctness issues are critical — and Macroscope's built-in Correctness check, with Fix It For Me, is built precisely for them. But modern development teams need more than correctness.
They need to enforce coding conventions, maintain architectural consistency, verify test coverage, and ensure database migrations follow best practices. These are judgment calls, not pattern matches. A linter can tell you a variable is unused. It cannot tell you whether a new API endpoint follows your naming convention, whether a migration has a rollback path, or whether a change to the payments flow might break event tracking.
The deeper problem is that every team's priorities are different. Generic tooling optimizes for the average team, which means it under-serves every specific one. Custom AI code review agents flip that: instead of adapting your standards to the tool, you encode your standards into the tool, in the same language you'd use to explain them to a senior engineer.
What Custom AI Code Review Agents Do Differently
The thing that separates a custom AI code review agent from a custom rule in a traditional tool is that the agent can investigate. It does not just run a pattern against the diff and pass or fail. It browses your codebase, reads related files, queries git blame, checks external systems, and reasons about what it finds — all within a single check run.
That investigation is what makes the following capabilities possible:
- Customizable prompts. Define exactly what each agent looks for, from a specific coding pattern to an architectural principle, in natural language.
- Selective execution. Scope an agent to run only when certain files or paths change, so it focuses attention where it matters and stays quiet everywhere else.
- Tool integration. Equip an agent with capabilities like Slack notifications, database queries, or diagram generation so it can do real analysis, not just text matching.
- Flexible enforcement. Set each check to block the merge, warn the developer, or simply provide information.
Each agent runs on a reasoning model, so it can catch nuanced violations that simple pattern matching would miss. And because the instructions live in a markdown file in your repo, anyone on the team can read, review, and change them, with no YAML pipeline, custom GitHub Action, or ESLint plugin to maintain.
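Selective execution amounts to a path filter that runs before the agent does. The sketch below illustrates the idea in Python, assuming glob-style patterns. The function name and the patterns are hypothetical, and this is not Macroscope's actual scoping syntax, which is not shown in this guide.

```python
from fnmatch import fnmatchcase


def should_run(changed_files, watch_patterns):
    """Return True if any changed file matches any watched glob pattern."""
    return any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in watch_patterns
    )


# Hypothetical scope for a migration-safety agent (an assumption for illustration).
MIGRATION_PATTERNS = ["db/migrations/*", "*.sql"]

print(should_run(["db/migrations/0042_add_index.py"], MIGRATION_PATTERNS))  # True
print(should_run(["docs/README.md", "src/app.py"], MIGRATION_PATTERNS))     # False
```

A PR that touches nothing in scope never triggers the agent, which keeps both noise and usage down.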
Real-World Applications
The best way to understand custom AI code review agents is to see what teams actually build with them. Here are four high-leverage patterns.
Enforcing Coding Principles
A principles agent monitors for the common mistakes that slip past traditional linters. For example, you might configure an agent to catch developers accidentally mangling a primary key — a subtle but critical error that causes data-integrity issues no regex would flag. The agent reviews every pull request, understands your codebase context, and flags the violation before it reaches production. Because it runs on a reasoning model, it understands the intent of the principle, not just a literal pattern.
---
title: Coding Principles
model: claude-opus-4-6
effort: medium
input: full_diff
tools:
- browse_code
- git_tools
- modify_pr
---
Review this PR against our core data-integrity principles:
- Primary keys must never be reassigned, mutated, or reused. Flag any
code that writes to a PK column after row creation.
- Money is stored in integer cents, never floats. Flag float arithmetic
on currency values.
- Soft-deleted rows must stay excluded from default queries.
For each violation: file, line, the principle broken, and a fix.
If nothing is found, report that all principles passed.
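The integer-cents principle in the agent above is easy to motivate with a few lines of Python. This is only an illustration of the failure mode the agent is asked to flag. The helper names are hypothetical and are not part of any Macroscope configuration.

```python
# Binary floats cannot represent most decimal fractions exactly.
print(0.10 + 0.20 == 0.30)  # False


def add_prices_cents(*prices_cents: int) -> int:
    """Sum prices held as integer cents; integer addition is exact."""
    return sum(prices_cents)


def format_cents(amount_cents: int) -> str:
    """Render integer cents as a dollar string, for display only."""
    sign = "-" if amount_cents < 0 else ""
    dollars, cents = divmod(abs(amount_cents), 100)
    return f"{sign}${dollars}.{cents:02d}"


total = add_prices_cents(10, 20)
print(total == 30)          # True
print(format_cents(total))  # $0.30
```

A regex can spot the word `float`, but deciding whether a given value is currency usually takes the codebase context the agent can read.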
Automated Architecture Diagrams
An architecture-visualization agent analyzes all the changes in a pull request and generates a diagram showing how the touched components interact. This helps human reviewers quickly grasp the scope and impact of a change, especially on a complex PR that touches multiple systems. The agent can post a summary to a Slack channel so the broader team stays informed about significant architectural changes — turning a wall of diff into a picture a reviewer can read in seconds.
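As a rough picture of the output, the sketch below turns a list of component relationships into diagram text. It assumes Mermaid flowchart syntax as the target format, which this guide does not specify. The component names and the function are hypothetical, and a real agent would derive the relationships by reading the diff and the surrounding code.

```python
def to_mermaid(edges):
    """Render (source, target) pairs as a left-to-right Mermaid flowchart."""
    lines = ["flowchart LR"]
    for source, target in edges:
        lines.append(f"    {source} --> {target}")
    return "\n".join(lines)


# Hypothetical components touched by a PR.
touched = [
    ("api_gateway", "billing_service"),
    ("billing_service", "payments_db"),
]
print(to_mermaid(touched))
```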
Database Migration Safety
Database changes are high-risk. A migration-checklist agent verifies that the developer followed every required step before the PR can merge: a rollback procedure exists, the migration was tested against production-like data volumes, index performance was considered, and breaking changes are documented. Set this one to block the merge when a step is missing, and pair it with branch protection so a risky migration can't slip through on a Friday afternoon.
---
title: Migration Safety
model: claude-opus-4-6
reasoning: high
effort: high
input: full_diff
conclusion: failure
tools:
- browse_code
- git_tools
- modify_pr
---
If this PR adds or changes a database migration, verify ALL of:
1. A rollback / down migration is included.
2. New columns on large tables are nullable or have a safe default
(no blocking table rewrites).
3. New indexes are created concurrently where the engine supports it.
4. Any breaking schema change is documented in the PR description.
If any item is missing, the check MUST fail. List exactly what's missing.
If there is no migration in this PR, pass immediately.
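The pass/fail rule in that prompt can be written out as plain logic. The sketch below is only a model of the decision, not how Macroscope evaluates the check: the agent itself judges whether each item is satisfied by reading the PR. The names are hypothetical, and `"success"` is assumed as the passing counterpart to the `failure` conclusion in the config.

```python
from dataclasses import dataclass

# The four checklist items from the agent prompt, paraphrased.
CHECKLIST = (
    "rollback / down migration included",
    "new columns on large tables are nullable or have a safe default",
    "new indexes created concurrently where supported",
    "breaking schema changes documented in the PR description",
)


@dataclass
class CheckResult:
    conclusion: str  # "success" or "failure"
    missing: list    # checklist items that were not satisfied


def evaluate_migration(has_migration: bool, satisfied: set) -> CheckResult:
    """Pass immediately without a migration; otherwise fail on any missing item."""
    if not has_migration:
        return CheckResult("success", [])
    missing = [item for item in CHECKLIST if item not in satisfied]
    return CheckResult("failure" if missing else "success", missing)


result = evaluate_migration(True, {CHECKLIST[0], CHECKLIST[2]})
print(result.conclusion)  # failure
for item in result.missing:
    print("missing:", item)
```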
Test Coverage Verification
A unit-test agent analyzes whether new code includes appropriate test coverage, checks that tests follow team conventions, and suggests additional cases based on the code's complexity and risk profile. Unlike a coverage percentage gate, it reasons about which new logic is risky and therefore which missing tests actually matter — so it nudges toward meaningful tests rather than gaming a number.
The Integration Ecosystem
Custom AI code review agents become genuinely powerful when connected to the tools your team already uses. Macroscope ships native integrations:
- Communication: Slack for notifications and team updates
- Project management: Linear and Jira for ticket tracking
- Feature flags: LaunchDarkly for deployment coordination
- Analytics: PostHog and Amplitude for product insights
- Monitoring: Sentry for error tracking
- Cloud and data: BigQuery and GCP cloud logging for production data analysis
Beyond the native set, MCP (Model Context Protocol) connectors let agents reach virtually any tool. Teams commonly wire in monitoring platforms, search tools, and on-call systems through MCP.
The key advantage is that you configure an integration once, then make it available to any agent. Each agent accesses exactly the tools it needs for its specific job. That is what lets a single agent check the diff, look up the last 90 days of Sentry errors for the modified file, verify the relevant LaunchDarkly flag, and post a summary to Slack — all in one check run. No glue scripts, no separate webhooks, no integration to maintain per tool.
Balancing Automation and Flexibility
Not every check should block a pull request. Some findings are informational; others are critical. Custom AI code review agents let you set the enforcement level per agent:
| Level | Behavior | Use it for |
|---|---|---|
| Blocking | The PR can't merge until the issue is resolved | Security, migration safety, mandatory standards |
| Warning | The developer sees the issue but can proceed | Conventions you want followed but not enforced hard |
| Neutral | The check informs without affecting merge status | New agents you're still tuning, low-stakes nudges |
This flexibility supports a sensible rollout: start informational, gather data on the issues that actually recur, and raise enforcement as the patterns become clear. You earn trust in an agent before you let it block anyone.
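The table reduces to one rule: only a failed blocking check stops the merge. Here is a minimal sketch of that rule, assuming branch protection requires the blocking checks to pass. The type and function names are hypothetical and do not come from Macroscope.

```python
from enum import Enum


class Level(Enum):
    BLOCKING = "blocking"
    WARNING = "warning"
    NEUTRAL = "neutral"


def merge_blockers(results):
    """Return the names of agents that block the merge.

    results: iterable of (agent_name, level, passed) tuples.
    """
    return [
        name
        for name, level, passed in results
        if level is Level.BLOCKING and not passed
    ]


results = [
    ("Migration Safety", Level.BLOCKING, False),
    ("Coding Principles", Level.WARNING, False),
    ("Architecture Diagram", Level.NEUTRAL, True),
]
print(merge_blockers(results))  # ['Migration Safety']
```

The failed warning-level check still appears on the PR, but it never shows up in the blocker list.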
How Custom AI Code Review Agents Compare to the Alternatives
Most tools that advertise "custom rules" offer configuration, not investigation. The difference shows up the moment a check needs to look beyond the diff.
| | Custom AI Code Review Agents | CodeRabbit custom rules | Greptile | Custom GitHub Actions | Linters (ESLint/Semgrep) |
|---|---|---|---|---|---|
| Definition | Plain-English markdown | YAML config | Learns from PR comments | Code + YAML | Rule files |
| Investigates beyond the diff | Yes — browses code, git, external tools | No | Learned patterns only | Build it yourself | No |
| External integrations | Slack, Sentry, PostHog, LaunchDarkly, BigQuery, Jira, Linear, MCP | None from rules | None from rules | Build it yourself | None |
| Can block merges | Yes | Yes | Comments only | Yes | Yes |
| Maintenance | Edit a markdown file | Edit YAML | Auto (drifts) | Maintain code + prompts | Maintain rules |
| Pricing | Usage-based | Per-seat | Per-seat | Actions minutes + API | Varies |
CodeRabbit's custom rules tell the tool "be stricter in the auth directory." A custom AI code review agent says "for every modified file in auth, look up the last 90 days of Sentry errors, check the relevant feature flag, and post a summary to #security." For teams evaluating CodeRabbit alternatives or Greptile alternatives, that gap — investigation and tool access versus configuration — is usually the deciding factor.
A caveat worth stating plainly: these agents are complementary to static analysis, not a replacement. Deterministic rules — formatting, import order, banned functions — are faster and cheaper with ESLint, Semgrep, or golangci-lint. Reach for an agent when the check requires judgment, context, or a query to another system.
Getting Started
Setting up custom AI code review agents is straightforward:
- Install the GitHub app to connect your repositories. New workspaces start with $100 in free usage, and every workspace gets 1,000 free agent credits each month.
- Configure integrations in the Connections tab for any external tools your agents will use (Slack, Sentry, PostHog, and so on — optional, only if you reference them).
- Create your first agent by adding a markdown file to the .macroscope/ directory: define its purpose, scope it to the files it should watch, and write the prompt describing what it checks.
- Choose its tools and enforcement level, starting neutral or warning while you tune it.
- Test on a few PRs and refine. Agents improve as you sharpen the instructions.
Start simple. A PR-labeling agent or a changelog enforcer is a good first build before you graduate to a merge-blocking security or migration check. For the full configuration reference — every frontmatter field, the complete tool list, and more worked examples — see the Check Run Agents guide, and for the broader concept, what agentic CI is.
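To make the file layout concrete, here is a small Python sketch that assembles an agent file and writes it under .macroscope/. It uses only the frontmatter fields and tool names that appear in the examples in this guide. The helper names, the file name, and the changelog prompt are hypothetical, and the Check Run Agents guide remains the authority on which fields exist.

```python
from pathlib import Path


def render_agent(title, tools, prompt, model="claude-opus-4-6",
                 effort="medium", input_mode="full_diff"):
    """Build the agent file text: YAML frontmatter, then plain-English instructions."""
    tool_lines = "\n".join(f"- {tool}" for tool in tools)
    return (
        "---\n"
        f"title: {title}\n"
        f"model: {model}\n"
        f"effort: {effort}\n"
        f"input: {input_mode}\n"
        "tools:\n"
        f"{tool_lines}\n"
        "---\n"
        f"{prompt.strip()}\n"
    )


def write_agent(repo_root, filename, content):
    """Write the agent file into <repo_root>/.macroscope/ and return its path."""
    agent_dir = Path(repo_root) / ".macroscope"
    agent_dir.mkdir(parents=True, exist_ok=True)
    path = agent_dir / filename
    path.write_text(content, encoding="utf-8")
    return path


content = render_agent(
    title="Changelog Enforcer",
    tools=["browse_code", "modify_pr"],
    prompt="If this PR changes user-facing behavior, verify that the changelog "
           "is updated. If it is not, say which change needs an entry.",
)
print(content)
# To create the file in the current repository (hypothetical file name):
# write_agent(".", "changelog-enforcer.md", content)
```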

Why This Matters in the Age of AI-Generated Code
Custom AI code review agents represent a shift in how teams approach code quality. The old choice was between manual review — thorough but slow — and generic automated tools — fast but inflexible. Now teams can automate their specific standards and practices, in their own words.
The result is faster pull requests, fewer bugs reaching production, and more consistent adherence to team conventions. Developers spend less time on repetitive checks and more on real problems. Senior engineers can encode their hard-won expertise into agents that guide the whole team, so a principle a staff engineer cares about gets enforced on every PR whether or not they're the reviewer.
This matters more as AI coding agents write an increasing share of pull requests. When AI generates the code, you need review with the same depth and judgment a senior engineer would bring — not pattern matching, but investigation against the full context of your codebase, your production systems, and your team's standards. The blank-canvas approach means that as your team's needs evolve, your code review automation evolves with them.
Frequently Asked Questions
What are custom AI code review agents?
Custom AI code review agents are AI-powered checks you define in plain English that run automatically on every pull request. In Macroscope they're called Check Run Agents: you write what to check in a markdown file in your repo's .macroscope/ directory, give the agent access to your codebase and connected tools, and it runs as a GitHub check run. Unlike generic tools that only catch bugs and vulnerabilities, custom agents enforce team-specific standards like coding conventions, architectural consistency, migration safety, and test coverage.
How are custom AI agents different from linters and custom rules?
Linters and custom rules match patterns; AI agents investigate. A linter checks the diff against a fixed pattern and passes or fails. A custom AI code review agent browses your codebase, reads related files, queries git history, checks external systems like Sentry and LaunchDarkly, and reasons about what it finds. CodeRabbit's custom rules, for instance, adjust review behavior via YAML configuration. A custom AI agent can look up the last 90 days of production errors for a modified file and post a summary to Slack — investigation and tool access that configuration alone can't match.
What can custom AI code review agents check beyond bugs?
They can enforce coding principles (like never mutating a primary key), generate architecture diagrams from a PR's changes, gate risky database migrations on a safety checklist, verify meaningful test coverage, enforce API naming conventions, auto-label PRs, and check that production systems are healthy before merge. Anything you could explain to a senior engineer as a review standard, you can encode as an agent — provided it requires judgment or context rather than a simple deterministic pattern.
Can a custom AI code review agent block a pull request from merging?
Yes. Each agent has an enforcement level: blocking (the PR can't merge until the issue is resolved), warning (the developer sees it but can proceed), or neutral (informational only). Set a migration-safety or security agent to blocking and pair it with GitHub branch protection rules, and a PR with a missing rollback or a hardcoded secret can't merge until it's fixed. A good rollout starts agents neutral or warning, then raises enforcement once the patterns are clear.
What tools and integrations can custom AI code review agents use?
By default they can browse code, read git history, read GitHub PR metadata, and update the PR. Optional integrations include Slack, Sentry, PostHog, Amplitude, LaunchDarkly, BigQuery, GCP cloud logging, and issue trackers like Jira and Linear. Beyond the native set, MCP connectors let agents reach virtually any tool, including monitoring platforms, search tools, and on-call systems. You configure an integration once and make it available to any agent.
How much do custom AI code review agents cost?
Pricing is usage-based, not per-seat. Custom agents bill under Macroscope's Agent credits at $0.01 per credit, and every workspace gets 1,000 free agent credits each month — enough to run agents on most teams' full PR volume before any agent billing begins. New workspaces also start with $100 in free usage. You control spend by scoping agents tightly with file patterns and choosing an appropriate model and effort level for each check.
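The credit arithmetic is simple enough to sketch. The example below uses only the two figures stated above (1,000 free credits a month, $0.01 per credit) and ignores the $100 in starting usage. How many credits a single run consumes is not stated here, so the monthly totals are hypothetical inputs.

```python
FREE_CREDITS_PER_MONTH = 1_000  # free agent credits per workspace each month
PRICE_PER_CREDIT_CENTS = 1      # $0.01 per credit, held in integer cents


def monthly_agent_cost_cents(credits_used: int) -> int:
    """Cost in cents once the free monthly credits are used up."""
    billable = max(0, credits_used - FREE_CREDITS_PER_MONTH)
    return billable * PRICE_PER_CREDIT_CENTS


print(monthly_agent_cost_cents(800))    # 0
print(monthly_agent_cost_cents(4_500))  # 3500 (that is, $35.00)
```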
Are custom AI agents a replacement for static analysis like ESLint or Semgrep?
No — they're complementary. Static analysis tools are fast and deterministic, ideal for pattern-based rules like import ordering, formatting, and banned functions. Custom AI code review agents handle what static analysis can't: business-logic validation, cross-system verification (checking Sentry errors or feature-flag states), contextual judgment, and natural-language standards that resist formalization as regex or AST rules. The strongest setups use both — linters for deterministic rules, agents for investigative checks.
How do I create my first custom AI code review agent?
Install the Macroscope GitHub app, then add a markdown file to the .macroscope/ directory in your repository root. Give it optional YAML frontmatter (model, effort, input mode, tools, enforcement level), then write the review instructions in plain English. Commit to your default branch and the agent runs on the next pull request. Start simple — a PR labeler or changelog enforcer — before building merge-blocking checks, and refine the instructions based on the agent's output.
