Check Run Agents: Custom AI Checks That Actually Understand Your Codebase
Check Run Agents are fully customizable AI agents from Macroscope that trigger on every GitHub pull request. You define what to check in a markdown file using plain English, give the agents access to your codebase, git history, and connected integrations like Slack, Sentry, and PostHog, and they run as proper GitHub check runs in the Checks tab. Check Run Agents represent a new category of CI tooling: agentic CI, where your checks can investigate, reason, and take action rather than just execute scripts.
TL;DR — What are Check Run Agents?
- What: AI-powered GitHub check runs you define in plain English markdown files
- Where: `.macroscope/` directory in your repo root — one `.md` file per check
- Trigger: Every PR open, push, and manual rerun
- Tools: Browse code, git history, Sentry, Slack, PostHog, LaunchDarkly, BigQuery, Jira, Linear, MCP, and more
- Output: GitHub Checks tab, inline PR comments, and top-level PR comments
- Cost: Free during beta. Will bill under Agent usage when GA
- Access: Beta — contact support@macroscope.com
What Are Check Run Agents?
Check Run Agents are AI agents that run as GitHub check runs on every pull request. Each agent is defined by a single markdown file in your repository's .macroscope/ directory. You write what the agent should check in natural language — the same way you'd explain a review standard to a senior engineer — and Macroscope runs that agent automatically on every PR.
Check Run Agents are part of Macroscope's AI code review platform — the best AI code reviewer for GitHub pull requests. Macroscope already runs two built-in check runs on every GitHub PR review:
- Correctness — catches runtime bugs, logic errors, and regressions. Includes Fix It For Me, which automatically opens fix PRs for detected issues — making Macroscope both an AI code reviewer and an AI code fixer.
- Approvability — evaluates whether the PR is safe to merge and can auto-approve safe PRs
Check Run Agents let you add unlimited custom checks on top of these built-in checks. Each agent runs independently and reports its findings directly in the GitHub Checks UI — the same place your CI tests, linters, and deployment checks appear.
The key difference between Check Run Agents and traditional CI checks is that Check Run Agents can investigate. They don't just run a script and pass/fail. They browse your codebase, query git blame, read related files, check Sentry for production errors, verify feature flags in LaunchDarkly, and post summaries to Slack — all within a single check run.
Why Do Engineering Teams Need Check Run Agents?
Every engineering team has standards that existing tools can't enforce. Check Run Agents exist to fill three gaps in the current CI landscape:
Gap 1: Standards that require judgment. "If a PR touches the payments flow, check Sentry for active errors." "If someone modifies the API schema, make sure the changelog is updated." "If a new React component is added, verify it has accessible labels." These are judgment calls. Linters can't make them. Scripts are too brittle to maintain. Check Run Agents handle them naturally.
Gap 2: Cross-system verification. Modern code review doesn't happen in isolation. You need to cross-reference Sentry errors, feature flag states, analytics events, production logs, and issue trackers. Check Run Agents have native access to all of these through their tool system. No custom webhooks. No glue scripts. No separate integrations to maintain.
Gap 3: Standards that evolve. When your team's conventions change — and they always do — you update a markdown file. Not a YAML pipeline. Not a custom GitHub Action. Not an ESLint plugin. The agent reads the new instructions on the next PR.
How Do Check Run Agents Work?
Check Run Agents work in three steps: define, trigger, and report.
Step 1: Define the Agent
Create a .md file in .macroscope/ at your repository root. The filename determines the check name. For example, .macroscope/security-review.md creates a check run called "Security Review" in your GitHub Checks tab.
Each file has two parts:
- Frontmatter (optional) — YAML configuration controlling the model, effort level, tools, and scoping
- Instructions (required) — Plain English description of what the agent should check
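As an illustration, a minimal agent file might look like the sketch below. The filename, title, and instruction wording are hypothetical, not a prescribed template:

```markdown
---
title: Changelog Check
effort: low
---
If this PR changes user-facing behavior, check whether CHANGELOG.md
was updated. If not, post a top-level comment asking for an entry.
If the PR only touches tests or internal tooling, report that the
check passed.
```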
Step 2: The Agent Triggers
When a pull request is opened, updated (push), or manually rerun, Macroscope reads every .md file in .macroscope/ on your default branch and launches an AI agent for each one. Each agent receives the PR diff and begins investigating according to your instructions.
Step 3: The Agent Reports
Check Run Agent results appear in three places:
| Output Location | What Shows Up | Best For |
|---|---|---|
| Check run details (Checks tab) | Full investigation report | Comprehensive findings, tables, summaries |
| Inline PR comments | Line-level annotations on diff | Specific code issues with file + line reference |
| PR issue comments | Top-level comments on the PR | Broader findings, notifications, summaries |
What Configuration Options Do Check Run Agents Support?
Check Run Agents support the following frontmatter configuration fields:
| Field | Default | Options | What It Controls |
|---|---|---|---|
| `title` | Filename-derived | Max 60 chars | Display name in GitHub Checks UI |
| `model` | `claude-opus-4-6` | `claude-opus-4-5`, `claude-opus-4-6`, `gpt-5-2` | AI model powering the agent |
| `reasoning` | `low` | `off`, `low`, `medium`, `high` | Extended thinking depth |
| `effort` | `low` | `low`, `medium`, `high` | How deeply the agent investigates |
| `input` | `full_diff` | `full_diff`, `code_object` | How the PR diff is processed |
| `tools` | Default set | See tools list below | Agent capabilities and integrations |
| `exclude` | None | Glob patterns | Files to skip (e.g., `"*.go"`, `"tests/**"`) |
| `conclusion` | `neutral` | `neutral`, `failure` | Maximum severity — `failure` can block merges |
Input modes explained:
- `full_diff` — One agent processes the entire PR diff. Lower cost. Best for PR-level checks like "is the changelog updated?" or "do all new endpoints have docs?"
- `code_object` — Up to 20 agents run in parallel, one per changed code object (function, class, method). Higher cost. Best for per-unit enforcement like "does every new function have a docstring?"
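As a sketch, a per-function docstring check would opt into `code_object` mode in its frontmatter. The title, exclude pattern, and instruction wording here are illustrative:

```markdown
---
title: Docstring Check
input: code_object
effort: low
exclude:
  - "tests/**"
---
For the code object under review: if it is a new or modified public
function or class, verify it has a docstring describing its purpose,
parameters, and return value. Report 🟢 if documented, 🟡 if not.
```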
What Tools Can Check Run Agents Access?
Check Run Agents have access to a rich set of tools that make them truly agentic. This is the core differentiator — no other AI code review tool gives custom checks this level of integration access.
Default Tools (Always Available)
| Tool | Capability |
|---|---|
| `browse_code` | Explore file tree, read files, search by filename or content |
| `git_tools` | Git log, blame, diff, grep — full git history access |
| `github_api_read_only` | Read issues, labels, PR metadata, commit statuses |
| `modify_pr` | Update PR title/description/labels/assignees, post line-level review comments |
Optional Integration Tools
| Tool | Requires | What the Agent Can Do |
|---|---|---|
| `slack` | Slack connection | Post messages and findings to channels |
| `sentry` | Sentry connection | Check for active errors related to modified code |
| `posthog` | PostHog connection | Query product analytics data |
| `launchdarkly` | LaunchDarkly connection | Check feature flag states and targeting |
| `bigquery` | BigQuery connection | Run queries against your data warehouse |
| `amplitude` | Amplitude connection | Query product analytics |
| `gcp_cloud_logging` | GCP connection | Search production logs |
| `issue_tracking_tools` | Jira or Linear | Read and create issues |
| `web_tools` | None | Fetch and parse web pages |
| `mcp` | MCP server connection | Connect to any MCP-compatible server |
This tool access is what makes Check Run Agents fundamentally different from custom rules in CodeRabbit, Qodo, or Greptile. Those tools can check your diff against a pattern. A Check Run Agent can check your diff, look up the last 90 days of Sentry errors for the modified file, verify the relevant feature flag in LaunchDarkly is targeting the correct users, query PostHog for the conversion impact of the changed flow, and post a summary to your team's Slack channel — all in one check run.
Check Run Agent Examples
Example 1: Web Team Standards Review
This Check Run Agent enforces frontend standards, checks Sentry for production errors on modified files, auto-labels PRs, and notifies Slack when critical issues are found:
```markdown
---
title: Web Review
model: claude-opus-4-6
effort: medium
input: full_diff
tools:
  - browse_code
  - git_tools
  - modify_pr
  - slack
  - sentry
exclude:
  - "*.go"
  - "*.proto"
  - "schema/**"
  - "services/**"
---
Review this PR against our web team's standards:

## Event Tracking
If this PR touches payment flows, signup funnels, analytics calls,
CTA buttons, or redirect logic, check whether it could break event
tracking. Rate each issue: 🔴 will stop firing, 🟡 may fire
incorrectly, 🟢 low risk.

## Accessibility
Check new or modified React components for basic accessibility:
- Images must have alt text
- Buttons and links must have accessible labels
- Form inputs must have associated labels

## Production Errors
For each file modified, check Sentry for unresolved issues. If any
active errors exist, list them with frequency and last seen date.

## Labels
Add labels to this PR based on what changed:
- "frontend" if any UI components are modified
- "styles" if CSS or styled-components changed
- "docs" if only markdown files changed

## Notifications
If any 🔴/🟡 event tracking issues or accessibility violations are
found, post a summary to #eng on Slack with the PR link.

If nothing noteworthy is found, report that all checks passed.
```
What this single Check Run Agent replaces:
- A custom ESLint plugin for event tracking patterns
- An accessibility linter (which still can't check semantic labeling contextually)
- A Sentry integration script in your CI pipeline
- A GitHub Action for auto-labeling
- A Slack webhook for notifications
That is five separate tools replaced by one markdown file.
Example 2: API Contract Enforcement
This Check Run Agent verifies API changes are backward-compatible and properly documented:
```markdown
---
title: API Contract Check
model: claude-opus-4-6
effort: high
input: full_diff
tools:
  - browse_code
  - git_tools
  - modify_pr
exclude:
  - "*.css"
  - "*.mdx"
  - "tests/**"
---
Check this PR for API contract compliance:

1. If any API endpoint is added or modified, verify the OpenAPI spec
   is updated to match.
2. If request/response types changed, check whether existing clients
   would break (backward compatibility).
3. If a new endpoint is added, verify it follows our naming convention
   (kebab-case paths, plural resource names).
4. Check that all new endpoints have rate limiting middleware applied.

Format findings as a table:

| File | Issue | Severity | Suggestion |
```
Example 3: Security Review with Merge Blocking
This Check Run Agent performs a security review and blocks merges when critical issues are found:
```markdown
---
title: Security Review
model: claude-opus-4-6
reasoning: high
effort: high
input: full_diff
conclusion: failure
tools:
  - browse_code
  - git_tools
  - modify_pr
---
Perform a security review of this PR:

- Check for hardcoded secrets, API keys, or credentials
- Verify authentication middleware on new routes
- Check for SQL injection, XSS, or command injection vectors
- Verify input validation on user-facing endpoints
- Flag any new dependencies and check for known vulnerabilities

If any HIGH severity issue is found, the check MUST fail.

Format: 🔴 HIGH / 🟡 MEDIUM / 🟢 LOW with file paths and line numbers.
```
The `conclusion: failure` setting is critical here — it makes the check run fail in GitHub's Checks UI. Combined with branch protection rules, this blocks merges until the security issues are resolved.
How Do Check Run Agents Compare to Other Tools?
Check Run Agents vs. GitHub Agentic Workflows
| | Check Run Agents | GitHub Agentic Workflows |
|---|---|---|
| Purpose | Pull request review enforcement | General repository automation |
| Definition format | Markdown (.macroscope/*.md) | YAML (.github/workflows/) |
| Runs on | Macroscope infrastructure | GitHub Actions (consumes minutes) |
| Code review context | Native access to diff, codebase graph, review history | Must reconstruct from Actions context |
| External integrations | 10+ built-in (Slack, Sentry, PostHog, etc.) | Via Actions marketplace or custom scripts |
| Status | Beta | Technical preview |
GitHub Agentic Workflows are powerful for broad repository automation — issue triage, documentation updates, CI failure analysis. Check Run Agents are purpose-built for PR review enforcement with native code review context.
Check Run Agents vs. CodeRabbit Custom Rules
| | Check Run Agents | CodeRabbit Custom Rules |
|---|---|---|
| Definition | Natural language instructions per agent | YAML config file (.coderabbit.yaml) |
| Agent behavior | Autonomous investigation — browses code, queries git, calls external services | Configuration-based — adjusts review behavior per path |
| External integrations | Slack, Sentry, PostHog, LaunchDarkly, BigQuery, Jira, Linear, MCP | None from custom rules |
| Actions | Post comments, add labels, update PR, send Slack messages | Post review comments |
| Granularity | Unlimited agents per repo, each with different tools/scope | One config file per repo |
CodeRabbit's custom rules tell the tool "be stricter about security in the auth/ directory." Check Run Agents say "for every modified file in auth/, look up the last 90 days of Sentry errors, cross-reference with the deployment log, check the relevant LaunchDarkly flag state, and post a summary to #security on Slack."
Check Run Agents vs. Custom GitHub Actions
| | Check Run Agents | Custom GitHub Actions |
|---|---|---|
| Setup time | Minutes (write markdown) | Hours to days (write/maintain code) |
| Maintenance | Update markdown file | Update code, prompts, error handling, model APIs |
| Infrastructure | Managed by Macroscope | Self-managed (Actions compute, API keys, secrets) |
| Model routing | Automatic (choose via frontmatter) | Manual (manage API keys, handle model changes) |
| Tool orchestration | Built-in (10+ integrations) | Build it yourself |
| Cost management | Built into Macroscope billing | Track separately (Actions minutes + model API costs) |
You can build custom AI review logic in GitHub Actions. Open-source projects like claude-pr-reviewer and PR-Agent take this approach. The trade-off is ongoing maintenance of infrastructure, prompt engineering, model selection, output formatting, and error handling. With Check Run Agents, you maintain a markdown file. Macroscope handles everything else.
Check Run Agents vs. Greptile
| | Check Run Agents | Greptile |
|---|---|---|
| Custom checks | Unlimited agents, each with full tool access | Learns from team's PR comments over time |
| External integrations | Slack, Sentry, PostHog, LaunchDarkly, BigQuery, Jira, Linear, MCP | None from custom rules |
| Definition | Explicit markdown instructions per check | Implicit learning from reviewer behavior |
| Determinism | Same instructions = consistent enforcement | Behavior drifts as it learns new patterns |
| GitHub integration | Native check runs in Checks tab | PR comments only |
| Pricing | Usage-based (free in beta) | Seat-based |
Greptile takes a different approach — it learns your team's standards by observing PR comments over time. The upside is zero configuration. The downside is you can't explicitly define what gets checked, and the learned behavior can drift. Check Run Agents give you explicit, auditable control: each agent's instructions are a markdown file in your repo that anyone on the team can read, review, and update.
For teams evaluating Greptile alternatives or CodeRabbit alternatives for GitHub PR review, Check Run Agents offer a fundamentally different approach: explicit agentic checks with tool access rather than implicit pattern learning or configuration toggles.
Check Run Agents vs. Semgrep and Static Analysis
Check Run Agents are complementary to static analysis, not a replacement. Semgrep, ESLint, and golangci-lint are fast, deterministic, and excellent for pattern-based rules. Use them for what they're good at — import ordering, no `console.log` in production, no `eval()`, formatting enforcement.
Check Run Agents handle what static analysis cannot:
- Business logic validation — "Does this payment flow handle all currency edge cases?"
- Cross-system verification — "Are there active Sentry errors for this file?"
- Contextual judgment — "Is this architectural change consistent with the team's migration plan?"
- Natural-language standards — "Does this PR follow our API naming conventions?"
The best engineering setups use both: static analysis for deterministic rules, Check Run Agents for investigative checks.
How Do Check Run Agents Fit into GitHub Code Review?
Check Run Agents integrate directly into the GitHub pull request review workflow that your team already uses. When a developer opens a GitHub PR, Macroscope's AI code review runs automatically — the built-in Correctness and Approvability checks plus any custom Check Run Agents you've defined. Results appear in the same Checks tab as your CI tests, linting, and deployment checks.
This means your GitHub code review process becomes a three-layer system:
- AI code review (Macroscope built-in) — Catches runtime bugs, logic errors, and evaluates merge readiness across the full codebase graph. This is the best AI code reviewer for catching issues that span multiple files and functions.
- Custom Check Run Agents — Enforce team-specific standards, cross-reference external systems, and automate judgment calls that no linter or CI script can handle.
- Human review — Engineers focus on architecture, design decisions, and business logic — the high-value work that AI can't replace.
For teams searching for the best AI code review tool or evaluating CodeRabbit alternatives and Greptile alternatives, Check Run Agents are the key differentiator. No other GitHub code reviewer gives you this level of customizable, agentic enforcement with integrated access to Sentry, Slack, PostHog, and your issue tracker.
How to Write Effective Check Run Agent Instructions
The quality of a Check Run Agent's output depends on the quality of your instructions. Here are the patterns that work:
Be specific, not vague. "Check for security issues" is too broad. "Check for SQL injection in any function that takes user input and constructs a database query" gives the agent a clear target.
Define severity levels explicitly. "🔴 means this will break in production. 🟡 means it might cause issues under specific conditions. 🟢 means it's a suggestion for improvement."
Scope aggressively with `exclude`. A web review agent doesn't need to process Go backend files. An API contract check doesn't need to look at CSS. Use `exclude` to keep agents focused and costs low.
Permit "nothing found" reports. Explicitly tell the agent it's okay to report that all checks passed. Without this, agents sometimes stretch to find issues that aren't there.
Use markdown headings to organize. Each heading becomes a distinct investigation area. The agent treats `## Event Tracking` and `## Accessibility` as separate tasks within the same check run.
Don't duplicate Correctness. Macroscope's built-in Correctness check already catches runtime bugs and logic errors. Your custom Check Run Agents should focus on team-specific standards.
Describe output format. Want a markdown table? A checklist? Emoji-coded severity? Tell the agent. "Format findings as a table with columns: File, Issue, Severity, Suggestion."
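Putting several of these tips together, the instruction body of an agent might read as follows. The wording is illustrative, not a required template:

```markdown
## Severity Definitions
🔴 means this will break in production. 🟡 means it might cause
issues under specific conditions. 🟢 means it's a suggestion.

## Output Format
Format findings as a table with columns: File, Issue, Severity,
Suggestion. If nothing is found, report that all checks passed —
do not stretch to find issues.
```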
What Is Agentic CI?
Agentic CI is a new approach to continuous integration where checks can investigate, reason, and take action — not just execute scripts. Traditional CI is procedural: run this script, check this condition, pass or fail. Agentic CI is investigative: here's what I care about, go figure out if this PR is safe.
Check Run Agents are the first implementation of agentic CI for pull request review. Instead of encoding standards as code (fragile, expensive to maintain, limited to pattern matching), you express standards as natural language instructions and give the agent the tools to enforce them.
This matters especially as coding agents produce an increasing share of pull requests. When AI writes the code, you need AI that can review it with the same depth and judgment a senior engineer would bring. Not pattern matching. Not rule checking. Actual investigation with access to the full context of your codebase, your production systems, and your team's standards.
Check Run Agents are how you encode your team's institutional knowledge into your CI pipeline — in plain English, with the tools to actually enforce it.
How to Get Started with Check Run Agents
Getting started with Check Run Agents takes less than five minutes:
- Sign up for Macroscope at macroscope.com and install the GitHub App. Every workspace gets $100 in free usage.
- Create a `.macroscope/` directory in your repository root.
- Add your first agent — start simple. A PR labeling agent or changelog enforcer is a good first agent.
- Commit to your default branch — the agent starts running on the next PR automatically.
- Iterate — refine instructions based on the agent's output. Check Run Agents improve as you sharpen your instructions.
Check Run Agents are currently in beta. Contact support@macroscope.com or book a demo for access.
Frequently Asked Questions
What are Check Run Agents?
Check Run Agents are fully customizable AI agents from Macroscope that run as GitHub check runs on every pull request. You define what to check in a markdown file inside .macroscope/ using plain English instructions, and the agent investigates the PR diff, browses your codebase, queries external tools like Sentry, Slack, and PostHog, and reports findings in the GitHub Checks tab, as inline PR comments, and as top-level PR comments.
How do I create a Check Run Agent?
Create a .md file in the .macroscope/ directory at your repository root. Add optional YAML frontmatter to configure the model, effort level, input mode, and tools. Write your review instructions in natural language below the frontmatter. Commit to your default branch. The agent starts running on the next pull request.
What tools can Check Run Agents use?
Check Run Agents have default access to browse_code (file tree, search), git_tools (log, blame, diff, grep), github_api_read_only (issues, labels, PR metadata), and modify_pr (update PR, post comments). Optional tools include slack, sentry, posthog, launchdarkly, bigquery, amplitude, gcp_cloud_logging, issue_tracking_tools (Jira/Linear), web_tools, and mcp (any MCP-compatible server).
Can Check Run Agents block PR merges?
Yes. Set `conclusion: failure` in the agent's frontmatter. When the agent finds critical issues, the check run fails in GitHub's Checks UI. Combined with GitHub branch protection rules that require check runs to pass, this blocks merges until issues are resolved. The default `conclusion: neutral` reports findings without blocking.
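A minimal merge-blocking agent might look like this sketch (the title and instruction wording are illustrative):

```markdown
---
title: Security Review
conclusion: failure
---
Fail this check if any hardcoded secret or missing authentication
middleware is found. Otherwise, report that the check passed.
```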
How are Check Run Agents different from GitHub Agentic Workflows?
GitHub Agentic Workflows are general-purpose repository automation defined in YAML that runs on GitHub Actions infrastructure. Check Run Agents are purpose-built for pull request review enforcement — defined in markdown, running on Macroscope's infrastructure (no Actions minutes consumed), with native access to the code review context including the diff, codebase graph, and review history.
How are Check Run Agents different from CodeRabbit custom rules?
CodeRabbit custom rules adjust the tool's review behavior via a .coderabbit.yaml configuration file — for example, being stricter about security in certain directories. Check Run Agents are autonomous AI investigators that can browse code, query git history, call external services like Sentry, Slack, PostHog, and LaunchDarkly, and take actions like adding labels, posting to Slack, or creating Jira issues. Each Check Run Agent is a full AI agent, not a configuration toggle.
How are Check Run Agents different from Semgrep or ESLint?
Check Run Agents are complementary to static analysis tools like Semgrep and ESLint, not a replacement. Static analysis is fast and deterministic — ideal for pattern-based rules like import ordering and formatting. Check Run Agents handle what static analysis cannot: business logic validation, cross-system verification (checking Sentry errors, PostHog analytics, LaunchDarkly flags), contextual judgment, and natural-language standards that resist formalization as regex or AST rules.
How much do Check Run Agents cost?
Check Run Agents are currently free during beta. When generally available, they will be billed under Macroscope's Agent usage meter as part of Macroscope's usage-based pricing. Manage costs by using `exclude` patterns to skip irrelevant files, choosing `full_diff` over `code_object` input mode, and selecting appropriate `effort` and `reasoning` levels for each agent.
What AI models do Check Run Agents use?
The default model is Claude Opus 4.6. You can also select Claude Opus 4.5 or GPT-5.2 via the `model` frontmatter field. The `reasoning` field controls extended thinking depth (`off`, `low`, `medium`, `high`) — use higher reasoning for complex security reviews and nuanced judgment calls.
Can I have multiple Check Run Agents on one repository?
Yes. Every .md file in .macroscope/ becomes a separate, independent check run. A repository can have a security review agent, an accessibility agent, a changelog enforcer, an API contract checker, and a PR labeling agent all running in parallel on every pull request.
Do Check Run Agents work with monorepos?
Yes. Use the exclude frontmatter field with glob patterns to scope each Check Run Agent to relevant directories. For example, a frontend review agent can exclude "*.go", "*.proto", and "services/**" to focus only on web files, while a backend agent excludes "*.tsx" and "*.css".
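For instance, a frontend-only agent in a monorepo might be scoped like this (the title, glob patterns, and instruction wording are illustrative):

```markdown
---
title: Frontend Review
exclude:
  - "*.go"
  - "*.proto"
  - "services/**"
---
Review the changed web files against our frontend standards.
```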
What are the two input modes for Check Run Agents?
Check Run Agents support two input modes. `full_diff` processes the entire PR diff in one agent — lower cost, best for PR-level checks like "is the changelog updated?" `code_object` runs up to 20 agents in parallel, one per changed code object (function, class, method) — higher cost, better for per-unit enforcement like "does every new function have error handling?"
Are Check Run Agents a good CodeRabbit alternative?
Yes. Teams evaluating CodeRabbit alternatives often choose Macroscope because Check Run Agents offer capabilities that CodeRabbit's custom rules cannot match — including autonomous codebase investigation, external service access (Sentry, Slack, PostHog, LaunchDarkly), and the ability to take actions like adding labels, posting to Slack, or creating Jira issues. CodeRabbit's custom rules adjust review behavior via configuration. Check Run Agents are full AI agents that investigate, reason, and act. See the Macroscope vs CodeRabbit comparison for a full breakdown.
Are Check Run Agents a good Greptile alternative?
For teams evaluating Greptile alternatives, Check Run Agents offer explicit, auditable enforcement rather than implicit pattern learning. Greptile learns your team's standards by observing PR comment behavior over time — no configuration needed, but no explicit control either. Check Run Agents let you define exactly what gets checked in plain English, with access to external tools. Both approaches have merit; Check Run Agents are better for teams that want deterministic, documented enforcement standards.
Can Check Run Agents fix code automatically?
Check Run Agents focus on review and enforcement. For automatic code fixes, Macroscope's built-in Correctness check includes Fix It For Me — an AI code fixer that automatically opens fix PRs for detected bugs and iterates until CI passes. Check Run Agents and Fix It For Me work together: agents find issues, Fix It For Me resolves them.
Where do Check Run Agent results appear?
Check Run Agent findings appear in three places on GitHub: (1) the Check run details page in the Checks tab with the full investigation report, (2) inline PR comments as line-level annotations on specific diff hunks, and (3) top-level PR issue comments for broader findings and summaries. You control the output format in your agent's instructions.
