
Sigilix

Sigilix is a hosted GitHub App for multi-agent AI code review: the mark of merge-ready code. Install it once, and a review fires automatically on every pull request. Believability here is architectural, not a prompt: a finding cannot post unless it earns its way through a sequence of gates.

Every pull request is read in parallel by four domain specialists: Metis (logic and architecture), Argus (security), Iris (performance), and Eunomia (tests). Harmonia, the synthesizer, then unifies their output: it cross-checks their findings, deduplicates noise, calibrates severity, and posts a single coherent review.

Before the LLM specialists fire, a deterministic layer scans the diff for secrets, dependency vulnerabilities, AST rule violations, and any regex rules you’ve defined in sigilix.json. Those findings are injected into the specialists as authoritative facts. And before any specialist judges a line, Sigilix expands context by pulling callers, dependencies, and symbols from the code graph, the repo index, and review memory, so the diff is judged against the whole repository, not a window. The output looks like one comment from a senior engineer who’s read everything carefully.
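To make the deterministic regex pass concrete, here is a minimal sketch of scanning the lines a diff adds against a small rule set. It is an illustration only: the rule shape, the rule ids, the `tok_` token format, and the names `DeterministicFinding` and `scan_added_lines` are all hypothetical, and the real sigilix.json schema is not shown on this page.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class DeterministicFinding:
    rule_id: str
    path: str
    line: int
    text: str


# Hypothetical rules; the actual sigilix.json schema may look different.
RULES = [
    {"id": "no-hardcoded-token", "pattern": r"tok_[0-9a-f]{12}"},
    {"id": "no-print-debug", "pattern": r"^\s*print\("},
]


def scan_added_lines(path, added_lines, rules=RULES):
    """Match every added line against every rule.

    added_lines is an iterable of (line_number, text) pairs for lines the
    diff adds. No model is involved, so the same input always yields the
    same findings.
    """
    compiled = [(rule["id"], re.compile(rule["pattern"])) for rule in rules]
    findings = []
    for number, text in added_lines:
        for rule_id, pattern in compiled:
            if pattern.search(text):
                findings.append(
                    DeterministicFinding(rule_id, path, number, text.strip())
                )
    return findings


findings = scan_added_lines(
    "billing/charge.py",
    [
        (10, 'api_token = "tok_0123456789ab"'),
        (11, "total_cents = price_cents * qty"),
        (12, "    print(total_cents)"),
    ],
)
for f in findings:
    print(f.rule_id, f.path, f.line)
```

Because each finding carries a rule id and an exact line, a list like this could be handed to the specialists as fixed facts rather than something for them to rediscover.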
Sigilix is in private beta. Join our private beta →

Quickstart

Install the GitHub App, point it at a repo, and get your first review in minutes.

The Believability Pipeline

The five gates a finding must clear before it can post — evidence, provenance, refute/execute, proof-tier receipts, memory.

The Dispatcher

How one webhook routes to many pipelines — overview, ensemble review, CI triage, issue triage, Q&A.

Conversational Learnings

Teach Sigilix a rule in plain language and watch it apply — and attribute — that rule in later reviews.

The five specialists

Sigilix doesn’t run one generalist model. It runs four domain specialists in parallel, each focused on what it does best, unified by a fifth role — the synthesizer.

Metis · Logic & Architecture

Correctness, control flow, edge cases, architectural fit.

Argus · Security

Injection, authz gaps, secret exposure, unsafe patterns. Security never silently skips.

Iris · Performance

Hot paths, allocations, N+1s, algorithmic regressions.

Eunomia · Tests

Coverage gaps, deleted assertions, weak or misleading test changes.
Each specialist runs a model tuned to its role — a reasoning-heavy model for logic, a faster high-volume model for security — behind a cross-provider fallback so one provider’s outage can’t silence a specialist. Harmonia runs after the four, cross-checks their findings against each other, drops duplicates, calibrates severity, and writes the single review you actually read.
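The fan-out, fallback, and synthesis steps can be sketched roughly as follows. This is a simplified illustration, not Sigilix's implementation: the provider callables, the `ProviderDown` exception, and deduplicating on `(path, line)` are assumptions made for the example, and real deduplication is likely to be more nuanced.

```python
import asyncio
from dataclasses import dataclass

SEVERITY_RANK = {"Info": 0, "Warning": 1, "Critical": 2}


@dataclass(frozen=True)
class Finding:
    specialist: str
    path: str
    line: int
    severity: str
    message: str


class ProviderDown(Exception):
    """Raised by a (hypothetical) provider client when it is unavailable."""


async def call_with_fallback(providers, diff):
    # Try each provider in order; the first one that answers wins.
    last_error = None
    for provider in providers:
        try:
            return await provider(diff)
        except ProviderDown as err:
            last_error = err
    raise RuntimeError("every provider failed") from last_error


def synthesize(findings):
    # Crude dedup (an assumption): one finding per (path, line),
    # keeping the highest severity, then sorted most severe first.
    best = {}
    for f in findings:
        key = (f.path, f.line)
        if key not in best or SEVERITY_RANK[f.severity] > SEVERITY_RANK[best[key].severity]:
            best[key] = f
    return sorted(best.values(), key=lambda f: (-SEVERITY_RANK[f.severity], f.path, f.line))


async def review(diff, specialists):
    # specialists maps a name to its ordered provider list; all run concurrently.
    batches = await asyncio.gather(
        *(call_with_fallback(providers, diff) for providers in specialists.values())
    )
    return synthesize([f for batch in batches for f in batch])


# Demo with stand-in providers instead of real model calls.
async def primary_down(diff):
    raise ProviderDown("primary unavailable")


async def argus_backup(diff):
    return [Finding("Argus", "api/users.py", 42, "Critical", "SQL built by string concatenation")]


async def metis(diff):
    return [
        Finding("Metis", "api/users.py", 42, "Warning", "Query result is never checked for None"),
        Finding("Metis", "api/users.py", 7, "Info", "Unused import"),
    ]


merged = asyncio.run(review("<diff text>", {
    "Metis": [metis],
    "Argus": [primary_down, argus_backup],
}))
for f in merged:
    print(f.severity, f.specialist, f.path, f.line)
```

In the demo, the Argus primary provider fails and the backup answers, so the security finding still arrives. The two findings on line 42 collapse into one at the higher severity.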

Why findings are believable

Every posted finding earns its place by clearing the five-stage believability pipeline:
  1. Evidence: a finding must point at concrete evidence in the diff or surrounding code, not a vibe.
  2. Provenance contracts: the finding must cite the exact code it’s about; claims that can’t be tied to real lines are rejected.
  3. Refute / execute: Sigilix tries to refute the finding and, where it can, checks the claim by execution before letting it stand.
  4. Proof-tier receipts: every survivor carries a tier pill: VERIFIED (checked by execution or a signed receipt), GROUNDED (anchored to cited code and evidence), or MODEL (model judgment).
  5. Memory: past reviews, dismissals, and learned rules shape what posts next time, so the system gets quieter and sharper on your repo.
This replaces any single numeric confidence score: instead of “4 out of 5,” you see why a finding is trustworthy and what kind of evidence backs it.
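One way to picture the five stages is as a chain of checks that either drops a candidate finding or returns its proof tier. The sketch below is an assumption-laden illustration: the `Candidate` fields, the `clear_gates` function, and in particular the rule used to choose between GROUNDED and MODEL are hypothetical, since this page does not specify them.

```python
from dataclasses import dataclass, field
from enum import Enum


class Tier(Enum):
    VERIFIED = "VERIFIED"  # checked by execution or a signed receipt
    GROUNDED = "GROUNDED"  # anchored to cited code and evidence
    MODEL = "MODEL"        # model judgment


@dataclass
class Candidate:
    message: str
    evidence: list = field(default_factory=list)     # snippets backing the claim
    cited_lines: list = field(default_factory=list)  # (path, line) pairs
    refuted: bool = False            # hypothetical: outcome of the refutation attempt
    executed_ok: bool = False        # hypothetical: claim confirmed by execution
    quotes_cited_code: bool = False  # hypothetical: evidence matches the cited lines


def clear_gates(c, known_lines, dismissed):
    """Return the proof tier if the candidate may post, or None if a gate drops it."""
    if not c.evidence:                                            # 1. Evidence
        return None
    if not c.cited_lines or not set(c.cited_lines) <= known_lines:  # 2. Provenance
        return None
    if c.refuted:                                                 # 3. Refute / execute
        return None
    if c.executed_ok:                                             # 4. Proof tier
        tier = Tier.VERIFIED
    elif c.quotes_cited_code:
        tier = Tier.GROUNDED
    else:
        tier = Tier.MODEL
    if c.message in dismissed:                                    # 5. Memory
        return None
    return tier


known_lines = {("pay.py", 10), ("pay.py", 11)}
dismissed = {"Prefer f-strings"}

a = Candidate("Division by zero when qty is 0", ["total / qty"], [("pay.py", 10)], executed_ok=True)
b = Candidate("Possible race on cache", ["cache[key] = v"], [("pay.py", 99)])
c = Candidate("Prefer f-strings", ["'%s' % x"], [("pay.py", 11)], quotes_cited_code=True)
d = Candidate("Loop may be quadratic", ["for x in xs"], [("pay.py", 11)])

results = [clear_gates(x, known_lines, dismissed) for x in (a, b, c, d)]
print(results)
```

Here `a` posts as VERIFIED, `b` is dropped because it cites a line that does not exist, `c` is dropped because the team dismissed it before, and `d` posts as MODEL.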

One webhook, many pipelines

A single GitHub or Linear event doesn’t always deserve the full ensemble. The dispatcher routes each event to the pipeline it earns — a PR overview, a full ensemble review, CI-failure triage, issue triage, describe/improve, or an @sigilix Q&A. Cheap passes like the overview don’t wait behind the heavy ensemble, so fast feedback stays fast. See The Dispatcher.
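As a rough sketch of the routing idea, the function below maps one incoming event to the pipelines it should trigger. The event shape and the `kind` values are hypothetical simplifications and are not GitHub or Linear webhook names; the pipeline names come from the list above.

```python
from enum import Enum


class Pipeline(Enum):
    OVERVIEW = "overview"
    ENSEMBLE = "ensemble review"
    CI_TRIAGE = "CI-failure triage"
    ISSUE_TRIAGE = "issue triage"
    DESCRIBE = "describe/improve"
    QA = "Q&A"


def route(event):
    """Map one incoming event (a plain dict here) to the pipelines it triggers."""
    kind = event.get("kind")
    if "@sigilix" in event.get("comment", ""):
        return [Pipeline.QA]
    if kind == "pr_opened":
        # Cheap pass listed first; each pipeline would be scheduled on its own.
        return [Pipeline.OVERVIEW, Pipeline.ENSEMBLE]
    if kind == "pr_describe_requested":
        return [Pipeline.DESCRIBE]
    if kind == "ci_failed":
        return [Pipeline.CI_TRIAGE]
    if kind == "issue_opened":
        return [Pipeline.ISSUE_TRIAGE]
    return []  # Unknown events trigger nothing.


print(route({"kind": "pr_opened"}))
print(route({"kind": "comment", "comment": "@sigilix why is this flagged?"}))
```

Returning a list rather than a single pipeline leaves room for the overview and the ensemble to start independently, so the cheap pass does not queue behind the heavy one.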

The earned context layer

Every review deposits a reusable, verified understanding of your repo: the index, the code graph, the trust ledger, review memory, and evidence manifests. That earned context is what the Sigilix CLI, Deep-Research Chat, and Triage all draw on, so each review makes the next one cheaper and more grounded.

Conversational Learnings

Disagree with a finding? Reply in plain language — @sigilix remember we use integer cents here — and Sigilix records the rule, applies it judiciously in later reviews, and attributes it inline (“applied because of a learned rule”). A “Learned something new” footer confirms capture; @sigilix forget … walks it back. See Conversational Learnings.
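The remember/forget flow can be sketched as a small command parser plus a rule store. Everything here is illustrative: `parse_learning`, `RuleStore`, and the substring matching used for forget are assumptions, and how Sigilix actually matches a forget request to a stored rule is not described on this page.

```python
import re

# Matches "@sigilix remember <rule>" or "@sigilix forget <text>" anywhere in a comment.
COMMAND = re.compile(r"@sigilix\s+(remember|forget)\s+(.+)", re.IGNORECASE | re.DOTALL)


def parse_learning(comment):
    """Return (action, text) for a remember/forget command, or None otherwise."""
    match = COMMAND.search(comment)
    if match is None:
        return None
    # Collapse whitespace so multi-line replies become one rule string.
    return match.group(1).lower(), " ".join(match.group(2).split())


class RuleStore:
    def __init__(self):
        self.rules = []

    def handle(self, comment):
        parsed = parse_learning(comment)
        if parsed is None:
            return None
        action, text = parsed
        if action == "remember":
            if text not in self.rules:
                self.rules.append(text)
            return "Learned something new"
        # Assumption: forget removes every rule containing the given text.
        before = len(self.rules)
        self.rules = [r for r in self.rules if text.lower() not in r.lower()]
        return f"Forgot {before - len(self.rules)} rule(s)"


store = RuleStore()
print(store.handle("@sigilix remember we use integer cents here"))
print(store.rules)
print(store.handle("@sigilix forget integer cents"))
print(store.rules)
```

Keeping the stored rule as the reviewer's own sentence is what makes inline attribution possible later: a finding can quote the rule that caused it.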

What Sigilix is not

  • Not a code generator. Sigilix surfaces findings with line references and suggested patches. Your team decides what to merge.
  • Not training on your code. Code is sent to model providers under their data-use commitments and discarded after the review. We don’t fine-tune on customer repositories.

What you’ll see on every PR

When Sigilix reviews a PR, you get one review from sigilix[bot] containing:
  1. A Harmonia summary — what changed, what concerns survived dedup, and whether the verdict is Approved or Request changes.
  2. Inline findings anchored to specific lines, tagged by specialist (Metis, Argus, Iris, Eunomia) and severity (Critical / Warning / Info), each carrying its proof-tier pill.
  3. Suggested patches for findings where a clean fix is in scope.
The review goes where developers already are: in the PR thread. Beyond the PR, the same engine powers the Sigilix CLI and Deep-Research Chat, with bring-your-own models (Codex CLI, Claude Code, or your own SDK/keys).

Where to start

If you’re evaluating Sigilix, skim the believability pipeline to understand the depth claim, then install on a single repo to see it review a real PR. If you’re already installed and need to tune behavior, head to Configuration.