This page traces a single ensemble review end-to-end so you know exactly what happens between a git push and a sigilix[bot] comment.

The dispatcher: one webhook → many pipelines

Sigilix does not run the same pipeline for every event. Each incoming GitHub or Linear webhook is first routed by the dispatcher to the pipeline that event deserves:
| Event | Pipeline |
| --- | --- |
| PR opened / first look | PR overview (cheap, fast) |
| PR opened / synchronize | Ensemble review (the five specialists; this page) |
| CI failure | CI-failure triage (grounded inline root cause) |
| Issue opened (Linear / Jira) | Issue triage (title, priority, estimate, labels, linked PR) |
| /sigilix describe / /sigilix improve | Describe / improve |
| @sigilix <question> | Grounded Q&A |
| @sigilix remember / forget | Conversational Learnings capture |
Cheap passes don’t wait behind the full ensemble — the overview can post while the deeper review runs. See The Dispatcher for the full routing table.
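The routing table above can be sketched as a pure function from event to pipelines. This is a minimal illustration, not the actual dispatcher: the `IncomingEvent` shape, the pipeline names, and the `dispatch` function are all hypothetical, and only the event-to-pipeline mapping comes from this page.

```typescript
// Hypothetical pipeline names and event shape; only the mapping mirrors the table.
type Pipeline =
  | "pr-overview"
  | "ensemble-review"
  | "ci-failure-triage"
  | "issue-triage"
  | "describe-improve"
  | "grounded-qa"
  | "learnings-capture";

interface IncomingEvent {
  kind: "pr" | "ci_failure" | "issue_opened" | "comment";
  action?: string;      // e.g. "opened" or "synchronize" for PR events
  commentBody?: string; // present for comment events
}

const ENSEMBLE_ACTIONS = ["opened", "synchronize", "reopened", "ready_for_review"];

// One event may map to several pipelines: a newly opened PR gets both the
// cheap overview and the full ensemble review.
function dispatch(ev: IncomingEvent): Pipeline[] {
  if (ev.kind === "ci_failure") return ["ci-failure-triage"];
  if (ev.kind === "issue_opened") return ["issue-triage"];
  if (ev.kind === "comment") {
    const body = (ev.commentBody ?? "").trim();
    if (/^\/sigilix (describe|improve)\b/.test(body)) return ["describe-improve"];
    if (/^@sigilix (remember|forget)\b/.test(body)) return ["learnings-capture"];
    if (/^@sigilix\s+\S/.test(body)) return ["grounded-qa"];
    return [];
  }
  const pipelines: Pipeline[] = [];
  if (ev.action === "opened") pipelines.push("pr-overview");
  if (ev.action !== undefined && ENSEMBLE_ACTIONS.indexOf(ev.action) !== -1) {
    pipelines.push("ensemble-review");
  }
  return pipelines;
}
```

Returning a list rather than a single pipeline is what lets the overview post without waiting behind the ensemble.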

Trigger conditions (ensemble review)

The ensemble review pipeline fires on these GitHub webhook events:
| Event | When it fires |
| --- | --- |
| pull_request.opened | New PR created |
| pull_request.synchronize | New commit pushed to the PR branch |
| pull_request.reopened | A closed PR is reopened |
| pull_request.ready_for_review | A draft PR is converted to ready |
Mention triggers (@sigilix on a PR comment thread) are routed by the dispatcher to a separate, lighter pipeline that posts a single conversational reply — not a full review.

The pipeline stages

0. Dispatcher routes the webhook → ensemble-review pipeline

1. Webhook received

2. Pre-flight: dedupe + stale-head guard

3. Diff fetch + reviewable-position parse

4. Incremental-review check (synchronize only)

5. Router: which specialists fire?

6. Deterministic pre-LLM layer (secrets, AST, deterministicChecks)

7. Hunk-level diff packing (for large PRs)

8. Context before judgment (code graph, index, review memory)

9. Specialist fan-out (parallel, size-scaled budget)

10. Quorum check

11. Synthesizer (Harmonia) — grounding gate, proof-tier receipts, memory calibration

12. Post-validation (provenance + line check)

13. Stale-head guard #2

14. Submit to GitHub (inline anchors, anchorless fallback if rejected)

15. Telemetry + feedback-signal recording

1. Webhook received

GitHub POSTs a webhook payload to Sigilix. Sigilix verifies the HMAC signature against GITHUB_WEBHOOK_SECRET and rejects unsigned or wrong-signed payloads with 401.
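As an illustration of the signature check, the sketch below verifies GitHub's `X-Hub-Signature-256` header, which carries `sha256=` followed by the hex HMAC-SHA256 of the raw request body. The function name `verifySignature` is hypothetical; a caller would respond with 401 whenever it returns `false`.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Returns true only when the header matches the HMAC of the raw body.
// `secret` would be the value of GITHUB_WEBHOOK_SECRET.
function verifySignature(
  rawBody: string,
  signatureHeader: string | undefined,
  secret: string,
): boolean {
  if (!signatureHeader || !signatureHeader.startsWith("sha256=")) return false;
  const expected =
    "sha256=" + createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
  const received = Buffer.from(signatureHeader, "utf8");
  const wanted = Buffer.from(expected, "utf8");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === wanted.length && timingSafeEqual(received, wanted);
}
```

The HMAC must be computed over the raw bytes as received; re-serializing parsed JSON can change whitespace and break the comparison.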

2. Pre-flight: dedupe + stale-head guard

Sigilix checks KV for an existing review on (prNumber, headSha). If one is found, the new webhook is a duplicate (e.g., GitHub redelivered it) and is skipped. The stale-head guard then fetches the PR’s current head SHA from GitHub. If the webhook’s SHA doesn’t match, the push has been superseded: the review is aborted, and the webhook for the newer push triggers its own review.
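The two pre-flight checks can be expressed as one small decision function. This is a sketch under assumptions: a `Set` stands in for the KV store, the key format is invented, and the current head SHA is passed in rather than fetched.

```typescript
type PreflightResult = "proceed" | "skip-duplicate" | "abort-stale";

// Hypothetical key format; the page only says the lookup is on (prNumber, headSha).
const reviewKey = (prNumber: number, headSha: string): string =>
  `review:${prNumber}:${headSha}`;

function preflight(
  seen: Set<string>,      // stand-in for the KV store of completed reviews
  prNumber: number,
  webhookSha: string,     // head SHA carried by the webhook payload
  currentHeadSha: string, // head SHA freshly fetched from GitHub
): PreflightResult {
  // Dedupe first: a redelivered webhook for an already-reviewed SHA is skipped.
  if (seen.has(reviewKey(prNumber, webhookSha))) return "skip-duplicate";
  // Stale-head guard: a newer push superseded this one.
  if (webhookSha !== currentHeadSha) return "abort-stale";
  return "proceed";
}
```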

3. Diff fetch + reviewable-position parse

Sigilix fetches the unified diff via the GitHub Pulls API. The diff is parsed into a map of path → reviewable positions, where each position records { line, side: LEFT|RIGHT }. This map is used later to validate inline anchors.
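A minimal sketch of such a parser follows. It assumes standard unified-diff input and makes two simplifications that are not taken from the source: context lines are recorded on the RIGHT side only, and deleted files (`+++ /dev/null`) are skipped. The names `Position` and `parseReviewablePositions` are illustrative.

```typescript
type Side = "LEFT" | "RIGHT";
interface Position { line: number; side: Side }

function parseReviewablePositions(diff: string): Map<string, Position[]> {
  const out = new Map<string, Position[]>();
  let path: string | null = null;
  let left = 0;   // next line number in the old file
  let right = 0;  // next line number in the new file
  let inHunk = false;

  for (const raw of diff.split("\n")) {
    if (raw.startsWith("diff --git ")) { inHunk = false; path = null; continue; }
    if (!inHunk && raw.startsWith("+++ ")) {
      const target = raw.slice(4);
      path = target === "/dev/null" ? null : target.replace(/^b\//, "");
      if (path !== null && !out.has(path)) out.set(path, []);
      continue;
    }
    const hunk = /^@@ -(\d+)(?:,\d+)? \+(\d+)(?:,\d+)? @@/.exec(raw);
    if (hunk) { left = Number(hunk[1]); right = Number(hunk[2]); inHunk = true; continue; }
    if (!inHunk || path === null) continue;

    const positions = out.get(path);
    if (!positions) continue;
    if (raw.startsWith("+")) {
      positions.push({ line: right++, side: "RIGHT" });
    } else if (raw.startsWith("-")) {
      positions.push({ line: left++, side: "LEFT" });
    } else if (raw.startsWith(" ")) {
      positions.push({ line: right++, side: "RIGHT" });
      left++; // context lines advance both counters
    }
    // "\ No newline at end of file" markers and other lines are ignored.
  }
  return out;
}
```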

4. Incremental-review check (synchronize only) — ARC-194

When a pull_request.synchronize event arrives (a new commit pushed to an open PR), Sigilix checks whether the previous review’s findings are still relevant. Findings that point at lines unchanged since the last review are carried forward with a “still applies” marker rather than being re-discovered. New code added in this push is reviewed fresh. Incremental review avoids the re-flag loop where the same finding surfaces on every push.
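The carry-forward step might look like the sketch below. It is deliberately simplified: it assumes line numbers do not shift between pushes, whereas a real implementation would need to remap them. The `Finding` shape and the `changedLines` input are hypothetical.

```typescript
interface Finding {
  path: string;
  line: number;
  message: string;
  stillApplies?: boolean; // the "still applies" marker
}

// changedLines: path -> line numbers touched by the new push (hypothetical input,
// e.g. derived from the diff between the previously reviewed SHA and the new head).
function carryForward(
  previous: Finding[],
  changedLines: Map<string, Set<number>>,
): Finding[] {
  return previous
    .filter((f) => !(changedLines.get(f.path)?.has(f.line) ?? false))
    .map((f) => ({ ...f, stillApplies: true }));
}
```

Findings on touched lines are not carried forward; those lines are part of the new code that gets a fresh review.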

5. Router

The router decides which specialists to fire based on the diff:
  • Docs-only PR (markdown / RST changes, no fenced code blocks with executable content) → no specialists; PR overview only
  • Auth-sensitive paths touched → security forced on
  • Migration / schema files changed → security + logic forced on
  • Lockfile or package.json changes → security forced on (supply-chain)
  • Default → all four domain specialists (Metis, Argus, Iris, Eunomia)
The router runs with conservative gating — when in doubt, it includes the specialist. Diff size is not a routing signal; CI status is not a routing signal.
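The routing rules above could be sketched as follows. The path patterns are guesses made for illustration, and the docs-only test here looks only at file extensions, ignoring the fenced-code-block condition. `route` and `RouteDecision` are hypothetical names.

```typescript
type Specialist = "logic" | "security" | "performance" | "tests";
const ALL_SPECIALISTS: Specialist[] = ["logic", "security", "performance", "tests"];

interface RouteDecision {
  run: Specialist[];    // specialists that will fire
  forced: Specialist[]; // subset that no later gating may switch off
}

function route(paths: string[]): RouteDecision {
  // Simplified docs-only test: every changed file is markdown or RST.
  const docsOnly = paths.length > 0 && paths.every((p) => /\.(md|rst)$/i.test(p));
  if (docsOnly) return { run: [], forced: [] };

  const forced = new Set<Specialist>();
  for (const p of paths) {
    // Hypothetical patterns for "auth-sensitive", "migration / schema", and manifests.
    if (/(^|\/)(auth|session|oauth)/i.test(p)) forced.add("security");
    if (/(^|\/)migrations?\//i.test(p) || /\.sql$/i.test(p)) {
      forced.add("security");
      forced.add("logic");
    }
    if (/(^|\/)(package\.json|package-lock\.json|yarn\.lock|pnpm-lock\.yaml)$/.test(p)) {
      forced.add("security");
    }
  }
  // Conservative default: all four specialists run. Diff size and CI status
  // are deliberately not inputs to this function.
  return { run: ALL_SPECIALISTS.slice(), forced: Array.from(forced) };
}
```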

6. Deterministic pre-LLM layer

Three pre-LLM subsystems run on the diff in parallel:
| Subsystem | What it does |
| --- | --- |
| Secret scanning (ARC-187) | Regex sweep for AWS / Stripe / GitHub / Slack / GCP credentials |
| AST rule packs (ARC-181) | JS/TS AST visitors for a starter rule pack (currently no-eval-call) |
| deterministicChecks (ARC-193) | User-defined regex rules from sigilix.json |
Their findings are injected into the specialist prompts as authoritative facts. The specialists then reason about these findings (where does this key get used? is this eval reachable from a request handler?) instead of having to re-discover them.
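To make the idea concrete, here is a sketch of a regex sweep over added lines whose results are rendered as a facts block for a prompt. The two patterns are the widely known AWS access key ID and GitHub personal access token formats; the `Rule` and `Fact` shapes and both function names are assumptions, not the Sigilix schema.

```typescript
interface Rule { id: string; pattern: RegExp }
interface Fact { ruleId: string; line: number }

// Patterns carry no `g` flag, so RegExp.test stays stateless between calls.
const SECRET_RULES: Rule[] = [
  { id: "aws-access-key-id", pattern: /\bAKIA[0-9A-Z]{16}\b/ },
  { id: "github-pat", pattern: /\bghp_[A-Za-z0-9]{36}\b/ },
];

function scanAddedLines(
  addedLines: { line: number; text: string }[],
  rules: Rule[],
): Fact[] {
  const facts: Fact[] = [];
  for (const { line, text } of addedLines) {
    for (const rule of rules) {
      if (rule.pattern.test(text)) facts.push({ ruleId: rule.id, line });
    }
  }
  return facts;
}

// Renders facts for injection into a specialist prompt. The matched text is
// left out on purpose so the secret itself is not copied into the prompt.
function renderFacts(facts: Fact[]): string {
  return facts.map((f) => `- [${f.ruleId}] line ${f.line}`).join("\n");
}
```

User-defined rules from sigilix.json could be appended to the same `rules` array, since they are also regex-based.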

7. Hunk-level diff packing (large PRs) — ARC-192

For PRs whose diff exceeds the per-attempt token cap, Sigilix packs the diff at the hunk level rather than truncating arbitrarily. Each hunk is treated as the unit of inclusion; the algorithm prefers keeping whole hunks intact and dropping low-priority ones (e.g., docs-only hunks first, then generated-content hunks, then truly large ones with a synthetic placeholder). A telemetry event (pr-review-truncated) fires on every truncated review so the team can track which PRs are hitting the cap.
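One way to picture hunk-level packing is the greedy sketch below. It is an assumption-laden simplification: token counts are given as inputs, every dropped hunk (not only the very large ones) is replaced by a placeholder, and placeholder cost is not counted against the cap. `HunkKind`, `Hunk`, and `packHunks` are illustrative names.

```typescript
type HunkKind = "docs" | "generated" | "code";
interface Hunk { path: string; kind: HunkKind; tokens: number; text: string }

// Lower rank is dropped first: docs, then generated content, then code.
const DROP_RANK: Record<HunkKind, number> = { docs: 0, generated: 1, code: 2 };

function packHunks(
  hunks: Hunk[],
  tokenCap: number,
): { kept: Hunk[]; truncated: boolean } {
  let total = hunks.reduce((sum, h) => sum + h.tokens, 0);
  if (total <= tokenCap) return { kept: hunks, truncated: false };

  // Drop candidates: lowest priority first, larger hunks first within a priority.
  const order = hunks
    .map((h, i) => ({ h, i }))
    .sort((a, b) => DROP_RANK[a.h.kind] - DROP_RANK[b.h.kind] || b.h.tokens - a.h.tokens);

  const dropped = new Set<number>();
  for (const { h, i } of order) {
    if (total <= tokenCap) break;
    dropped.add(i);
    total -= h.tokens;
  }

  // Whole hunks are either kept intact or swapped for a synthetic placeholder.
  const kept = hunks.map((h, i) =>
    dropped.has(i) ? { ...h, tokens: 0, text: `[hunk omitted: ${h.path}]` } : h,
  );
  return { kept, truncated: true };
}
```

A caller would emit the pr-review-truncated telemetry event whenever `truncated` is true.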

8. Context before judgment

Before any specialist runs, Sigilix gathers the context needed to judge the diff against the whole repo rather than a narrow window around the change:
  • Code graph. Callers, dependencies, and symbol definitions for the changed code, so a specialist can see who calls the function being modified.
  • Index retrieval. A retrieval call against the codebase index for the PR’s title + body; the retrieved chunks are passed to all specialists, so they don’t each re-retrieve.
  • Review memory. Relevant Conversational Learnings and category calibration for this repo.
This is what stops a specialist from flagging “function is unused” when it’s called three files away. The accumulated repo understanding — index, code graph, trust ledger, review memory — is the earned-context layer that the CLI and Deep-Research Chat also draw on.

9. Specialist fan-out

The four (or fewer, per router) domain specialists run in parallel via Promise.allSettled with a size-scaled budget (small diffs finish fast; large diffs get more time, up to the platform wall). Each gets:
  • The diff (hunk-packed if oversized)
  • The PR title + body
  • The context snapshot (code graph + retrieval)
  • Deterministic findings to reason about
  • Any applicable learned rules
  • Its own system prompt
  • A model tuned to its role, with a cross-provider fallback
| Specialist | Role | Model selection |
| --- | --- | --- |
| Metis | logic | Reasoning-heavy primary, cross-provider fallback |
| Argus | security | Fast high-volume primary, cross-provider fallback |
| Iris | performance | Throughput-tuned primary, cross-provider fallback |
| Eunomia | tests | Fast broad-coverage primary, cross-provider fallback |
Specific model IDs are tuned over time from telemetry and are intentionally not pinned in the docs. If a specialist’s primary model fails (a transient 503/429 or budget exhaustion), Sigilix retries it once. If the retry also fails, the cross-provider fallback fires once. If the fallback fails too, that specialist’s contribution is skipped.
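The retry-then-fallback behaviour combined with `Promise.allSettled` can be sketched like this. The model calls are injected as plain async functions, which is an assumption made to keep the example self-contained. The sketch retries on any error rather than only on 503/429, and it omits the size-scaled time budget.

```typescript
type ModelCall = () => Promise<string>;

interface SpecialistSpec { name: string; primary: ModelCall; fallback: ModelCall }
interface SpecialistResult { name: string; output: string; usedFallback: boolean }

async function runSpecialist(spec: SpecialistSpec): Promise<SpecialistResult> {
  // Two attempts on the primary: the initial call plus one retry.
  for (let attempt = 0; attempt < 2; attempt++) {
    try {
      return { name: spec.name, output: await spec.primary(), usedFallback: false };
    } catch {
      // Primary failed; try again once, then fall through to the fallback.
    }
  }
  // A single fallback attempt. If it throws, this specialist is rejected.
  return { name: spec.name, output: await spec.fallback(), usedFallback: true };
}

async function fanOut(specs: SpecialistSpec[]): Promise<SpecialistResult[]> {
  // allSettled lets one failed specialist be skipped without failing the rest.
  const settled = await Promise.allSettled(specs.map((s) => runSpecialist(s)));
  const succeeded: SpecialistResult[] = [];
  for (const r of settled) {
    if (r.status === "fulfilled") succeeded.push(r.value);
  }
  return succeeded;
}
```

Using `Promise.all` here would reject the whole batch as soon as one specialist failed, which is the opposite of the skip-and-continue behaviour described above.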

10. Quorum check

Sigilix requires a minimum number of successful specialists to proceed. If quorum isn’t met, the review is aborted with internal_error and the queue retries.
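A quorum gate reduces to a single comparison. In the sketch below, `minSuccessful` is a hypothetical parameter, since this page does not state the actual threshold; only the `internal_error` code and the retry behaviour come from the text.

```typescript
type QuorumResult =
  | { ok: true }
  | { ok: false; error: "internal_error"; retry: true };

// minSuccessful is an assumed, configurable threshold.
function checkQuorum(succeeded: number, minSuccessful: number): QuorumResult {
  return succeeded >= minSuccessful
    ? { ok: true }
    : { ok: false, error: "internal_error", retry: true };
}
```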

11. Synthesizer (Harmonia)

See the Synthesizer page for the four sub-stages. Harmonia runs a calibration-strong model with a cross-provider fallback. This is where the grounding gate runs and each surviving finding earns its proof-tier receipt. Cross-PR review memory feeds into this stage to calibrate severity per-category and apply learned rules.

12. Post-validation

Before posting, Sigilix validates:
  • Every inline finding’s path:line is a valid reviewable position in the diff
  • Every finding’s structural provenance (which specialist findings it traces back to) is intact, and its proof-tier receipt is attached
  • Low-severity nit findings are aggregated into the summary, not posted inline
Findings that fail validation are dropped. The verdict and inline-finding list are finalized.
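The three checks can be sketched as a single partitioning pass. The `SynthFinding` shape, the severity scale, and the `positionKey` format are assumptions for illustration. Dropped findings are returned rather than discarded so that a caller can log them as suppressed.

```typescript
type Severity = "nit" | "minor" | "major"; // hypothetical scale

interface SynthFinding {
  path: string;
  line: number;
  side: "LEFT" | "RIGHT";
  severity: Severity;
  sources: string[]; // specialist findings this one traces back to
  receipt?: string;  // proof-tier receipt
  message: string;
}

const positionKey = (path: string, line: number, side: string): string =>
  `${path}:${line}:${side}`;

function postValidate(findings: SynthFinding[], reviewable: Set<string>) {
  const inline: SynthFinding[] = [];
  const summaryNits: SynthFinding[] = [];
  const dropped: SynthFinding[] = [];
  for (const f of findings) {
    // Provenance must be intact and a receipt attached.
    if (f.sources.length === 0 || f.receipt === undefined) { dropped.push(f); continue; }
    // Nits go to the summary, so they need no inline anchor.
    if (f.severity === "nit") { summaryNits.push(f); continue; }
    // Inline findings must land on a reviewable position in the diff.
    if (reviewable.has(positionKey(f.path, f.line, f.side))) inline.push(f);
    else dropped.push(f);
  }
  return { inline, summaryNits, dropped };
}
```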

13. Stale-head guard #2

A second stale-head check runs immediately before submission. If the PR’s head SHA has changed since pre-flight, the review is aborted, and the webhook for the newer push triggers its own review.

14. Submit to GitHub

Sigilix posts the review via POST /repos/:owner/:repo/pulls/:number/reviews with event: APPROVE | REQUEST_CHANGES, a body containing the synthesizer summary, and a comments array carrying the inline findings. If the inline-anchor submission fails (422, bad position), Sigilix falls back to an anchorless review with all findings rolled into the body. The verdict is still posted.

When PrReviewRunDO is in mode=post (ARC-190), the pr-review webhook ingest path is serialized through a Durable Object across regions, so the same PR cannot have two concurrent in-flight reviews. While the DO is still in mode=off (the current production state during staged rollout), per-PR dedupe falls back to the KV path’s best-effort guard.
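The anchorless fallback on submission can be sketched as a payload transform plus one retry. The `commit_id`, `event`, `body`, and `comments` fields follow GitHub's create-review request body; the `post` transport is an injected, hypothetical function that returns the HTTP status, and the body layout for rolled-up findings is invented.

```typescript
interface InlineComment { path: string; line: number; side: "LEFT" | "RIGHT"; body: string }

interface ReviewPayload {
  commit_id: string;
  event: "APPROVE" | "REQUEST_CHANGES";
  body: string;
  comments: InlineComment[];
}

// Rolls every inline finding into the review body and clears the anchors.
function toAnchorless(p: ReviewPayload): ReviewPayload {
  if (p.comments.length === 0) return p;
  const rolled = p.comments.map((c) => `- ${c.path}:${c.line} ${c.body}`).join("\n");
  return { ...p, body: `${p.body}\n\nFindings:\n${rolled}`, comments: [] };
}

async function submitReview(
  p: ReviewPayload,
  post: (payload: ReviewPayload) => Promise<number>, // hypothetical transport
): Promise<"inline" | "anchorless"> {
  const status = await post(p);
  if (status !== 422) {
    if (status >= 400) throw new Error(`review submit failed: ${status}`);
    return "inline";
  }
  // 422: GitHub rejected an anchor. Retry once with no inline comments so
  // the verdict and summary still post.
  const retryStatus = await post(toAnchorless(p));
  if (retryStatus >= 400) throw new Error(`anchorless submit failed: ${retryStatus}`);
  return "anchorless";
}
```

The verdict (`event`) is untouched by the transform, which is what lets it survive a rejected anchor.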

15. Telemetry + feedback signals

Two records land:
  • Review telemetry — duration, finding counts, severity distribution, specialist outcomes, fallback flags. Internal monitoring + per-PR cost accounting.
  • Feedback-signal capture — PR-thread reactions, “Commit suggestion” applies, reply text matching accept/dismiss patterns. Feeds the cross-PR review memory.

Guarantees

  • Idempotency. Re-firing the same webhook should not produce a duplicate review. Today this rests on best-effort KV dedupe; strict Durable-Object serialization applies once PrReviewRunDO is in mode=post (ARC-190).
  • Stale-safe. A review is never posted on an outdated SHA.
  • Verdict survival. If GitHub rejects inline anchors, the verdict is still posted (anchorless fallback).
  • Cross-provider resilience. Size-scaled budgets plus cross-provider fallbacks reduce the risk that a same-family provider outage silences a specialist. A specialist can still be skipped if both primary and fallback attempts fail, in which case the review is posted with a _3 of 4 specialists succeeded_ footnote.
  • Incremental on synchronize. A push to an open PR re-reviews only what changed; carried-forward findings keep their original markers (ARC-194).
  • No silent dropping. Every finding either makes it into the review or is logged as suppressed in telemetry.
