

A single-pass reviewer makes the same mistakes on every PR. If your team consistently dismisses “consider adding a docstring” findings, that finding shows up again next week, and the week after. The reviewer never learns. Review Memory (ARC-189) is Sigilix’s cross-PR learning layer. It records which findings the team acted on and which they dismissed, and uses that signal to calibrate future reviews on this repo.

What’s a feedback signal?

Sigilix watches PR activity post-review for these signals:
| Signal | Interpretation |
| --- | --- |
| Finding’s suggested patch was committed (via the “Commit suggestion” button) | Strong accept |
| Reviewer replied to the finding’s thread with “Thanks, fixed” / “Good catch” | Accept |
| Reviewer replied with “Not applicable” / “False positive”, or reacted 👎 | Dismiss |
| PR merged with the finding’s underlying code unchanged | Soft dismiss |
| Reviewer added the bot to the resolution chain (resolved the finding’s thread) | Soft accept |
These are noisy signals on a per-finding basis. The memory aggregates them across many PRs to identify patterns — categories the team consistently dismisses, categories they consistently act on.
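
The aggregation step can be sketched roughly like this (a minimal illustration; the signal names, weights, and data shapes are assumptions, not Sigilix internals):

```python
from collections import Counter

# Assumed weights: strong signals count more than soft ones.
SIGNAL_WEIGHTS = {
    "patch_committed":  ("acted_on", 1.0),   # strong accept
    "thanks_reply":     ("acted_on", 0.7),   # accept
    "thread_resolved":  ("acted_on", 0.3),   # soft accept
    "dismiss_reply":    ("dismissed", 0.7),  # dismiss
    "merged_unchanged": ("dismissed", 0.3),  # soft dismiss
}

def aggregate(events):
    """Fold noisy per-finding signals into per-category tallies.

    events: iterable of (category, signal) pairs observed post-review.
    Returns {category: Counter({"acted_on": x, "dismissed": y})}.
    """
    tallies = {}
    for category, signal in events:
        bucket, weight = SIGNAL_WEIGHTS[signal]
        tallies.setdefault(category, Counter())[bucket] += weight
    return tallies
```

No single event decides anything; only the accumulated per-category tallies feed the calibration.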

How the memory shapes future reviews

The memory does not silence findings outright. It nudges Sigilix’s flag-worthiness floor per category, per repo:
Category that gets dismissed 8/10 times on this repo
  → flag-worthiness floor rises (only stronger versions surface)

Category that gets acted on 8/10 times on this repo
  → floor stays low (any signal in that category surfaces aggressively)
The memory is scoped per repo + per category, never per file or per user. It doesn’t try to model individual reviewer preferences — it models team-level patterns.
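
One plausible way to turn those team-level patterns into a floor nudge, consistent with the 8/10 examples above and the bounded-adjustment guarantee (the thresholds and cap here are illustrative, not the shipped values):

```python
def floor_adjustment(acted_on: int, dismissed: int) -> int:
    """Map a category's accept/dismiss history to a bounded
    flag-worthiness floor delta (0, 1, or 2). Illustrative thresholds."""
    total = acted_on + dismissed
    if total < 10:        # too little history: never adjust
        return 0
    ratio = acted_on / total
    if ratio < 0.05:      # almost always dismissed
        return 2
    if ratio < 0.25:      # mostly dismissed
        return 1
    return 0              # acted on often enough: floor stays low
```

Because the delta is capped, a category can be made quieter but never silenced.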

What the memory remembers

A small, structured record per category:
repo: acme/site
category: style-naming
acted-on: 4
dismissed: 22
ratio: 0.15  →  raise floor by 1 score
The actual record is more nuanced (recency-weighted, with confidence intervals), but the shape is the same. Sigilix stores aggregate counts, not the underlying findings or the code they referred to. Nothing about your specific code is retained.
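
A recency-weighted version of that ratio might use exponential decay so recent team behaviour dominates older history (a sketch under assumptions; half_life_days is a hypothetical tuning knob, not a documented parameter):

```python
def recency_weighted_ratio(events, half_life_days=30.0):
    """Compute an acted-on ratio where each event's weight halves
    every half_life_days.

    events: iterable of (age_days, accepted) pairs, accepted a bool.
    Returns a float in [0, 1], or None if there is no history.
    """
    acted = dismissed = 0.0
    for age_days, accepted in events:
        weight = 0.5 ** (age_days / half_life_days)
        if accepted:
            acted += weight
        else:
            dismissed += weight
    total = acted + dismissed
    return acted / total if total else None
```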

What it does not do

| Concern | Reality |
| --- | --- |
| “Will this learn my team’s biases?” | It learns category-level accept/dismiss patterns, not opinions about specific code. |
| “Will it stop catching real bugs?” | The floor adjustment is bounded. A critical security finding always surfaces regardless of memory state. |
| “Will it remember a specific bug?” | No. The memory stores aggregate signals, not findings. |
| “Can I see what’s in the memory?” | A read-only inspection view ships on the upcoming Pro/Max dashboard. |
| “Can I reset it?” | Yes: set { "reviewMemory": { "enabled": false } } for a review, then re-enable. The data isn’t deleted, only ignored. A hard reset requires a support request. |

When you want it on

For most repos, leave it enabled. The dismiss-then-resurface loop on style and taste findings is the single biggest source of reviewer fatigue on first-generation AI review tools. Cases where leaving it on is clearly right:
  • Established codebase with a consistent review culture
  • Active team that engages with findings (accepts, replies, dismisses)
  • Codebases where review fatigue is a documented complaint

When to disable

Disable via { "reviewMemory": { "enabled": false } } if any of these apply:
  • Bot-driven approvals. If most PRs are auto-merged without human engagement, the memory is being fed near-empty signals and will calibrate poorly.
  • Brand-new repo. With few reviews on record, the memory’s adjustments are noisy. Wait a few weeks, then enable it.
  • Compliance contexts. Some teams need every category of finding flagged at the configured severity regardless of historical patterns. Leave the memory off; rely on the synthesizer + profile instead.
  • Audit mode. When you specifically want the unadjusted ensemble (e.g., for a baseline comparison or a one-time security review), disable temporarily.
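
As a standalone config fragment, the override from the inline snippet above looks like this (the file location and any surrounding keys are not specified on this page):

```json
{
  "reviewMemory": { "enabled": false }
}
```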

Interaction with other features

| Feature | How they relate |
| --- | --- |
| profile: assertive | Lowers the flag-worthiness floor globally. Review memory still applies on top: categories the team consistently dismisses still get a relative raise. |
| deterministicChecks | Unaffected. Memory only adjusts LLM-specialist findings, not regex matches. |
| Secret scanning, AST rules | Unaffected. Like deterministic checks, perfect-provenance findings are never down-weighted by memory. |
| rules.<role> | Unaffected. If you’ve told a specialist to flag X, memory doesn’t override that. |
The general rule: memory adjusts LLM judgment calls, never deterministic findings or explicit rules.
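
That gating rule can be expressed as a small predicate (field names and the score-versus-floor convention are assumptions for illustration, not Sigilix's data model):

```python
def finding_surfaces(finding, base_floor, memory_delta):
    """Return True if a finding clears the flag-worthiness floor.

    The memory's delta applies only to LLM-judgment findings;
    deterministic provenance (regex, AST, secret scan) and explicit
    per-role rules always use the unadjusted floor.
    """
    if finding["provenance"] != "llm" or finding.get("explicit_rule"):
        floor = base_floor                 # memory never applies
    else:
        floor = base_floor + memory_delta  # raised for dismissed categories
    return finding["score"] >= floor
```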

Telemetry

Every review records:
  • Which categories had memory-driven adjustments applied
  • The current accept/dismiss ratios per category (recency-weighted)
  • The size of the adjustment (delta in flag-worthiness floor)
Used internally for monitoring; the eventual user-facing surface is the upcoming Pro/Max dashboard.
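
A telemetry record carrying those three fields might look like the following (the shape and field names are illustrative, not a documented schema):

```json
{
  "categoryAdjustments": [
    {
      "category": "style-naming",
      "acceptDismissRatio": 0.15,
      "floorDelta": 1
    }
  ]
}
```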

Opt-outs

Disable review memory and other subsystems.

Confidence Scoring

How scores become severities at the finding level.