COTW Memory & Continuity System

Code of the West Agent Architecture — How an agent remembers, learns, and grows

Storage Layer

Continuity Database (SQLite-vec)

The core memory store. Every conversation exchange is paired (user + agent), embedded as a 384-dimensional vector, and indexed for both semantic and keyword search.

exchanges — paired user/agent turns with timestamps, topics, thread_id
vec_exchanges — 384-dim embeddings (Xenova/all-MiniLM-L6-v2)
fts_exchanges — FTS5 full-text with porter stemmer
knowledge_entries — extracted workspace facts with supersession tracking
summaries — DAG-structured compression (daily → weekly → monthly), thread-scoped
topic_hierarchy — co-occurrence tracking across sessions
sessions — conversation sessions with auto-generated titles
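
A minimal sketch of the pairing step, assuming turns arrive as dicts with `role`, `text`, and `ts` keys. That input shape, the `Exchange` class, and `pair_turns` are illustrative names, not the actual schema or API.

```python
from dataclasses import dataclass


@dataclass
class Exchange:
    thread_id: str
    user_text: str
    agent_text: str
    timestamp: float


def pair_turns(turns: list[dict], thread_id: str) -> list[Exchange]:
    """Pair each user turn with the agent turn that answers it."""
    exchanges = []
    pending_user = None
    for turn in turns:
        if turn["role"] == "user":
            # A later user turn replaces an unanswered earlier one (assumption).
            pending_user = turn
        elif turn["role"] == "agent" and pending_user is not None:
            exchanges.append(
                Exchange(thread_id, pending_user["text"], turn["text"], turn["ts"])
            )
            pending_user = None
    return exchanges
```

A trailing user turn with no agent reply stays unpaired, so only complete exchanges would be embedded and indexed.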

Graph Database (SQLite)

Entity-relationship knowledge graph with temporal sourcing.

triples — subject/predicate/object with confidence + source exchange
entities — canonical names, types, aliases, mention counts
cooccurrences — entity co-occurrence cache for pattern discovery
meta_patterns — discovered traversal patterns with yield scores
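
One way to picture multi-hop traversal over confidence-scored triples is the sketch below. The tuple layout, the `hop_decay` factor, and the multiply-along-the-path scoring are assumptions made for illustration; the text does not specify how decay is computed.

```python
from collections import deque


def traverse(triples, start, max_hops=2, hop_decay=0.7):
    """Breadth-first walk from `start`, scoring each reachable entity.

    triples: iterable of (subject, predicate, object, confidence).
    A path's score is the product of its triple confidences, multiplied by
    hop_decay once per hop (hypothetical scoring rule).
    """
    adjacency = {}
    for subj, pred, obj, conf in triples:
        adjacency.setdefault(subj, []).append((pred, obj, conf))

    best = {start: 1.0}
    queue = deque([(start, 0, 1.0)])
    while queue:
        node, hops, score = queue.popleft()
        if hops == max_hops:
            continue
        for _pred, obj, conf in adjacency.get(node, []):
            new_score = score * conf * hop_decay
            # Keep only the strongest path found to each entity.
            if new_score > best.get(obj, 0.0):
                best[obj] = new_score
                queue.append((obj, hops + 1, new_score))
    best.pop(start)
    return best
```

Entities further from the starting entity score lower, so distant relationships can still surface but rank below direct ones.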

Daily Archives (JSON)

Raw verbatim conversation records, one file per day. The ground truth that indexes are built from. Never modified after write. Deduplicated on archive.

Session Handoff (Markdown)

Written on every agent_end and consumed on the next session_start, this file is the bridge between sessions. It contains key topics, temporal markers, and an Open Threads section: up to 7 threads of commitments, TODOs, and investigations, extracted by regex from the conversation and from a scan of active project manifests.

Thread Handoffs (Markdown)

Persistent per-thread handoff files — overwritten on each write, never deleted after read (unlike session handoffs which are consumed). Each thread maintains its own handoff with compaction count in the header. On thread re-entry, an LLM warm start synthesizes the handoff into natural prose instead of raw template injection.

Source-Addressable Memory (Provenance)

A separate provenance layer over raw conversational material. Atomic claim records are produced with explicit source handles, recording where each remembered fact came from.

Candidate vs. verified — every claim begins as candidateOnly; only verified claims reach the trusted prompt context.
Verified-claim injection gate — the runtime safety boundary. Candidate-only claims are silently excluded from live injection regardless of weight or recurrence.
Read-only source verification — verified claims can be re-checked against their source on demand; reports drift, never mutates.
Modes — observe (record only), diagnostic (record + report), enforce (verified eligible for live injection). Mode transitions are operator-gated.

Currently in observe by default; diagnostic activations are applied and rolled back through the bounded operator workflow.
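
The gate described above can be sketched as a pure filter. The `Claim` fields and the `injectable_claims` name are hypothetical; only the three modes and the candidate/verified split come from the text.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    OBSERVE = "observe"        # record only
    DIAGNOSTIC = "diagnostic"  # record + report
    ENFORCE = "enforce"        # verified claims eligible for live injection


@dataclass(frozen=True)
class Claim:
    text: str
    source_handle: str      # hypothetical handle format, e.g. an exchange id
    verified: bool = False  # every claim starts as candidate-only
    weight: float = 0.0
    recurrence: int = 0


def injectable_claims(claims: list[Claim], mode: Mode) -> list[Claim]:
    """Return the claims allowed into live prompt context."""
    if mode is not Mode.ENFORCE:
        return []
    # Weight and recurrence are deliberately ignored: only verification counts.
    return [c for c in claims if c.verified]
```

Because the filter never consults `weight` or `recurrence`, a frequently repeated but unverified claim cannot promote itself into the prompt.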

Attachment Receipts (Files + Vision)

Images and documents attached in chat get durable evidence handles without duplicating the user's local files by default.

attachment_receipts — stable att_... id, kind, filename, MIME type, size, SHA-256, source path, source status, text excerpt, observation excerpt
attachment_receipt_turns — links receipts to thread, session, project, and turn so recent project attachments can reappear as compact handles
Path + hash verification — source files are checked when present; moved or modified files become explicit verification gaps
Prompt discipline — current attachments include payload + receipt; later turns carry receipts, not pixels or full documents

This preserves the difference between "the model saw this now" and "the agent remembers a receipt later."
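
A rough sketch of receipt creation and path + hash verification using only the standard library. The `att_` id derived from the digest, the trimmed field set, and the three status strings are assumptions; the real table carries more columns.

```python
import hashlib
import os
import tempfile
from dataclasses import dataclass


@dataclass(frozen=True)
class Receipt:
    att_id: str
    filename: str
    size: int
    sha256: str
    source_path: str


def make_receipt(path: str) -> Receipt:
    with open(path, "rb") as fh:
        data = fh.read()
    digest = hashlib.sha256(data).hexdigest()
    return Receipt(
        att_id="att_" + digest[:12],  # assumed id scheme
        filename=os.path.basename(path),
        size=len(data),
        sha256=digest,
        source_path=path,
    )


def verify_source(receipt: Receipt) -> str:
    """Read-only check: returns 'verified', 'missing', or 'modified'."""
    if not os.path.exists(receipt.source_path):
        return "missing"
    with open(receipt.source_path, "rb") as fh:
        current = hashlib.sha256(fh.read()).hexdigest()
    return "verified" if current == receipt.sha256 else "modified"


def demo() -> list[str]:
    """Walk one file through the intact, edited, and deleted states."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "notes.txt")
        with open(path, "w", encoding="utf-8") as fh:
            fh.write("first draft")
        receipt = make_receipt(path)
        statuses = [verify_source(receipt)]
        with open(path, "w", encoding="utf-8") as fh:
            fh.write("edited")
        statuses.append(verify_source(receipt))
        os.remove(path)
        statuses.append(verify_source(receipt))
    return statuses
```

The receipt stores a path and a hash rather than a copy of the file, so a moved or edited source shows up as a verification gap instead of silently passing.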

Processing Pipeline

Agent Integration Spine (Integration)

The integration discipline above storage, processing, and identity. Five canonical packets — state_record, responsibility_lease, governor_decision, outcome_event, maturation_candidate — let existing stores cooperate without collapsing into one truth pile.

Read-only by construction — classifies, surfaces receipts, proposes candidates; does not mutate runtime state.
Graded modes — proceed / verify / ask / approve / defer / refuse / pause-recover. Refusal is one mode, not the default.
Consumer contracts — context injection, planning, tool execution, memory promotion, UI review each declare eligibility separately.
Authority ≠ capability — visible tool, retrieved memory, successful outcome, candidate proposal: none of these are authorization.

Infinite Threads (Scoping)

Thread = persistent project scope (survives restarts). Session = ephemeral execution context (disposable). Threads and modes are orthogonal — a thread can span Chat, Code, and Booth.

Thread-scoped storage — thread_id flows from GUI → plugin → all storage layers (exchanges, summaries, archives, knowledge)
Thread-boosted retrieval — 80% RRF boost for same-thread results
LLM warm start — synthesizes thread handoff into natural prose on re-entry (not raw template)
Consolidation — after 5 compactions, force session restart to rebuild from crystallized state

Retrieval: 4-Way RRF Fusion (Core)

Every query retrieves through four parallel paths, fused with Reciprocal Rank Fusion:

Semantic — sqlite-vec MATCH on 384-dim embeddings (weight: 0.8)
Keyword — FTS5 BM25 with sanitized query terms (weight: 0.15)
Temporal — exponential decay, 14-day half-life (weight: 0.15)
Graph — multi-hop traversal with confidence decay per hop
On top of these four paths, results matching the active thread receive an 80% RRF score boost.
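
Weighted Reciprocal Rank Fusion with a thread boost can be sketched as follows. The constant k = 60 is the conventional RRF default and is an assumption here, as are the function name, the input shapes, and applying the boost as a 1.8x multiplier after fusion. The text gives no weight for the graph path, so callers supply one.

```python
from collections import defaultdict

RRF_K = 60  # conventional RRF constant; not stated in the text


def fuse(rankings, weights, thread_of=None, active_thread=None, thread_boost=0.8):
    """Fuse ranked id lists from several retrieval paths.

    rankings: {path_name: [doc_id, ...]} ordered best-first.
    weights:  {path_name: weight} for every path in `rankings`.
    thread_of: optional {doc_id: thread_id} used for the thread boost.
    """
    scores = defaultdict(float)
    for path, ranked_ids in rankings.items():
        weight = weights[path]
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += weight / (RRF_K + rank)

    if active_thread is not None and thread_of:
        for doc_id in scores:
            if thread_of.get(doc_id) == active_thread:
                scores[doc_id] *= 1.0 + thread_boost

    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

For example, `fuse({"semantic": ["a", "b"], "keyword": ["b", "a"]}, {"semantic": 0.8, "keyword": 0.15})` ranks `a` first, while passing `thread_of={"b": "t1"}` and `active_thread="t1"` lifts `b` above it.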

Adaptive thresholds: When the corpus holds fewer than 2,000 exchanges, a sparse-corpus adjustment relaxes the distance threshold to 1.3. As the corpus grows, embedding density rises and the adjustment stops mattering.
Proper noun injection: If the user mentions a named entity and results contain it, those results are injected regardless of distance score. Detection catches capitalized sequences, names with articles, and mid-sentence proper nouns.
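
The 14-day half-life and the sparse-corpus rule reduce to two small functions. This is a sketch: the default distance threshold is not given in the text, so it is passed in by the caller, and treating "relax" as "never tighter than 1.3" is an assumption.

```python
HALF_LIFE_DAYS = 14.0
SPARSE_CORPUS_LIMIT = 2_000
SPARSE_DISTANCE_THRESHOLD = 1.3


def temporal_weight(age_days: float) -> float:
    """Exponential decay: the weight halves every HALF_LIFE_DAYS."""
    return 0.5 ** (max(age_days, 0.0) / HALF_LIFE_DAYS)


def distance_threshold(exchange_count: int, default: float) -> float:
    """Relax the distance cutoff while the corpus is still sparse.

    `default` is the normal cutoff, which the text does not specify.
    """
    if exchange_count < SPARSE_CORPUS_LIMIT:
        return max(default, SPARSE_DISTANCE_THRESHOLD)
    return default
```

Under this decay an exchange from two weeks ago counts half as much as one from today, and one from four weeks ago a quarter as much.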

SEAL Metabolism (Autonomous)

The living pipeline. Conversation → extract → settle → crystallize. Not storage — growth.

Metabolism: Monitors entropy at agent_end. High-entropy exchanges are queued as candidates (~5ms).

Contemplate: 3-pass reflection. Immediate (clarify) → 4h (connect patterns) → 20h (synthesize growth vector).

Crystallize: 3-gate system. Time threshold met → principle alignment with SOUL.md → human review + approval.

Persist: Approved traits are written to identity files. The agent has changed.
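
The pass timing and the three gates can be sketched as plain functions. Two assumptions are baked in: pass offsets are measured from the moment a candidate is queued, and the time gate defaults to 20 hours. The text states neither, and all names are illustrative.

```python
from datetime import datetime, timedelta

# (pass name, offset from queue time); offsets taken from the 3-pass description
PASS_OFFSETS = [
    ("clarify", timedelta(0)),
    ("connect_patterns", timedelta(hours=4)),
    ("synthesize_growth_vector", timedelta(hours=20)),
]


def contemplation_schedule(queued_at: datetime) -> list[tuple[str, datetime]]:
    """Return when each reflection pass becomes due for one candidate."""
    return [(name, queued_at + offset) for name, offset in PASS_OFFSETS]


def can_crystallize(
    queued_at: datetime,
    now: datetime,
    aligned_with_soul: bool,
    human_approved: bool,
    min_age: timedelta = timedelta(hours=20),  # hypothetical time threshold
) -> bool:
    """All three gates must pass: time, principle alignment, human approval."""
    time_gate = now - queued_at >= min_age
    return bool(time_gate and aligned_with_soul and human_approved)
```

No combination of the first two gates can substitute for the third, which keeps identity changes human-approved.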

Context Injection (Per-Turn)

Every turn, the agent's context is assembled from multiple sources via plugin hooks at before_agent_start:

Stability context (entropy, loop detection, principle anchors)
Continuity context (session info, handoff, archive bootstrap, temporal markers, thread warm start)
Graph context (entity relationships relevant to current exchange)
Truth plugin (current-state facts supersede stale memories)
Mode injection (Booth/Code/Robot posture overlays)
Code Evolution scaffold (tool hints, learned rules, workflow patterns — Code mode only)

Source-anchoring guardrails: Retrieved context is injected with explicit framing ("only state facts that appear explicitly" and "do not infer or extrapolate"). The framing is meant to curb hallucination when the agent weaves recalled memories into responses.
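
As an illustration of source-anchored framing, the sketch below wraps retrieved snippets in a guarded block. The bracketed delimiters and the exact guardrail sentence are assumptions; only the two quoted instructions come from the text.

```python
GUARDRAIL = (
    "Only state facts that appear explicitly in the recalled context below. "
    "Do not infer or extrapolate."
)


def frame_retrieved(snippets: list[str]) -> str:
    """Wrap retrieved snippets in explicit framing before prompt injection."""
    if not snippets:
        return ""  # nothing recalled, so nothing is injected
    body = "\n".join(f"- {snippet.strip()}" for snippet in snippets)
    return f"[RECALLED CONTEXT]\n{GUARDRAIL}\n{body}\n[END RECALLED CONTEXT]"
```

Keeping the guardrail adjacent to the recalled material, rather than only in a system prompt, ties the instruction to the text it governs.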

GPT 5.5 Vision Routing (Multimodal)

Main chat treats GPT 5.5 as the intended multimodal reasoning layer. Images stay attached as image parts when the selected provider supports native vision.

Native route — the GPT 5.5 path sends text and image parts together so one model reasons across language and pixels
Ollama fallback — if running through Ollama, a configured vision model produces a prepass description and the text model receives that description
Multiple files — chat accepts multiple images and documents with a 15 MB per-file ceiling and per-file receipts
Document path — text-like documents inject bounded excerpts now and persist fuller extracted text in the receipt table

The fallback exists for resilience, but the preferred behavior is one GPT 5.5-class model reasoning from both text and image.

Code Evolution (Proposal-Only)

SEAL evolves who the agent is (identity/memory). Code Evolution watches how the agent works in Code mode and turns repeated friction into reviewable scaffold proposal receipts. It does not silently mutate protected scaffold or runtime state.

Record: Passive session recording of tool calls, outcomes, and satisfaction signals during Code mode.

Analyze: Pattern detection across recorded sessions, covering repeated tool failures, long tool-call loops, and correction patterns.

Propose: Emit scaffold_proposal receipts with evidence, expected effect, verification, and rollback metadata.

Review: Promotion is a separate operator-owned lane; proposals are visible, auditable, and reversible.

Research Platform (Diagnostics)

The platform layer makes live agent behavior inspectable and reusable for future evaluation or training without putting that weight on the response path.

Exchange spine — one join key across stream events, tool calls, attachments, continuity records, and Refiner windows
Harness Refiner — trajectory windows, failure signatures, process scores, relabel candidates, and proposal receipts
Retention registry — hot/warm/cold/research/excluded artifact classes with source labels and hashes
Training runway — redacted bundles, manifests, teacher-relabel packets, and training approval disabled by default

Low-risk workflow hints can pass through Evolve. Identity, prompt authority, model routing, tools, and training remain operator-gated.

Bleed Prevention (Filter)

Mode exchanges are tagged with injection markers. Filtered at two layers:

Archive bootstrap — cold-start skips mode-tagged user messages
Continuity queries — SQLite reads exclude mode markers
Chat ↔ Booth share context. Code is isolated from both.
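
The isolation rule can be sketched as a visibility filter. Representing the mode marker as a `mode` key on each exchange is an assumption (the real marker format is not shown), and Robot is omitted because the text does not say which group it belongs to.

```python
# Modes in the same group can read each other's exchanges.
SHARED_GROUPS = [{"chat", "booth"}, {"code"}]


def visible_modes(active_mode: str) -> set[str]:
    for group in SHARED_GROUPS:
        if active_mode in group:
            return set(group)
    return {active_mode}  # unknown modes see only themselves


def filter_exchanges(exchanges: list[dict], active_mode: str) -> list[dict]:
    """Drop exchanges whose mode tag is not visible from the active mode."""
    allowed = visible_modes(active_mode)
    # Untagged exchanges are treated as plain chat (assumption).
    return [e for e in exchanges if e.get("mode", "chat") in allowed]
```

Applying the same filter at both the archive bootstrap and the continuity query keeps the two layers from disagreeing about what a mode may see.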

Runtime Hardening (Noise Control)

Recent cleanup made the runtime quieter, more observable, and less likely to confuse the model with stale or duplicated scaffolding.

Debounced handoffs — session handoff writes batch by time/exchange count and carry reason tags
Mtime caches — repeated identity and working-memory file reads reuse cached text until the file actually changes
Registry-based gaps — metabolism listeners are keyed by plugin id so reloads do not duplicate emissions
Runtime metrics — every plugin hook can be timed and reported against load budgets
EPL gate — consequential runtime/file/process/config claims require a fresh verifier or an explicit "unverified" statement
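
The mtime cache idea can be sketched with `os.stat`. The class name and the `disk_reads` counter are illustrative, and the demo sets modification times explicitly with `os.utime` so the behavior does not depend on filesystem timestamp resolution.

```python
import os
import tempfile


class MtimeCache:
    """Reuse cached file text until the file's modification time changes."""

    def __init__(self):
        self._entries = {}   # path -> (mtime_ns, text)
        self.disk_reads = 0  # counts real reads, for observability

    def read(self, path: str) -> str:
        mtime = os.stat(path).st_mtime_ns
        cached = self._entries.get(path)
        if cached is not None and cached[0] == mtime:
            return cached[1]
        with open(path, encoding="utf-8") as fh:
            text = fh.read()
        self.disk_reads += 1
        self._entries[path] = (mtime, text)
        return text


def demo():
    cache = MtimeCache()
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "SOUL.md")
        with open(path, "w", encoding="utf-8") as fh:
            fh.write("v1")
        os.utime(path, ns=(1_700_000_000_000_000_000, 1_700_000_000_000_000_000))
        first = cache.read(path)
        second = cache.read(path)  # served from cache
        with open(path, "w", encoding="utf-8") as fh:
            fh.write("v2")
        os.utime(path, ns=(1_700_000_001_000_000_000, 1_700_000_001_000_000_000))
        third = cache.read(path)   # mtime changed, so the file is re-read
    return first, second, third, cache.disk_reads
```

A stat call is far cheaper than re-reading and re-parsing identity files every turn, which matters when context is rebuilt statelessly on each turn.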

Identity Layer

9-Layer Identity Hierarchy (Stateless Rebuild)

Rebuilt every turn from persistent files. No in-memory state survives restart. Everything meaningful lives in files.

SOUL.md — core principles, who the agent is beyond the prompt
AGENTS.md — operational playbook, behavioral rules
ANCHOR.md — who the user is (generated at onboarding)
TOOLS.md — discovered environment knowledge
Standing — Courage/Word/Brand growth dimensions
Memory files — daily journals, observations
Session handoff — bridge between sessions
Continuity context — retrieved exchange history
Graph context — entity relationships

Standing System (Growth)

Not metrics — developmental dimensions. The agent evaluates user growth and tracks trajectory.

Courage (grounding) — ability to stay present with discomfort
Courage (self) — self-awareness and advocacy
Word — honesty, follow-through, self-correction
Brand — consistency, the trail you leave

Evidence is collected inline (22 regex patterns). Synthesis runs overnight via nightshift. Context injection includes why scores changed: recent evidence directions, pattern names, and a trail of the last 5 patterns. Scores are visible in the sidebar.

Relational Postures (Modes)

Not separate agents — one agent, one identity, one memory. The same agent draws from the same continuity database, the same standing scores, the same lived history. What changes is the stance — a lightweight identity overlay that focuses the agent's attention and voice for the context at hand.

Chat — trail companion, natural conversation
Booth — Socratic, leadership-focused, draws from standing + journal
Code — Trail Ride protocol, narrated execution, choose-your-adventure
Robot — embodied presence, spatial knowledge, curiosity

Same agent. Same memory. Different lens. Like a person who listens differently as a mentor than as a collaborator — the knowledge doesn't change, the posture does.

Nightshift (Off-Hours)

Heavy LLM processing runs when the user is away: standing synthesis, contemplation passes, metabolism candidate processing, and knowledge graph backfill. It also triggers on idle (15 minutes of inactivity, max 5 cycles/day).
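
The idle trigger can be sketched as a small stateful check. The class name is hypothetical, and resetting the cycle count at the calendar-day boundary is an assumption about what "per day" means.

```python
from datetime import datetime, timedelta

IDLE_AFTER = timedelta(minutes=15)
MAX_CYCLES_PER_DAY = 5


class IdleTrigger:
    """Decide whether an idle-time nightshift cycle may start."""

    def __init__(self):
        self._day = None
        self._cycles = 0

    def should_run(self, last_activity: datetime, now: datetime) -> bool:
        if now.date() != self._day:
            # New calendar day: reset the cycle budget.
            self._day, self._cycles = now.date(), 0
        if now - last_activity < IDLE_AFTER:
            return False
        if self._cycles >= MAX_CYCLES_PER_DAY:
            return False
        self._cycles += 1
        return True
```

The daily cap bounds model spend even if the machine sits idle all day.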

Cognitive Dynamics (Substrate)

A learned observational layer running continuously alongside memory. Not what the agent remembers — the state it's in while remembering.

Encoder — per-turn features → 64-dim latent state vector
Predictor — one-step-ahead over latent space; surprise = prediction error
Online learner — weights update between turns from observed error
Consumers — Stability (entropy injection), Metabolism (candidate flagging), Telemetry (opt-in research stream)

Data is written to cognitive-dynamics.jsonl. This layer is the research substrate behind the linked paper on agent latent-state dynamics.
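
To make "surprise = prediction error" concrete, here is a toy one-step-ahead predictor with an online update. The linear model, the learning rate, and the Euclidean error norm are stand-ins chosen for illustration; the text does not describe the actual encoder or predictor architecture, and the real latent state is 64-dimensional.

```python
import math


class LatentPredictor:
    """Linear one-step-ahead predictor over a latent state vector."""

    def __init__(self, dim: int, lr: float = 0.1):
        self.dim = dim
        self.lr = lr
        # Start from the identity: predict that the state does not change.
        self.w = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]

    def predict(self, z: list[float]) -> list[float]:
        return [sum(self.w[i][j] * z[j] for j in range(self.dim)) for i in range(self.dim)]

    def observe(self, z_prev: list[float], z_next: list[float]) -> float:
        """Return surprise for this transition, then update the weights."""
        pred = self.predict(z_prev)
        err = [z_next[i] - pred[i] for i in range(self.dim)]
        # Gradient step on squared error, applied between turns.
        for i in range(self.dim):
            for j in range(self.dim):
                self.w[i][j] += self.lr * err[i] * z_prev[j]
        return math.sqrt(sum(e * e for e in err))
```

Repeating the same transition lowers its surprise over time, which is the property that lets consumers such as Stability and Metabolism treat high surprise as a signal of something new.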

How This Compares

	Mem0	MemPalace	Wiki-Memory	COTW
Storage	Extracted facts	Verbatim + structure	Compiled wiki	Hybrid: archive + vectors + knowledge + identity
What decides to remember	AI extraction	Nothing (store all)	Ingest pipeline	Multi-layer: raw archive + selective indexes + metabolic flagging + human-gated crystallization
Retrieval	Vector + graph	Metadata filtering	Wiki lookup	4-way RRF: semantic + keyword + temporal + graph
Identity	None	None	None	9-layer hierarchy, stateless reconstruction
Autonomous processing	None	None	None	SEAL: metabolism → contemplation (3-pass/20h) → crystallization (3-gate + human review)
Thread scoping	user_id / agent_id	Wings / rooms	Wiki pages	Infinite threads: persistent project scopes with 80% RRF boost, LLM warm start, consolidation
Hallucination prevention	None	Verbatim storage	None	Source-anchoring guardrails + truth table + proper noun detection
Provenance	None	Verbatim retains source by default	None	Every claim source-addressable; candidate vs. verified two-lane boundary; verified-only injection gate
Attachments	Provider/application dependent	Stored as artifacts in local project context	Imported into wiki pages	Receipt-backed images/docs with source path, SHA-256, turn linkage, bounded excerpts, and explicit re-verification gaps
Builder harness	Memory API only	Memory architecture only	Knowledge base workflow	PRDs, deep research loops, code receipts, proposal-only evolution, and continuity-aware handoffs
Research platform	Product analytics separate from memory	Experiment logs outside the memory map	Research notes in the knowledge base	Exchange trace spine, Harness Refiner scoring, retention tiers, cognitive diagnostics, redacted bundles, and future-training manifests
Runtime mutation control	None	None	None	Operator-gated apply / rollback workflow; protected fields unreachable from agent tool surface
Growth tracking	None	None	None	Standing dimensions (Courage/Word/Brand) with overnight synthesis
Cost	Hosted / paid tiers	Local runtime	Local	Free local continuity substrate; model costs depend on the selected GPT 5.5/provider route