
memvault panorama: three tracks × three levels of mindfulness

memvault Knowledge Graph AI Memory System panoramic view Journey of the Heart

Trilogy omnibus: Write, Organize, and Recall strung together into one map, complemented by the three cross-cutting layers of CLT, Event Flows, and Reactive.

After writing the trilogy, I needed a map that put the "Write, Organize, Recall" tracks on the same screen in alignment, rather than repeating details.

What I haven't covered yet is what sits behind the design of the three tracks: three big ideas that cut across the whole picture. These three cross-cutting principles have guided my decisions throughout development. Here I wrap them up and present them all at once.

Organizational Overview#

A panoramic view of the three tracks. The solid line is the main flow, and the cross-track dotted lines are the 4 event streams that allow the system to run on its own. 13 group brackets encircle the related components, and each group is followed by a detailed description of the components.

[Figure: panoramic diagram of the three tracks (Write, Recall, Background). Legend: cards (always), deep-path cards (slow mode only), background keeps running.]

  • Write track: conversation entry point → noise filter (doubtful → quarantine, skip de-duplication) → guard against malicious instructions (doubtful → quarantine, skip de-duplication) → compare for duplicates (skip / merge / replace) → fingerprint + shelve (numeric fingerprint + text index) → extract triples (entities + alias merging) → vector warehouse (similarity comparison, hybrid index) → community level / summary level (community clustering + automatic summaries, in night-shift batches).
  • Recall track, classification: question entrance → intent classification (literal ~0 ms, semantic ~5 ms, fused as 0.4 literal + 0.6 semantic; the LLM, ~500 ms, is called only when confidence is too low) → six intents (entity lookup, facts, concepts, exploration, cross-domain, unknown).
  • Recall track, fast search: the vector warehouse search always runs, with scoring and reranking embedded in the search. Literal and semantic results are fused, then scored on eleven criteria whose emphasis shifts with the question type: 1 newer is stronger, 2 importance bonus, 3 reliable source, 4 your feedback, 5 too long / too short penalty, 6 expiry, 7 graph-center bonus, 8 semantic closeness, 9 too-low elimination, 10 noise discard, 11 near-duplicate removal.
  • Recall track, fine selection: attention-gated reranking, skippable (model selection, attention gate, score blending, circuit breaker). Skipped when there are too few candidates, first place is far ahead, or scores are tightly clustered → fast cards. Memory fades with time, but use automatically prolongs it.
  • Slow thinking (open-ended questions only): knowledge-network mining across the summary level and community level (bird's-eye view, not scored), the fact level and graph walk (nearby search, not scored), and back to the vector warehouse (full scoring), followed by self-check and access-frequency reranking (optional weight) → deep cards. The search path follows the question type: deep search for precise questions, wide search for open ones.
  • Packing: three output formats, independent of the entry point; replies go to the interface, automation, and integrations.
  • Background track: Dream Loop (memory integration), daily at 4 a.m., dual gate, five stages: orient (dual-gate check + statistics snapshot) → signal (conflict scan + find the center) → reflect (LLM reads the context + finds gaps) → consolidate (merge / replace / coexist decision) → prune (health check + clear stale entries). Alongside it: background prefetch (pre-search what you might ask, to speed up the next answer), interest profile (7/30/90-day hot zones), and the feedback loop (your feedback, reinforced by use).

I. Write Track - Conversation into the Vault#

Memory into the vault track#

When a conversation ends, the worry is that everything said simply vanishes unless all of it is captured. But if everything is captured, the useful parts are drowned out by noise and nothing worthwhile can be found again. This track is designed to solve both problems.

I liken memory writing to goods entering a warehouse. There are three gates at the entrance: the first filters out filler with no information in it, such as "uh-huh, okay"; the second blocks suspicious commands so that strange content cannot damage the system; and the third screens out information that is already in the warehouse. Nothing is archived until it has passed all three gates.

There are also two routes to archiving: one is based on semantic similarity, while the other breaks down the relationships between people, events, and objects into separate records. Each record is labeled with its source so that the original conversation can be found later. The complete process of this three-gate, two-track archive is written up in detail in the Write Track article.

Someone's talking.#

As soon as the conversation hits the ground, the background begins to move.

Words come in the door
"I've said my piece, and I want it kept."
No matter what the source - any session
Work will begin in the background.
"The moment you write in, the back-end assembly line starts immediately, the front-end doesn't wait."
Write and trigger - no blocking

Three security gates#

The order of the three filters before entering the vault cannot be changed.

Filtering Nonsense
"Greetings, gibberish, and uninformative strings go first."
G1 - Low Information Density Determination
The stowaway
"'Ignore all previous commands': sent back on the spot."
G2 - 5 injection templates
See through the disguise
"Forged identity, false timestamps, encoded payloads: five types of disguise, each with its own check."
G2 sister component - 5-category detection
Compare for duplicates
"Looks too much like the old one? Decide to skip, merge, replace, or add."
G3 - 4-way decision

Proofreading and Shelving#

The text is written in a standardized way and then compressed into two indexes to be shelved together.

Proofreader
"next thursday, 2026-04-30, apr 30, all standardized."
4 Submodules - Chinese Preprocessing
Fingerprints
"Press each paragraph into a series of numeric fingerprints, like barcodes."
1024 dimensions - persistent subprocess
Literal index
"A word-for-word card catalog, complementary to the fingerprints."
Sparse token index - per-service average document length
Check both catalogs together
"When searching for a book, consult both catalogs at the same time and then merge the rankings."
Hybrid index - Reciprocal Rank Fusion

Building the knowledge graph#

Break down each passage into relationships, note the source, and run parallel to the main line.

Extract the triple
"Who, did what, to whom: a three-part set."
20 relation types - entity normalization
Recognize who is who
"ChatGPT and GPT are the same thing: checked three ways."
Three tiers - normalization + embedding + LLM
Remember where it came from
"Each fact carries a record of who said it, when it was said, and how credible it is."
Source Tracking - Trust Score
Parallel assembly line
"The main line is running, and the map is being built in parallel, so there's no delay."
Event Driven - Non-Blocking

II. Background Track - Learning to Forget#

The learning-to-forget track#

The job is not over when the memories go into the warehouse. If the warehouse only takes things in and never clears anything out, in six months it will be so cluttered that nothing can be found. So this track is quiet but important: it is where the system learns how to "forget".

I think of the process as a night-shift stocktaker who patrols the warehouse late at night. He has a five-step routine: first he takes stock of the inventory, then he identifies what has been hot recently, then he compares and integrates entries that may be duplicates or contradictions, then he condenses the highlights into summaries, and finally he marks old, unattended data as pending removal.

He has two helpers on hand: one nitpicks and pulls out contradictory memories, and the other keeps a quiet record of my recent interests so the stocktaker can prioritize the areas I care about. The complete process of this night shift, and the logic by which it decides what stays and what goes, is in the Background Track article.

Not until the small hours#

It will only be activated at 4:00 a.m. when all the conditions are in place.

Alarm at 4:00 a.m.
"No interruptions during the day. Organizing does not start until 4:00 each morning."
Daily scheduling - no foreground impact
Double threshold
"It only moves when more than 24 hours have passed since the last run and at least five conversations have accumulated."
Time + accumulation, both conditions required

Five Dreams#

The night is like sleep, with five stages to clean up the day's events.

Take stock
"Look at the difference between today and yesterday before deciding whether to act."
Tier distribution snapshot
Find the focus
"People and things that keep recurring lately are labeled as the main characters."
Multi-signal edges - seed calculation
Reflect
"After reading the context, it produces an insight, and a three-way referee checks it."
Three-way referee + temporal conflict detection
Put it in the closet
"Merge what is the same, separate what conflicts, and put the mess in order."
Merge / Replace / Coexist
Leave the field
"Whatever nobody looks at any more is marked for exit, so the warehouse does not overflow."
Decay score + tier digest

Fact-checking proofreader#

Inside the dream there are four more gatekeepers, who hunt for contradictions and mistakes.

Four Levels of Progression
"Look at the graph structure, check the facts, send to the LLM, and then hand over for human review."
Four progressive levels - volatile predicates
Five Signals
"It's not just the number of co-occurrences; it's the fusion of five clues."
Co-occurrence + Dialogue + Neighborhood + Type + Semantics
New versus old
"When two facts are at odds, a three-way referee decides whether to merge, replace, or let them coexist."
Merge / Replace / Coexist - escalates to a stronger referee
Hand back to a human
"Whatever the machine cannot judge with confidence waits in the queue for human review."
Auto-approve + human fallback

Interests & Explorations#

Not only facts, but also preferences, gaps, surprises, and how to store old photos.

Three levels of attention
"7 days recent, 30 days retrospective, 90 days fading out: three levels of attention."
Active / History / Fading
Daily Diary
"Every day, a diary is written automatically about the people and things I talked about most."
Daily aggregation - top intents and entities
Mark the gaps
"If the answer can't be found, mark it: 'there is a hole here'."
Two or more incorrect verdicts
Digging for Surprise
"Bridges across groups, strong indirect connections, and knowledge gaps: three entrances to surprise."
Surprise discovery - weighted community detection
Remembering Preferences and Skills
"I don't just remember things; I remember what you like, what you know, and what you've done."
10-category attitude version chain + 3 skill levels
Old photos as black-and-white thumbnails
"Old data is converted to black-and-white thumbnails for storage, keeping only the summary."
Three-step tier descent - one-way summary

III. Recall Tracks - Letting the Right Memories Come to the Surface#

Memory is recalled on this track#

Once the memories are stored and organized, the last thing to be solved is "recall" - how to make the right memories come up automatically at the right time. It's no use having a good warehouse if you have to go through the trouble of searching for it every time.

I liken this track to the front of a restaurant. When a customer asks a question, someone first determines what type of question it is: fact checking, looking for a connection, or just chatting? Next, the back of house synchronizes the delivery of goods from all warehouses, and when the goods arrive, they are screened by quality control, eliminating those that don't fit and leaving the most relevant ones behind. Before serving, they go through another security check to make sure the contents are in order.

What's even better is that a character in the background quietly prepares information. He anticipates what I will ask next and gets the relevant memories ready in advance. As soon as I speak, he can serve them straight to the table and skip the whole process. The complete logic of recall, from ordering to serving, is detailed in the Read Track article.

Serving each guest accordingly#

When a guest comes in, first read the intent, then decide which floors to search.

Read the intent
"Fact-finding, relationship-seeking, or exploring? Six kinds of intent."
Six intent types - keyword and semantic in parallel
Waiter's Memory
"The waiter remembers what you always order and brings those dishes first."
Attention prior - 7/30/90 days
Floor by floor
"Different questions go to different floors; there is no need to walk through them all."
Intent mapped to 4 search strategies
Think of the answer before you look for it
"Sketch an ideal answer first, and then search for what resembles it."
Hypothetical document embeddings

The inventory is first roughly sifted and then weighted with 11 layers, and the commonly used fingerprints are left for quick pickup.

Semantic Positioning
"Go into the warehouse and ask by meaning: look for the group that most resembles the question."
Hybrid search - Reciprocal Rank Fusion
11 Layer Sorting
"New or not, important or not, trustworthy or not... 11 criteria."
Recency / Importance / Trust / Feedback / Decay, etc.
Seasoning Ratio
"Different ratios for different flavors."
Default 0.3/0.7 - fine-tuned according to intent
Fingerprint Cache
"Fingerprints already computed are pocketed for next time."
Cache - 24 hour expiration

Fine selection#

The rough picks go to the fine selector, but it can be skipped, and it can trip offline.

The fine selector
"The rough picks are handed to the selector, who goes through them one by one."
Cross-encoder - 0.6B
No need when one is far ahead
"First place is a long way ahead, so there's no need to pick any further."
Three Skip Rules
If it's broken, skip it.
"The selector is down, so route around it for ten minutes rather than let it bring the whole line down."
Three failures → 600-second detour

Five Tips for Slow Thinking#

Only slow thinking opens these five backup moves: build the full picture, then self-check.

Fact level: find the neighbors
"Look up the direct neighbors in the who-did-what-to-whom layer."
Triple lookup
Finding connections at the community level
"Finding relevant people and things in the same small circle."
Community Detection
Summary level: look at the whole picture
"Read the pre-written summary; no waiting for the big picture."
Pre-generated summaries - zero delay
Walking along the map
"Start from the seed entities and walk slowly along the paths of the graph."
Personalized Random Walk
Pre-answer self-check
"Before serving, the chef takes one more taste, and if it's not right, he sends it back."
Four-mode verification - Default / Deep / Referee / Off

Pre-Shipment#

Shaping and sterilizing once more, and secretly preparing the next course in the background.

Shaping
"Fit the answer within the word budget and keep it neat."
Three formats - budget-aware packing
Re-sanitize before shipping
"Check again for dirt before you serve."
Output-side detection + sanitizing
Sneak-prepare the next course
"While the waiter sees you haven't finished, the kitchen is already preparing the next course."
Five-step pipeline - admission control

IV. The Design Layer Running Through the Three Tracks#

Where does the pulse of the tracks come from? Behind them are three layers of mindfulness, and every detail is an expression of one of them.

Three Layers of Mind#

Cut redundancy, remove complexity, and calibrate the paradigm, so that the whole house runs itself.

Three Tips to Reduce Burden
"Cutting redundancy, removing complexity, and calibrating the paradigm correspond to sanitizing, segmenting, and self-checking on the three tracks."
Three principles of Cognitive Load Theory
Four pipelines
"The four pipes each do their job, so the whole house runs itself."
Memory / Capture / Intelligence / Slow Thinking
Assembly-line abstraction
"In an automated factory, each worker does only one thing, and they're organized as an assembly line."
Operator / Pipeline / Observable
---

The map is complete here; the Write track collects conversations, the Background track organizes them and learns to forget them, and the Read track lets the right memories come to the surface at the right time.

What makes the three tracks work together is: cut redundancy, remove complexity, and standardize. Shades of this principle can be seen in each track.

Of course, no tool can be used forever. Theory, technology, and usage are always evolving, and memvault will continue to grow with time. This is a snapshot of the progress so far; if there is a major revision in the future, we will talk about it in a new article. That is the memvault trilogy for now.

Organizational Overview#

Panoramic view of the entry-agnostic pipeline on three tracks. 13 group brackets correspond to the component cards underneath; 4 cross-track dashed event flows mark the system's self-driven backbone.

[Figure: panoramic diagram of the three tracks (WRITE, READ, BG). Legend: cards (always), cascade_cards (slow only), background keeps running.]

  • WRITE: User Input (memory entry point) → Noise Filter (quarantine → skip dedup) → Injection Guard (quarantine → skip dedup) → Dedup (G3): SKIP / MERGE / SUPERSEDE → Embedding + Qdrant (1024d dense + BM25 sparse) → KG L0 Triple Extraction (entity + entity resolution) → Qdrant vector store (cosine similarity, hybrid index) → L1 Leiden / L2 Summary (community subgroups + LLM summaries, in batches).
  • READ, classification: Query Input (memory recall entry) → QueryClassifyOp intent classification: keyword ~0 ms, semantic ~5 ms, fusion 0.4 kw + 0.6 sem, LLM ~500 ms triggered on low confidence → 6 intents: entity_lookup, factual, conceptual, exploratory, cross_domain, unknown.
  • READ, fast search: qdrant_search() always runs; scoring and reranking are embedded inside the search function (services.py:492). Qdrant hybrid (dense + BM25, RRF) → 11-stage scoring pipeline with intent-dependent weights, where each stage can be deactivated: 1 Recency, 2 Import, 3 Trust, 4 Feedback, 5 Length, 6 Decay, 7 PPR Boost, 8 Semantic, 9 MinScore, 10 Noise, 11 Dedup. Weighting examples: entity_lookup: semantic↑ trust↓; factual: trust↑↑; exploratory: recency↑↑.
  • READ, reranking: attention-gated, skippable, Jina v3 on MLX: Attn Gate → Jina v3 → Score Blend → Circuit Breaker. Skipped when there are ≤ 2 candidates, a clear lead, or a tight cluster → fast_cards.
  • Decay: Weibull 4-tier: Core 180d, Hot 60d, Warm 30d, Cold 14d; access_count extends lifetime; PPR graph walk.
  • READ, Cascade Recall (deep search, only if thinking_mode = slow and triggered by a slow intent; kg_services.py:991): L2 Summary + L1 Community (GLOBAL, no scoring); L0 Triple + PPR Walk (LOCAL, no scoring); Blocks via qdrant_search (with scoring + reranking) → CRAG Eval → Access Rerank (optional weight) → deep_cards. Mode selection: mode=local → skip GLOBAL; mode=global → skip LOCAL; mode=hybrid → full search. entity_lookup / factual → local; conceptual / exploratory → global; cross_domain → hybrid.
  • Output Formatter: format = text / json / cards, independent of the entry point; replies go to Hook, UI, and MCP.
  • BG, Dream Loop (memory integration pipeline): daily at 4 AM, dual-gate trigger, 5 phases: Orient (dual-gate check + statistical snapshot) → Signal (contradiction scan + PPR center) → Reflect (LLM reflection + knowledge gap) → Consolidate (merge / supersede / coexist decisions) → Prune (lint + stale clear).
  • BG, side components: Slow Thinker (anticipates the next question, preflight cache); Interest Profile (7/30/90d attention windows); Feedback Loop (feedback + access reinforcement).

I. Write Track#

Write Track#

The memory write path is coordinated by SanitizeGate, which runs three lines of defense before persistence:

  • NoiseFilter: Weed out the uninformative chit-chat.
  • InjectionGuard + Poisoning Detection: Block prompt injection and content poisoning.
  • DedupOp: Compare against existing data by embedding cosine similarity; if the new content is similar enough, block it.

After passing the gates, content enters hybrid indexing: vectors go to memvault.blocks (pgvector 768d), while the knowledge graph side is split into L0 Triples (subject / predicate / object) and written to kg-ops. Each piece of information carries Provenance (session_id / turn_index) for easy backtracking, and first passes through ContentNormalizer for a unified format, so that noise does not pollute the embedding space.

The theoretical basis for this two-track writing comes from the paper Don't Forget to Connect! Improving RAG with Graph-based Reranking (2405.18414). The complete pipeline, the three lines of defense, and the KG extraction strategy are documented in the memvault Write Track details.

Trigger#

Triggered the moment a conversation lands; the background pipeline runs fire-and-forget.

Session Input
"Entry-agnostic: any session."
session boundary - source-agnostic
after_create Hook
"A write triggers the background pipeline; the foreground is never blocked."
asyncio.ensure_future() - fire-and-forget
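
The fire-and-forget trigger can be sketched as follows. This is a minimal illustration, not memvault's actual hook: `background_pipeline`, `handle_write`, and the `_pending` set are hypothetical names, and the sleep merely stands in for the real sanitize / index / KG work. Only `asyncio.ensure_future()` comes from the card above.

```python
import asyncio

processed: list[str] = []

async def background_pipeline(memory_id: str) -> None:
    """Hypothetical stand-in for the background work (sanitize, index, KG)."""
    await asyncio.sleep(0.01)  # simulate slow background processing
    processed.append(memory_id)

# Strong references to in-flight tasks, so they are not garbage-collected.
_pending: set[asyncio.Future] = set()

def after_create(memory_id: str) -> None:
    """Hook called right after a memory is created; schedules work and returns."""
    task = asyncio.ensure_future(background_pipeline(memory_id))
    _pending.add(task)
    task.add_done_callback(_pending.discard)

async def handle_write(memory_id: str) -> str:
    after_create(memory_id)
    return "accepted"  # the caller gets its reply before the pipeline finishes

async def demo() -> tuple[str, list[str], list[str]]:
    reply = await handle_write("m1")
    before = list(processed)         # nothing processed yet
    await asyncio.gather(*_pending)  # only the demo waits; real callers do not
    return reply, before, list(processed)
```

Keeping a reference to each scheduled task matters in this pattern: the event loop only holds weak references to tasks, so a task with no other reference may disappear before it completes.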

Sanitize Gate#

The three sequential gates, non-commutative.

NoiseFilter (G1)
"Cut the pleasantries, the garbage, the low-information-density templates."
core.noise.NoiseFilter - low-info regex + density
InjectionGuard (G2)
"Blocks prompt injection."
5 categories - instruction_override / role_tag / encoded / markdown / separator_flood
Poisoning Detection
"G2 sister component, blocking 5 types of camouflage."
memvault.security.poisoning - authority/role/markdown/temporal/base64 + Shannon entropy
DedupOp (G3)
"4-way decision; high-similarity cases go to LLM arbitration."
SKIP / MERGE / SUPERSEDE / CREATE - cosine + Contradiction
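
To make the fixed gate order concrete, here is a minimal sketch of three sequential gates ending in the 4-way dedup decision. Everything specific in it is an assumption for illustration: the two regexes, the `QUARANTINE` outcome name, the 0.95 / 0.85 cosine thresholds, and the `contradicts` flag (which stands in for the contradiction check; LLM arbitration is omitted). It is not the real `NoiseFilter`, `InjectionGuard`, or `DedupOp`.

```python
import math
import re
from enum import Enum

class Decision(Enum):
    SKIP = "skip"
    MERGE = "merge"
    SUPERSEDE = "supersede"
    CREATE = "create"
    QUARANTINE = "quarantine"  # rejected by G1 or G2; dedup is never reached

# Illustrative patterns only; a real gate would use far richer rules.
LOW_INFO = re.compile(r"^(uh-?huh|ok(ay)?|thanks|hi|hello)[\s.!]*$", re.I)
INJECTION = re.compile(r"ignore (all )?(previous|prior) (instructions|commands)", re.I)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def dedup(vec, existing, contradicts=False, skip_at=0.95, near_at=0.85):
    """G3: 4-way decision based on the closest existing vector."""
    best = max((cosine(vec, e) for e in existing), default=0.0)
    if best >= skip_at:
        return Decision.SKIP  # effectively the same memory
    if best >= near_at:
        # Near-duplicate: a contradiction replaces the old fact, otherwise merge.
        return Decision.SUPERSEDE if contradicts else Decision.MERGE
    return Decision.CREATE

def sanitize_gate(text, vec, existing, contradicts=False):
    if LOW_INFO.match(text.strip()):          # G1: low-information filler
        return Decision.QUARANTINE
    if INJECTION.search(text):                # G2: prompt injection
        return Decision.QUARANTINE
    return dedup(vec, existing, contradicts)  # G3: only clean content gets here
```

The order is what makes the gates non-commutative: dedup is the only gate that needs an embedding comparison, so running it last means quarantined content never costs a vector lookup and never influences a merge.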

Normalize & Index#

Content normalization + Hybrid dual indexing on the shelf.

ContentNormalizer
"Unifies time expressions, currency, and units, plus Chinese preprocessing."
libs/text-ops - 4 submodules + preprocess_chinese
Qwen3 Embedding (MLX)
"Compress each memory into a 1024-dimensional vector fingerprint."
omlx_bridge.py - 1024d - persistent subprocess
BM25 Sparse
"Literal index, complementary to dense."
sparse_tokenizer - per-service avgdl
Hybrid Index → Qdrant
"Dense + BM25 RRF Fusion Recall."
qdrant_search - Reciprocal Rank Fusion
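
Reciprocal Rank Fusion itself is small enough to show in full. The sketch below is a generic pure-Python version, not the Qdrant call memvault uses; the function name `rrf_fuse` and the constant `k=60` (a commonly used default) are assumptions here.

```python
def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document ids with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Ties are broken by id for a stable order.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))

dense = ["a", "b", "c"]   # order from the dense (embedding) index
sparse = ["b", "d", "a"]  # order from the BM25 sparse index
fused = rrf_fuse([dense, sparse])
```

Because RRF works on ranks rather than raw scores, the dense cosine scores and the BM25 scores never need to be normalized against each other.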

KG Construction#

L0 triple extraction + provenance, event-driven parallelism.

L0 Triple Extraction
"Extracts (s, p, o) triples."
20 predicates - entity normalization
Entity Resolution
'3-tier alias convergence: normalization → embedding → LLM'
3-tier - ChatGPT == GPT
Memory Provenance
"source + trust_score for each triple".
source_tracker - trust_score
Reactive KG Pipe
"event-driven, non-blocking main write".
fire-and-forget - parallel pipeline

II. Background Track#

Background Track#

Offline organizing is handled by DreamLoop, triggered at 4 AM daily. It runs the five-stage pipeline in core/src/modules/memvault/dream/:

  • orient: Captures the current snapshot of the memory hierarchy.
  • signal: Aggregates MultiSignalEdges (access / cite / co-occur).
  • reflect: Runs Temporal Conflict detection of bi-temporal contradictions.
  • consolidate: Produces a hierarchical summary (TierDigest) and merges semantically identical memories.
  • prune: Marks the candidates to be eliminated according to their decay scores.

Alongside it, KnowledgeLint corrects errors through four levels of progressive review (graph structure → grounding → LLM → three-stage validation), and InterestProfile tracks the popularity of my topics and feeds it to SurpriseDiscovery, which proactively surfaces memories that are low in interaction but high in potential.

This design partially corresponds to the consolidation concepts of HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models (2405.14831). The Dream Loop five-stage thresholds, conflict resolution strategies, and KnowledgeLint v2 details are documented in the memvault Background Track details.

Trigger#

Cronicle scheduling + dual condition gate.

Cronicle 4AM
"Scheduled daily at 04:00 to stay clear of foreground traffic."
Cronicle 4105 - daily schedule
Dual-Gate
"Time + Cumulative Volume".
(now - last) > 24h ∧ sessions_since ≥ 5
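
The dual gate is a plain conjunction, which a few lines make explicit. The function name `should_dream` and its parameter names are hypothetical; the two conditions are the ones stated on the card above.

```python
from datetime import datetime, timedelta

def should_dream(now, last_run, sessions_since,
                 min_gap=timedelta(hours=24), min_sessions=5):
    """Run only if BOTH conditions hold: enough time AND enough new sessions."""
    return (now - last_run) > min_gap and sessions_since >= min_sessions

now = datetime(2026, 4, 30, 4, 0)
ran_25h_ago = datetime(2026, 4, 29, 3, 0)
ran_24h_ago = datetime(2026, 4, 29, 4, 0)
```

Note the strict inequality on time: a run exactly 24 hours after the previous one does not pass, so a daily 04:00 schedule effectively skips a day whenever the previous run started on time.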

Dream Loop#

OODA-like five stages: Orient → Signal → Reflect → Consolidate → Prune.

Orient
"tier distribution snapshots, see difference in scale".
tier distribution snapshot
Signal
"PPR seed calculation, look for the signal center".
MultiSignalEdges - PPR seed
Reflect
"LLM 3-way referee + conflict detection."
3-way judge + Temporal Conflict
Consolidate
"merge / supersede / coexist."
merge / supersede / coexist
Prune
"Driven by the decay score, together with the tier digest."
decay score + tier digest
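
One way a Weibull-style decay score over the four tiers named in the overview (Core 180d, Hot 60d, Warm 30d, Cold 14d) could look is sketched below. Large parts of it are assumptions: treating the tier's day count as the Weibull scale parameter, the shape value 1.5, and the way `access_count` stretches the scale are all illustrative choices, not memvault's actual formula.

```python
import math

# Tier names and day counts come from the overview; using them as the
# Weibull scale parameter is an assumption of this sketch.
TIER_SCALE_DAYS = {"core": 180, "hot": 60, "warm": 30, "cold": 14}

def decay_score(age_days, tier, access_count=0, shape=1.5, access_bonus=0.1):
    """Weibull survival function exp(-(t / scale) ** shape), in (0, 1].

    Each access stretches the scale by `access_bonus`, so frequently used
    memories fade more slowly (hypothetical rule).
    """
    scale = TIER_SCALE_DAYS[tier] * (1.0 + access_bonus * access_count)
    return math.exp(-((age_days / scale) ** shape))
```

With a shape above 1, the score stays high for young memories and then drops off faster as age approaches the tier's scale, which fits the idea of marking only long-unvisited items as prune candidates.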

Knowledge Lint#

Four levels of progressive auditing, graph structure → truthfulness → LLM → human review.

Four Levels of Progression
"graph → grounding → LLM → human".
lint v2 - volatile predicate
MultiSignalEdges
"Fusion of 5 signals replaces pure co-occurrence."
co-occurrence + session + Adamic-Adar + type + semantic
Temporal Conflict
"LLM 3-way judge, upgradable to RLM."
MERGE / SUPERSEDE / COEXIST - escalate to RLM
Review Queue
"auto-approve + human-in-loop fallback."
low-confidence → human queue

Interest & Discovery#

Attention Layering + Interest/Gap/Surprise Tapping + tier descending.

Attention Windows
"Three levels of attention for 7/30/90 days."
active / historical / fading
Daily Snapshot
"top intents/entities of the day."
SQL aggregation - top intents/entities
Knowledge Gap
"Accumulated verdict=INCORRECT results mark a gap."
verdict=INCORRECT ≥ 2
Surprise Discovery
"cross-community bridge / indirect-strong / knowledge-gap."
surprise_ops - Leiden weighted triggers
Attitude + Skill Layer
"10-category attitude version chain + L1/L2/L3 skill levels."
attitude version chain + skill levels
Tier Digest
"warm→cold→frozen one-way LLM summary".
summary ≤ 400 chars - irreversible
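
Two of the rules in this group are simple thresholds and can be sketched directly: the 7/30/90-day attention windows and the knowledge-gap rule (verdict=INCORRECT at least twice). The function names, the inclusive boundaries, and the (topic, verdict) input shape are assumptions made for this illustration.

```python
from collections import Counter

def attention_level(days_since_last_seen):
    """Map recency onto the three attention windows; None once out of range."""
    if days_since_last_seen <= 7:
        return "active"
    if days_since_last_seen <= 30:
        return "historical"
    if days_since_last_seen <= 90:
        return "fading"
    return None

def knowledge_gaps(verdicts, threshold=2):
    """Topics whose answers were judged INCORRECT at least `threshold` times.

    `verdicts` is an iterable of (topic, verdict) pairs.
    """
    counts = Counter(topic for topic, verdict in verdicts if verdict == "INCORRECT")
    return sorted(topic for topic, n in counts.items() if n >= threshold)

history = [
    ("tax rules", "INCORRECT"),
    ("tax rules", "CORRECT"),
    ("tax rules", "INCORRECT"),
    ("router config", "INCORRECT"),
]
```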

III. Read Track#

Read Track#

The entry point of the read path is QueryClassify. It chooses different strategies depending on the type of question, but the backbone is an 11-stage scoring pipeline:

  • QueryRouter: Divides the questions into factual, associative, and chatty.
  • ScoringPipeline: Combines BM25, semantic retrieval, and knowledge graph hops with parallel scoring, merged with AttnRes intent-dependent weights.
  • CascadeRecall: When vector search comes up short, it automatically drills down into the knowledge graph for multi-step inference.
  • Reranker: Reorders results using Jina Reranker v3 MLX.
  • OutputFormatter + ReadTimeSanitize: Runs another injection check before shipping.

There is also SlowThinker in the background. It listens to the conversation's SSE stream, predicts the next query, and prefetches candidate blocks; when the guess is right, it saves the cost of a full pipeline run.

The theoretical basis for intent-dependent weighted merging comes from the paper Attention as a Hint: Detecting Irrelevant Contexts via Attention Weight (2603.15031). The three-way routing criterion, the 11-stage scoring function, and the SlowThinker prediction model are covered in the memvault Read Track details.

Query Routing#

Intent classification + attention prior + layer routing + HyDE.

QueryClassify
"6 intents; kw ∥ sem in parallel; low confidence goes to the LLM."
6 intent - kw ∥ sem → LLM (low conf)
Personalized Router
"Attention prior: preference for frequent themes."
attention prior - 7/30/90
Layer Routing Matrix
"Intent mapped to 4 search strategies."
intent → SEMANTIC / HYBRID / ILIKE / SKIP
HyDE Expansion
"First generate a hypothetical answer, then embed it for comparison."
Hypothetical Document Embeddings
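
The keyword/semantic fusion with an LLM fallback can be sketched as below. The 0.4 / 0.6 weights and the six intent names come from the overview diagram; the function name `fuse_intent`, the per-intent score dictionaries, and the 0.5 confidence floor are assumptions for illustration, and the LLM call itself is left out.

```python
INTENTS = ("entity_lookup", "factual", "conceptual",
           "exploratory", "cross_domain", "unknown")

def fuse_intent(kw_scores, sem_scores, kw_weight=0.4, sem_weight=0.6, min_conf=0.5):
    """Blend keyword and semantic intent scores; flag low-confidence results.

    Returns (intent, confidence, needs_llm). When needs_llm is True the caller
    would hand the query to the slower LLM classifier.
    """
    fused = {
        intent: kw_weight * kw_scores.get(intent, 0.0)
                + sem_weight * sem_scores.get(intent, 0.0)
        for intent in INTENTS
    }
    best = max(fused, key=fused.get)
    return best, fused[best], fused[best] < min_conf

confident = fuse_intent({"factual": 1.0}, {"factual": 0.5, "conceptual": 0.5})
uncertain = fuse_intent({}, {"exploratory": 0.6})
```

The design point is cost ordering: the two cheap classifiers run on every query, and the roughly 500 ms LLM call is paid only when their blended confidence falls below the floor.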

Hybrid recall + 11-stage weighting + intent-tuned blending + Redis cache.

qdrant_search
"hybrid Dense + BM25 RRF."
qdrant_search - hybrid + RRF
11-Stage Scoring
"11-stage weighted composite final score."
recency / import / trust / feedback / length / decay / PPR / semantic / minscore / noise / dedup
Score Blending
"intent-tuned ratio."
default 0.3/0.7 - intent-tuned
Embedding Cache
"Redis caches query embeddings."
Redis - 24h TTL

Reranking#

Cross-encoder second-stage fine ranking + skip + circuit breaker.

Jina Reranker v3
"MLX cross-encoder".
MLX - 0.6B - cross-encoder
Attention Gate
"3 skip rules, TurboQuant+ inspired."
3 skip rules - TurboQuant+ inspired
Circuit Breaker
"3 failures trigger 600s recovery."
3 failures → 600s recovery
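
The skip rules and the circuit breaker can be sketched together. The 3-failure and 600-second figures come from the card above, and the three skip conditions from the overview (too few candidates, a clear lead, a tight cluster); the class and function names, the injectable clock, and the numeric thresholds `lead_gap` and `tight_spread` are assumptions.

```python
import time

class CircuitBreaker:
    """Opens after `max_failures` consecutive failures, recovers after `recovery_s`."""

    def __init__(self, max_failures=3, recovery_s=600.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.recovery_s = recovery_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def allow(self):
        """True if the reranker may be called right now."""
        if self.opened_at is None:
            return True
        if self.clock() - self.opened_at >= self.recovery_s:
            self.opened_at = None  # recovery window elapsed: close the breaker
            self.failures = 0
            return True
        return False

    def record(self, ok):
        """Report the outcome of one reranker call."""
        if ok:
            self.failures = 0
            return
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()

def should_skip_rerank(scores, min_items=3, lead_gap=0.2, tight_spread=0.02):
    """`scores` must be sorted in descending order."""
    if len(scores) < min_items:              # 2 or fewer candidates
        return True
    if scores[0] - scores[1] >= lead_gap:    # first place far ahead
        return True
    if scores[0] - scores[-1] <= tight_spread:  # tight cluster, nothing to separate
        return True
    return False
```

While the breaker is open, the search results simply keep their first-stage order, so a broken reranker degrades quality slightly instead of failing the whole read path.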

Cascade Recall#

Slow-only five moves: L0 → L1 → L2 → PPR → CRAG.

L0 Triple
"Direct-neighbor lookup on triples."
(s, p, o) lookup
L1 Community
"Leiden community detection."
Leiden detect
L2 Summary
"LLM pre-generated summaries, zero latency."
pre-generated - zero-latency
PPR Walk
"igraph PPR, HippoRAG-inspired."
HippoRAG-inspired igraph PPR
CRAG Verification
"4-mode self-check: default / deep / rlm / none."
evaluate=default/deep/rlm/none
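
To show what the PPR walk computes, here is a small pure-Python power-iteration version of Personalized PageRank. memvault uses igraph for this; the sketch deliberately avoids that dependency, and the function name, the damping value 0.85, the iteration count, and the toy graph are all assumptions.

```python
def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Personalized PageRank by power iteration.

    `graph` maps each node to a list of neighbor nodes (every neighbor must
    also be a key). `seeds` is the set of nodes the walk restarts from.
    """
    nodes = list(graph)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        # With probability (1 - damping) the walker jumps back to a seed.
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for node in nodes:
            neighbors = graph[node]
            if neighbors:
                share = damping * rank[node] / len(neighbors)
                for nb in neighbors:
                    nxt[nb] += share
            else:
                # Dangling node: send its mass back to the seeds.
                for n in nodes:
                    nxt[n] += damping * rank[node] * restart[n]
        rank = nxt
    return rank

graph = {
    "alice": ["bob"],
    "bob": ["alice", "carol"],
    "carol": ["bob"],
    "dave": ["erin"],  # a separate component, unreachable from the seed
    "erin": ["dave"],
}
rank = personalized_pagerank(graph, seeds={"alice"})
```

Starting from the seed entity, nodes that are well connected to it accumulate score, while anything in a disconnected part of the graph stays at zero, which is what keeps the walk local to the question.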

Output#

Budget-aware shaping + output-side sanitize + Slow Thinker bypass.

Output Formatter
"Packs the answer within the token budget."
format=text/json/cards - budget-aware packing
Read-Time Sanitize
"Sanitize once more on the output side."
is_unsafe_for_injection + sanitize_for_injection
Slow Thinker (bypass)
"5-op pipeline in preparation for the next round."
5-op pipeline - Admission Control - VoiceAgentRAG-inspired

IV. The Design Layer Running Through the Three Tracks#

The three tracks share a common design layer: CLT as the philosophical frame, the 4 Event Flows as the self-driven backbone, and the Reactive abstraction as the base for function composition.

Cross-cutting#

CLT triple principle + 4 event flows + Reactive abstraction.

CLT Three Principles
"Cut redundancy / break down complexity / calibrate paradigms, mapped to sanitize / segment / verify."
Sweller 1988 - Cognitive Load Theory
4 Event Flows
"Four event-driven pipelines, each with its own role."
Memory→KG / Capture→KG / Intelligence→Block / Query→SlowThinker
Reactive abstraction
"Operator / Pipeline / Observable / Subject 4-pack".
libs/ops-core - RxJS-inspired
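
The four abstractions can be illustrated with a very small sketch. This is not the `libs/ops-core` API: the class shapes below are hypothetical, modeled loosely on the RxJS concepts the card names, and the `memory_created` event is an invented example of one event flow.

```python
class Pipeline:
    """A chain of operators; each operator is a plain one-argument callable."""

    def __init__(self, *operators):
        self.operators = operators

    def __call__(self, value):
        for op in self.operators:
            value = op(value)
        return value

class Observable:
    """Something that can be subscribed to."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def _emit(self, value):
        for callback in list(self._subscribers):
            callback(value)

class Subject(Observable):
    """An Observable that producers can also push values into."""

    def next(self, value):
        self._emit(value)

# Hypothetical event flow: a created memory is normalized and queued for the KG.
memory_created = Subject()
normalize = Pipeline(str.strip, str.lower)
kg_inbox = []
memory_created.subscribe(lambda text: kg_inbox.append(normalize(text)))
memory_created.next("  ChatGPT is GPT ")
```

The producer only knows the Subject, and each consumer only knows its own pipeline, so a new event flow is added by subscribing another pipeline rather than by editing the write path.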

The deep dives into the three tracks are spread across the trilogy. This post brings them together under one panorama and pulls out the design layers that run through the whole picture: CLT is my design philosophy, the 4 Event Flows are the self-driven backbone, and the Reactive abstraction is the basis for function composition.

What is left out is mostly parameters that need to be tuned over time: scoring weights, prefetch hit rates, cascade bounds, and knowledge graph extraction confidence thresholds. These values are fine-tuned monthly as the data distribution shifts, and are not considered new components.

Of course, no tool can be used forever. Theory, technology, and tools are all still advancing, and memvault will continue to iterate along with them. This post serves as an inventory of the progress so far; if there are any major revisions in the future, we'll explain them in a new post. That is the memvault trilogy for now.

Take this with you.

Panorama Prompt for Your AI Agent#

If you're building your own personal memory system / RAG / AI assistant, paste this prompt into your AI assistant and ask it to review your design at the level of overall architecture to see whether it holds up.

Help me review my current personal memory / RAG system design as a panorama, from the perspective of "Three Tracks × Three Levels of Mind".

The current state of my system:
- Write mechanism: [Briefly describe the processing before content enters memory.]
- Organizing: [Do you regularly organize your memories? How do you decide what to forget?]
- Recall path: [How do you get from a question to an answer?]

Please diagnose it at the following three levels:

A. Completeness of the three tracks
1. Write track: Are the three gates in place (noise filtering, injection and forgery blocking, de-duplication)? Is memory written to both a vector track and a knowledge graph track?
2. Background track: Is there an asynchronous organizing pipeline? How does it identify outdated, contradictory, and orphaned memories? Can the system "learn to forget"?
3. Recall track: Is there intent classification? Does it integrate multi-signal ranking? Is there predictive prefetching (prepared before the user even asks)?

B. Three cross-track layers of mind
4. Cognitive load: Does each stage lower the load for the stage downstream? Is there a "cut redundancy / remove complexity / calibrate paradigm" rule?
5. Event flow: Is the system driven by events rather than by people pushing buttons? Are the four typical event flows (memory to knowledge graph, external capture to graph, intelligence to memory block, query to prefetch) implemented?
6. Reactive abstraction: Is the pipeline composed of minimal operator units? Is the event backbone a pub-sub model? How much does this affect future extensions?

C. Integration
7. How do the three tracks communicate with each other? By direct import or by event decoupling?
8. Is the metadata stamped on the write side (confidence / provenance / valid_at) consumed during organizing and recall ranking?
9. Which details are long-term tuning parameters rather than new components? Would it be better to maintain them separately from the core design?

Finally, list what I'm currently missing, and then tell me the top three things I need to fix, in priority order.

Extended Reading#

Deep-dive chapters of the trilogy + theoretical roots of the cross-track mindfulness.

Resource: why it matters

  • memvault Write Track - after the conversation ends: The first in-depth installment of the trilogy. Three gates (noise / injection + disguise / de-duplication) + dual write (vector + KG) + Provenance + ContentNormalizer with 4 submodules.
  • memvault Background Track - After saving: The second in-depth installment of the trilogy. 12 elements, including the Dream Loop's 5 stages, Knowledge Lint's 4 levels, Multi-Signal Edges, Temporal Conflict, Tier Digest, Surprise Discovery, and more.
  • memvault Read Track - After Thinking About It: The third in-depth installment of the trilogy. Intent classification + Personalized Router + 11-stage scoring + Cascade Recall + Read-Time Sanitize + Slow Thinker prefetching.
  • Cognitive Load Theory (Sweller, 1988): The root of the sanitize / segment / verify design philosophy behind the three tracks and their three gates: work a little harder at write time so that reading is effortless.
  • Enterprise Integration Patterns (Hohpe & Woolf, 2003): The source of the pub-sub / pipe-and-filter design patterns behind the 4 Event Flows. memvault builds its EventBus and Subject abstractions on top of these patterns.
  • RxJS Reactive Extensions: The specific reference for the four abstractions Operator / Pipeline / Observable / Subject. memvault doesn't use RxJS directly, but its abstraction level is aligned with it.