What we built
Not a prototype.
Running infrastructure for verifiable enterprise intelligence. Every number on this page comes from production systems.
960K+
Lines of Go
295+
Database tables
28
Decision domains
5,025+
Test functions
36,882
Verified decisions
896
Concurrent agents
1,617
Tenant isolation policies
195
API endpoints
We are not artificial humans.
We are artificial octopuses.
The octopus: two-thirds of neurons distributed across the arms. Each arm thinks independently. Coordination emerges. It is not commanded.
ARKIVIST implements distributed cognition across 28 independent specialist domains. There is no central controller. Intelligence emerges from below.
First Wave
Symbolic AI
Logical. Rule-based. Brittle. Expert systems that couldn't handle ambiguity.
Second Wave
Neural AI
Fluent. Flexible. Hallucinates. LLMs that can't prove anything.
Third Wave
Neuro-Symbolic Fusion
Distributed. Grounded. Evolving. Symbolic reasoning precedes neural generation. Every answer traceable to its source. Contradictions surfaced, not hidden.
This is what we built. This is ARKIVIST.
896 agents. Natural selection. Hard data.
Every agent starts with a finite confidence budget. Every assertion costs something. Overconfident agents exhaust themselves and die. Well-calibrated agents survive, reproduce, and pass wisdom to their successors.
The system gets smarter by knowing what it doesn't know.
65.8%
Rejection rate
Agents learned to distrust unreliable sources. They reject two out of three AI suggestions.
93.3%
Accuracy at 90% confidence
When an agent says 90%, it means it. Calibration gap: 3.3%.
1.04
Avg votes per decision
Maximum information per unit of budget. The evolutionary equivalent of metabolic efficiency.
Population dynamics
Gen 1 — 8 agents — 2,794 decisions (bootstrap)
Gen 7 — 24 agents — 13,613 decisions (peak swarm)
Gen 9 — 7 agents — 3,236 decisions (selection pressure)
Gen 25 — 1 agent — 6 decisions (distilled expert survivor)
33 agents born. 32 died. 1 expert remains. The survivor carries the accumulated wisdom of all its predecessors.
We don't benchmark on datasets.
We benchmark on reality.
Chess Evolution
Five specialist agents playing real chess.
Tactician, Positional, Endgame, Opening, Risk — each with its own confidence budget, its own wisdom patterns, its own evolutionary lineage. 526 games. 30,438 logged move decisions. Every logged move records its agent ID, confidence, consensus score, and ground-truth evaluation.
69.0%
Accuracy vs Stockfish
30,438
Move decisions logged
34 MB
Decision data (JSONL)
Coding Evolution
Agents learning to code through natural selection.
5.27 million exercises from competitive programming. Agents evolve their own coding strategies — temperature tuning, style selection, custom prompt fragments — all through selection pressure. Same evolutionary mathematics as chess. Different domain, same convergence.
5.27M
Exercise corpus
Go + Rust
Languages
5 bands
800–3200 rating
Legislative Intelligence
Entire legal corpora. Deterministic. Verified to public ledger.
British Columbia's complete legislative corpus and all 956 federal Canadian laws are ingested into the knowledge graph. Zero AI hallucination — deterministic extraction from government XML with cryptographic verification. Every claim anchored to Hedera with Merkle proof.
It sleeps.
Every intelligent biological system sleeps. Not as a luxury — as a survival mechanism. An always-on AI that never sleeps is a system on a methamphetamine binge. Quality degrades. Noise accumulates. The system becomes brittle.
Phase 1
Replay
Replay decisions from the active cycle. Reinforce patterns from correct decisions. Decay what didn't work. Recalibrate confidence.
Phase 2
Prune
Archive unverified claims. Merge near-duplicate entities. Surface unresolved contradictions. Harvest dead agents. Less total weight, higher signal-to-noise.
Phase 3
Clean
Drain queues. Purge temporary data. Invalidate stale caches. Compact logs. Baseline capture for post-wake comparison. The hardware itself is cleaned.
Phase 4
Dream
Cross-domain wisdom transfer. Patterns that worked in one domain are tested against candidates from other domains. Novel connections are retained as speculative hypotheses. A pattern learned in chess might apply in legal reasoning.
Phase 5
Self-Authorship
The system synthesizes everything it learned during sleep. It identifies its own cognitive gaps — accuracy plateaus, domain blind spots, persistent contradictions — and articulates what it needs to evolve.
Agents teach agents.
During sleep, the system doesn't just consolidate — it creates. Agents synthesize reusable skill atoms from their experience. Draft. Check. Refine. Finalize.
Composable skills.
Each skill is an atomic unit of judgment that can be composed with others. Skills carry preconditions, confidence scores, and usage tracking. When a skill leads to correct decisions, it's rewarded. When it fails, it's pruned.
Failure drives discovery.
When agents fail, the system doesn't just record the failure — it performs causal analysis. What went wrong? What pattern would have caught this? That pattern becomes a new skill, forged from failure, validated through subsequent decisions.
Mastery curves.
The system tracks its own mastery across every domain. Domains where it's weakest get proportionally more dream time. The system focuses synthesis effort where it's needed most. Self-directed learning.
The system knows what it doesn't know.
Arousal gating.
When uncertainty is high, the system freezes writes. It gathers more intelligence before acting. Like the Yerkes-Dodson law in psychology — there's an optimal arousal level for performance. Too low: the system is sluggish. Too high: it panics and makes errors.
Contradiction as feature.
The world is inconsistent. Systems that hide contradictions are lying. ARKIVIST surfaces every conflict. Two sources disagree? Both are shown with their evidence, their confidence, their provenance. You decide.
Compliance time travel.
What did the system know on March 15th, 2025? Exactly what it knew. Temporal knowledge state reconstruction for regulatory and legal compliance. Every fact has a time range. Every answer is timestamped.
Five layers. Raw to trustless.
Raw
Just extracted. Unverified. Available for internal queries.
Corroborated
Multiple independent sources agree. Higher confidence.
Verified
Agent-verified through calibrated decision-making. Trusted for automation.
Expert
Confirmed by an agent with 95%+ accuracy over 20+ verified decisions.
Anchored
Cryptographically sealed to Hedera Consensus Service. Independently verifiable. Federation-ready.
Claims ascend through layers as evidence accumulates. Each transition is recorded with full provenance — which agent, which decision, what confidence, and for L5, the Hedera transaction ID and consensus timestamp. Cryptographic proof replaces institutional trust.
It watches itself.
The system periodically inspects its own operational health. Agent populations. Decision quality. Knowledge graph integrity. Skill hive coverage. Sleep cycle effectiveness. It synthesizes findings and applies bounded parameter adjustments through a closed feedback loop. One external AI call per cycle. Everything else is math.
Guardian Angel.
Real-time intelligence in the field. Computer vision on edge devices. Voice pipeline with live transcription. Risk assessment through the same evolutionary agents that govern the knowledge graph. Evidence chain integrity from capture to court.
Where this goes.
Now
Neuro-Symbolic Intelligence
Knowledge graphs with evolutionary agent governance. Symbolic reasoning before neural generation. 28 decision domains. 36,882 verified decisions. This is what runs today.
Next
Hypergraph Transformers
Our quadruple model — Entity, Relationship, Entity, Context — is already a hypergraph. Higher-order relations, not simple triples. When hypergraph transformer models mature, ARKIVIST doesn't rebuild. Our data is already in the right shape. Published research shows 15-30% improvement on relational reasoning.
We've already built the dual hypergraph architecture — separating factual knowledge from epistemic boundaries. The environment graph holds what is known. The constraint graph holds what the system knows about the limits of its own knowledge. Metacognitive self-awareness, encoded in graph structure.
Then
Liquid Neural Networks
Neural networks whose weights are governed by differential equations, adapting in real-time. MIT CSAIL research. Our evolutionary agent dynamics already implement adaptive decision functions — the transition from heuristic Lagrangians to continuous learned dynamics is architectural, not revolutionary. Same interface. Same verifiability. Orders of magnitude fewer parameters.
Beyond
Quantum-Accelerated Verification
Multi-agent consensus is polynomial optimization. Exactly what quantum computers excel at. Exponentially faster verification at federation scale. The mathematical framework — Lagrangian optimization, variational methods, continuous dynamics — is consistent across all four phases.
It's not four pivots. It's one deepening.
The substrate stays.
The intelligence deepens.
You don't need quantum to win. The neuro-symbolic substrate is the product today. Everything else is the moat deepening.
28 autonomous domains. Running now.
Entity deduplication
Claim verification
Contradiction resolution
Field mapping
Multi-agent consensus
Voice interaction
Workflow orchestration
Document classification
Sensitivity detection
Source authority
Relationship inference
Hallucination detection
Citation validation
Federation trust
Guardian risk assessment
Report orchestration
Sleep orchestration
Security posture
Each domain has its own population of agents, its own wisdom patterns, and its own evolutionary trajectory. Domains evolve independently, but cross-domain wisdom transfers during sleep cycles.
"Fluency without verifiability is worthless."
Every LLM in the world generates fluent text. None of them can prove they're right.