Aevrion is a paradigm for distributed data: every mutation is an immutable event on a temporal axis. No coordination. No leader. No consensus. Convergence is inevitable.
Distributed data diverges. Two nodes write the same key. Network partitions. Clocks drift. The fundamental challenge: how do independent nodes arrive at the same state without coordination?
Traditional approaches demand a leader, a quorum, a consensus protocol. They trade availability for consistency, add latency, create single points of failure. What if convergence didn't require any of that?
Time is the only coordinate you need.
If every mutation is an immutable event with a monotonic timestamp, and events are organized in a tree indexed by time, then any two nodes can deterministically compute the same state from the same set of events. No negotiation. No voting. The math guarantees it.
Aevrion is built on one primitive: the Event. Three fields determine identity. One hash makes it content-addressable. Everything else — storage, indexing, sync, conflict resolution — is derived.
The smallest unit of truth. Immutable. Content-addressable. Verifiable.
Three fields define identity: key, value, timestamp. The ID is blake3(key ‖ value ‖ timestamp) — a hash of the content. Same content on different nodes = same ID. No coordination needed.
Content-addressable means deduplication is free. Event already exists? Skip it. Verification is instant — recompute the hash, compare with the ID. Tampered? Hash won't match.
Deletion is an event, not a special case. An event with value: None means "this key is deleted as of this timestamp". The event persists — the delete is a fact in history, not an erasure.
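The primitive above can be sketched in a few lines. A minimal Python model, with SHA-256 from the standard library standing in for BLAKE3, and with field names and payload encoding as illustrative assumptions, not the wire format:

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Event:
    key: str
    value: Optional[str]  # None encodes a delete: a tombstone that is itself an event
    timestamp: int        # monotonic timestamp, part of identity

    @property
    def id(self) -> str:
        # ID = hash(key + value + timestamp): content-addressable.
        # Origin (node_id) is deliberately excluded: same content,
        # same ID, on any node. (SHA-256 here; the design uses BLAKE3.)
        payload = f"{self.key}\x00{self.value}\x00{self.timestamp}"
        return hashlib.sha256(payload.encode()).hexdigest()

# Two nodes independently writing the same content produce the same ID:
a = Event("user:123", '{"name": "Alice"}', 1_700_000_000)
b = Event("user:123", '{"name": "Alice"}', 1_700_000_000)
assert a.id == b.id  # deduplication is free

# Deletion is just an event with value=None, a fact in history:
tombstone = Event("user:123", None, 1_700_000_100)
assert tombstone.id != a.id
```

Verification is the same hash: recompute the ID from the content and compare; any tampering changes the hash.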
No node_id in the event. Two nodes that independently write the same key, value, and timestamp produce the same event, the same ID. Origin is metadata, not identity.
Every put() creates an event that ripples through the entire system.
put("user:123", { name: "Alice" })
The event is written once. Everything else is a reference by ID.
Flat content-addressable storage. id → Event. The only place where event data lives. Put, get, has. That's it.
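A minimal sketch of such a store, assuming a plain in-memory dict; the put/get/has surface follows the description above:

```python
class EventStore:
    """Flat content-addressable storage: id -> event. Put, get, has. That's it."""

    def __init__(self):
        self._by_id = {}

    def put(self, event_id: str, event) -> bool:
        # Content-addressable: same content, same ID.
        # If the ID is already present, the event already exists. Skip it.
        if event_id in self._by_id:
            return False
        self._by_id[event_id] = event
        return True

    def get(self, event_id):
        return self._by_id.get(event_id)

    def has(self, event_id) -> bool:
        return event_id in self._by_id

store = EventStore()
assert store.put("e1", {"key": "user:123"}) is True   # stored once
assert store.put("e1", {"key": "user:123"}) is False  # duplicate skipped for free
```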
Organizes events by time. Leaves store lists of IDs, not events. Hash of a leaf = blake3(sorted(event_ids)). Enables O(log N) diff between any two nodes.
The only component clients see. Maps key → latest event. Full history per key. Point-in-time queries. Scan, prefix search.
Channel + handler + triggers per neighbor. One function: handle(msg). Seven message types. Repair emerges as a chain reaction.
Tree is organized by time. Client asks by key. Merge Index connects both worlds.
get(key)
Latest version of any key. O(log N).
get_history(key)
Full version history. Every mutation, in order.
point_in_time(key, T)
Value of key at any moment in the past.
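The three queries above can be sketched over a per-key history kept sorted by (timestamp, event_id), the same LWW order the design uses for conflicts. A Python sketch under assumed types (string keys, integer timestamps, opaque event IDs):

```python
from bisect import insort, bisect_right
from collections import defaultdict

class MergeIndex:
    """key -> sorted history of (timestamp, event_id, value)."""

    def __init__(self):
        self._history = defaultdict(list)

    def apply(self, key, value, timestamp, event_id):
        # Keep per-key history sorted by (timestamp, event_id):
        # the deterministic LWW order every node agrees on.
        insort(self._history[key], (timestamp, event_id, value))

    def get(self, key):
        """Latest version of the key; None if absent or deleted (tombstone)."""
        h = self._history.get(key)
        return h[-1][2] if h else None

    def get_history(self, key):
        """Full version history: every mutation, in order."""
        return [(ts, value) for ts, _, value in self._history.get(key, [])]

    def point_in_time(self, key, t):
        """Value of the key at moment t: latest event with timestamp <= t."""
        h = self._history.get(key, [])
        i = bisect_right(h, (t, "\uffff"))  # sentinel sorts after any event ID
        return h[i - 1][2] if i else None

idx = MergeIndex()
idx.apply("user:123", '{"name": "Alice"}', 100, "e1")
idx.apply("user:123", '{"name": "Alicia"}', 200, "e2")
assert idx.get("user:123") == '{"name": "Alicia"}'
assert idx.point_in_time("user:123", 150) == '{"name": "Alice"}'
```

Because events are immutable and the order is total, this index is a pure function of the event set and can always be rebuilt from the log.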
Granularity G sets the leaf size. Branching factor B sets how many children per node. Everything else is derived.
leaf_id = timestamp / G
level_N = leaf_id / Bᴺ
One snapshot covers all history. Recent data with minute precision. Old data in month-wide blocks.
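A worked example of this arithmetic. G = 60 s and B = 32 are illustrative values, chosen here only because they reproduce the minute-precision leaves and the ~22-day level-3 blocks mentioned in the text (32³ minutes ≈ 22.75 days); actual parameters are deployment choices:

```python
G = 60   # leaf granularity in seconds (assumed: one leaf per minute)
B = 32   # branching factor (assumed)

def leaf_id(timestamp: int) -> int:
    # One integer division places an event in its leaf.
    return timestamp // G

def block_id(timestamp: int, level: int) -> int:
    # One more division per level gives the path toward the root.
    return leaf_id(timestamp) // (B ** level)

# Seconds covered by one level-3 block: 60 * 32^3 = 1,966,080 s,
# which is about 22.75 days.
span_level3 = G * B ** 3
assert span_level3 == 1_966_080
assert block_id(0, 3) == block_id(span_level3 - 1, 3)  # same 22-day block
assert block_id(span_level3, 3) == 1                   # next block begins
```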
Streets nearby, districts further away, cities on the horizon. One snapshot gives a complete divergence map — recent data with minute-level precision, historical data in broad strokes. O(B × levels) hashes.
Matching blocks are skipped entirely. A matching level-3 block means 22 days verified with a single hash comparison. Only diverged branches are explored deeper. Convergence cost is proportional to actual drift, not total history.
Snapshot → diff → repair. Chain reaction. No coordinator.
Local write → event is immediately sent to every neighbor. Best-effort, instant delivery. If it arrives — great. If not — verify will catch it.
Every G, each node sends a tree snapshot to neighbors — hashes at every level, from fresh leaves to root. The closer to now, the more detail. Like a map: streets nearby, districts further, cities on the horizon.
≤160 hashes · ~6.5 KB · one round-trip
The receiver compares each hash with its own. Match = entire subtree verified. Mismatch at leaf = divergence pinpointed to the minute. Mismatch at level 3 = divergence somewhere in 22 days — drill down.
Mismatch on leaf → EventList → NeedEvents → events flow. Not a mode — a chain reaction of handle() calls. Both sides repair simultaneously. Matching subtrees are never touched.
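The chain reaction can be sketched as diff-and-merge over per-leaf hashes. A toy two-node model in Python, flattened to a single tree level, with SHA-256 standing in for BLAKE3; real snapshots compare hashes level by level before ever reaching leaves:

```python
import hashlib

def leaf_hash(event_ids) -> str:
    # Hash of a leaf = hash of its sorted event IDs (SHA-256 as stand-in).
    return hashlib.sha256("".join(sorted(event_ids)).encode()).hexdigest()

def diff_leaves(mine: dict, theirs: dict):
    """Return leaf IDs whose hashes diverge.
    Matching leaves are skipped entirely: cost tracks drift, not history."""
    diverged = []
    for leaf in set(mine) | set(theirs):
        if leaf_hash(mine.get(leaf, [])) != leaf_hash(theirs.get(leaf, [])):
            diverged.append(leaf)
    return sorted(diverged)

def repair(mine: dict, theirs: dict):
    """Both sides converge on diverged leaves by exchanging missing IDs."""
    for leaf in diff_leaves(mine, theirs):
        union = sorted(set(mine.get(leaf, [])) | set(theirs.get(leaf, [])))
        mine[leaf] = list(union)
        theirs[leaf] = list(union)

node_a = {0: ["e1", "e2"], 1: ["e3"]}
node_b = {0: ["e1", "e2"], 1: ["e3", "e4"]}  # drifted during a partition
assert diff_leaves(node_a, node_b) == [1]    # leaf 0 verified by one hash
repair(node_a, node_b)
assert node_a == node_b                      # convergence
```

In the real protocol each mismatch triggers the next message (EventList, then NeedEvents, then the events themselves), all through the same handle() function.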
All hashes match. Same events, same tree, same merge index state. Every node arrived here independently, following the same deterministic rules. No vote. No leader. Just time and math.
LWW: higher timestamp wins · tie: higher event_id wins
Distributed systems don't fail. They diverge. The question is how you detect and repair it.
Every distributed system faces the same fundamental challenge: nodes drift apart. Network partitions, delayed writes, clock skew, crashed replicas — entropy is not a bug, it's physics. The difference between systems is how they manage it.
Most systems treat anti-entropy as an operational burden — a maintenance task you schedule, monitor, and pray completes before the next one starts. Aevrion treats it as a built-in primitive — continuous, automatic, and proportional to actual drift.
Prevent divergence entirely by coordinating every write through an elected leader. Used in etcd, CockroachDB, TiKV, ZooKeeper.
Every write requires quorum acknowledgment before it's considered committed. Divergence is structurally impossible — but at the cost of availability and latency.
All writes are serialized through a single leader. The leader replicates to a majority before confirming. Leader failure triggers election — writes stall for seconds.
CAP theorem in action: the minority partition is read-only (or unavailable). Network split = degraded service. Cross-datacenter deployments pay the latency tax on every write.
No anti-entropy repairs needed (consistency is built-in), but leader elections, log compaction, and snapshot management require monitoring and tuning.
Allow divergence during normal operation, fix it later with scheduled Merkle tree comparisons. The most widely deployed anti-entropy mechanism.
Merkle trees are built on-demand during repair. Not maintained continuously — rebuilt from scratch on each nodetool repair invocation. Expensive.
Full repair scans every token range regardless of actual drift. Subrange repair exists but requires manual partitioning. Incremental repair adds flags, state, and its own failure modes.
Miss a repair cycle → tombstones resurrect deleted data. gc_grace_seconds is a ticking clock. Repair timeouts, vnodes, compaction interaction — each a source of incidents.
Repair is I/O and network intensive. Running it during peak hours degrades read/write latency. Not running it risks silent inconsistency. Lose-lose scheduling problem.
Encode merge semantics into the data type. Concurrent writes are automatically resolved by mathematical properties of the type (commutativity, associativity, idempotence).
Instead of detecting and repairing divergence, CRDTs ensure that any merge order produces the same result. Elegant in theory — constrained in practice.
G-Counters, PN-Counters, OR-Sets, LWW-Registers. Arbitrary KV with full history, point-in-time queries, and prefix scan doesn't map naturally to CRDT semantics.
Version vectors, causal dots, tombstone sets grow with cluster size. Each key carries per-node metadata. At scale, metadata can exceed payload size.
Custom merge functions require formal proofs of convergence. Edge cases in concurrent delete+add, move operations, and nested types are a research-level problem.
Entropy is not prevented or tolerated — it's continuously measured and repaired. The temporal Merkle tree is always maintained, the progressive digest is always exchanged, convergence is always in progress.
Every G, nodes exchange a progressive digest — ~160 hashes covering all history. Recent data at minute precision, old data in month-wide blocks. Divergence is detected within one cycle.
Matching subtrees are skipped entirely. One hash comparison verifies 22 days of data. Repair touches only diverged branches. Week-long partition? Only the week's events are synced.
No nodetool repair. No gc_grace_seconds. No scheduled maintenance windows. Anti-entropy is a heartbeat, not a job. If the node is running, it's converging.
Last-Write-Wins by timestamp, tiebreaker by event ID. No version vectors, no merge functions, no causal context. Events are pure data. Any node resolves the same conflict the same way.
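The rule fits in one pure function. A sketch, assuming events are represented as (timestamp, event_id, value) tuples:

```python
def resolve(a, b):
    """Deterministic LWW: higher timestamp wins; on a tie, higher event ID.
    Any node running this on the same two events picks the same winner,
    with no causal context and no merge function."""
    return a if (a[0], a[1]) > (b[0], b[1]) else b

assert resolve((200, "aa", "new"), (100, "zz", "old"))[2] == "new"
assert resolve((100, "ab", "x"), (100, "aa", "y"))[2] == "x"  # ID tiebreak
```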
Cassandra asks: "did you remember to run repair?"
Raft asks: "who is the leader right now?"
Aevrion doesn't ask. Repair is always running.
Events are never modified or deleted. History is a fact. Every derived structure can be rebuilt from the event log.
Time is the single organizing dimension. One integer division gives the path from root to leaf. Zero coordination.
ID = hash of content. Same event on any node has the same ID. Deduplication, verification, and routing — all free.
Every node works independently. Reads and writes never block on the network. Sync is a background process.
Merge Index keeps every version of every key. Point-in-time queries, audit trail, and rollback — by design, not as an afterthought.
Push is optimistic. Verify is pessimistic. Together they guarantee that all connected nodes reach the same state. Always.
aevum + -ion — a particle of eternity.
Aevum (Latin) — eternity, the continuous flow of time. The root behind aeon, ever, age. An append-only log means data is written for eternity. A Temporal Merkle Tree means time is the organizing axis. Point-in-time queries mean access to any moment that ever was.
-ion (Greek suffix) — a particle, the smallest indivisible unit. Like photon, electron, graviton — fundamental quanta that carry a force. An event is the quantum of change: the smallest, indivisible unit of mutation that carries truth across the system.
Aevrion is a quantum of eternity — an immutable particle of data that travels through time and inevitably reaches every node.
Explore the concepts, read the architecture, or start building.