aegis-replication

Distributed Replication and Raft Consensus for Aegis Database Platform.

Overview

Raft-based consensus and replication for distributed Aegis deployments. Provides leader election, log replication, and distributed state machine.

Modules

node.rs

Node identification and management:

  • NodeId - Unique node identifier
  • NodeStatus - Starting, Healthy, Suspect, Down, Leaving, Left
  • NodeRole - Follower, Candidate, Leader
  • NodeInfo - Address, port, status, role, metadata
  • NodeMetadata - Datacenter, rack, zone, tags, capacity
  • NodeHealth - Health check result with metrics

log.rs

Replicated log for Raft consensus:

  • LogIndex - Index in the replicated log
  • Term - Raft term number
  • LogEntry - Entry with index, term, type, data, timestamp
  • EntryType - Command, ConfigChange, NoOp
  • ReplicatedLog - Append, get, commit, truncate, compact operations
  • Conflict detection for log reconciliation

state.rs

State machine abstraction:

  • Command - Get, Set, Delete, CompareAndSwap, Increment, Custom
  • CommandType - Operation type enum
  • CommandResult - Success/error with value and applied index
  • StateMachine - Key-value store with versioning
  • Snapshot - State machine snapshot for recovery

raft.rs

Core Raft consensus algorithm:

  • RaftConfig - Election timeout, heartbeat interval, snapshot threshold
  • RaftState - Persistent state (term, voted_for, commit_index)
  • VoteRequest / VoteResponse - Leader election
  • AppendEntriesRequest / AppendEntriesResponse - Log replication
  • InstallSnapshotRequest / InstallSnapshotResponse - Snapshot transfer
  • RaftNode - Full Raft implementation with election and replication

cluster.rs

Cluster coordination:

  • ClusterConfig - Min/max nodes, heartbeat, failure timeout, quorum
  • ClusterState - Initializing, Forming, Healthy, Degraded, NoQuorum
  • Cluster - Node management, health tracking, leader management
  • ClusterStats - Cluster health statistics
  • MembershipChange - Add, remove, update node operations

transport.rs

Network transport layer:

  • MessageType - VoteRequest, AppendEntries, Heartbeat, etc.
  • Message - Raft protocol message with payload
  • MessagePayload - Serialized request/response data
  • ClientRequest / ClientResponse - Client operations
  • Transport trait - Send, receive, broadcast interface
  • InMemoryTransport - Testing transport
  • ConnectionPool - Peer connection management

engine.rs

Main replication engine:

  • ReplicationConfig - Combined Raft and cluster config
  • ReplicationEngine - Coordinates Raft and cluster
  • TickResult - Event loop tick output
  • State machine operations (propose, get, set, delete)
  • Message processing for distributed communication

Usage Example

use aegis_replication::*;

// Create replication engine
let node = NodeInfo::new("node1", "127.0.0.1", 5000);
let mut engine = ReplicationEngine::new(node, ReplicationConfig::default());

// Add peers
let peer = NodeInfo::new("node2", "127.0.0.1", 5001);
engine.add_peer(peer)?;

// Become leader (after election)
let request = engine.start_election();
// ... handle votes from peers ...

// If leader, propose commands
if engine.is_leader() {
    let index = engine.set("key1", b"value1".to_vec())?;

    // Apply committed entries
    let results = engine.apply_committed();
}

// Read values
let value = engine.get("key1");

Raft Node Example

use aegis_replication::*;

// Create Raft node
let node = RaftNode::new("node1", RaftConfig::default());
node.add_peer(NodeId::new("node2"));
node.add_peer(NodeId::new("node3"));

// Start election
let vote_request = node.start_election();

// Handle vote response
let vote_response = VoteResponse {
    term: 1,
    vote_granted: true,
    voter_id: NodeId::new("node2"),
};
let became_leader = node.handle_vote_response(&vote_response);

// If leader, propose commands
if node.is_leader() {
    let command = Command::set("key", b"value".to_vec());
    let index = node.propose(command)?;
}

// Apply committed entries
let results = node.apply_committed();

Cluster Management Example

use aegis_replication::*;

// Create cluster
let local_node = NodeInfo::new("node1", "127.0.0.1", 5000);
let config = ClusterConfig::new("my-cluster")
    .with_replication_factor(3)
    .with_heartbeat_interval(Duration::from_secs(1));

let cluster = Cluster::new(local_node, config);

// Add nodes
cluster.add_node(NodeInfo::new("node2", "127.0.0.1", 5001))?;
cluster.add_node(NodeInfo::new("node3", "127.0.0.1", 5002))?;

// Check cluster health
let stats = cluster.stats();
println!("Healthy: {}, Has quorum: {}", stats.healthy_nodes, stats.has_quorum);

// Handle heartbeats
cluster.heartbeat(&NodeId::new("node2"));

// Check for failures
let failed = cluster.check_failures();

Tests

49 tests covering all modules:

  • Node management and health
  • Replicated log operations
  • State machine commands
  • Raft leader election
  • Log replication
  • Cluster coordination
  • Transport messaging
  • Replication engine