Storage System

Content-addressed blob store with RocksDB indexing, Reed-Solomon erasure coding, Merkle proof verification, and on-chain challenge/response auditing.

Architecture

Upload → Content hash → Chunk (fixed-size splits) → Erasure code (Reed-Solomon) → Distribute (N nodes) → Verify (Merkle proof)

Module Overview

The aleph-storage crate exports:

| Module | Key Types | Purpose |
|--------|-----------|---------|
| engine | StorageEngine | Core blob store: put, get, delete, exists |
| index | StorageIndex, BlobMetadata | RocksDB index for content-addressed lookup |
| cache | ContentCache, EvictionPolicy | In-memory LRU/LFU cache layer |
| cached_engine | CachedStorageEngine | Engine + cache composition |
| chunking | ChunkingEngine, ChunkManifest | Fixed-size chunking with manifests |
| merkle | MerkleTree, MerkleProof | Merkle tree construction and proof generation |
| proofs | StorageProofGenerator, ChallengeResponder | On-chain storage proof generation |
| replication | ErasureEncoder, ReplicationManager | Reed-Solomon encoding and shard placement |
| ipfs | CidV0, CidV1, IpfsGateway | IPFS CID compatibility and gateway |
| gc | (none) | Garbage collection: cleanup of unreferenced blobs |

Content Addressing

All data is stored by its content hash (SHA-256). The StorageEngine provides the core interface:

pub trait StorageEngine {
    async fn put(&self, data: &[u8]) -> Result<ContentHash>;
    async fn get(&self, hash: &ContentHash) -> Result<Vec<u8>>;
    async fn exists(&self, hash: &ContentHash) -> bool;
    async fn delete(&self, hash: &ContentHash) -> Result<()>;
    async fn size(&self, hash: &ContentHash) -> Result<u64>;
}
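As a concrete (if simplified) illustration of this contract, the sketch below implements an in-memory engine. It is synchronous and uses std's DefaultHasher in place of SHA-256 so it stays dependency-free; MemEngine and content_hash are illustrative names, not part of aleph-storage.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Stand-in for the crate's SHA-256 ContentHash. DefaultHasher keeps this
// sketch dependency-free but is NOT cryptographically secure.
type ContentHash = u64;

fn content_hash(data: &[u8]) -> ContentHash {
    let mut h = DefaultHasher::new();
    data.hash(&mut h);
    h.finish()
}

// Minimal synchronous analogue of the StorageEngine contract.
#[derive(Default)]
struct MemEngine {
    blobs: HashMap<ContentHash, Vec<u8>>,
}

impl MemEngine {
    fn put(&mut self, data: &[u8]) -> ContentHash {
        let hash = content_hash(data);
        // Idempotent: identical bytes always map to the identical key.
        self.blobs.insert(hash, data.to_vec());
        hash
    }
    fn get(&self, hash: &ContentHash) -> Option<&[u8]> {
        self.blobs.get(hash).map(|v| v.as_slice())
    }
    fn exists(&self, hash: &ContentHash) -> bool {
        self.blobs.contains_key(hash)
    }
}

fn main() {
    let mut engine = MemEngine::default();
    let h = engine.put(b"hello");
    assert!(engine.exists(&h));
    assert_eq!(engine.get(&h), Some(&b"hello"[..]));
    // Re-putting the same content returns the same address.
    assert_eq!(h, engine.put(b"hello"));
}
```

Because the key is derived from the bytes themselves, storing the same content twice is a no-op, which is what makes deduplication free in a content-addressed store.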

Chunking

Large files are split into fixed-size chunks (default: 256 KiB, configurable between MIN_CHUNK_SIZE and MAX_CHUNK_SIZE). Each chunk is stored independently and tracked by a ChunkManifest:

pub struct ChunkManifest {
    pub content_hash: ContentHash,   // hash of original file
    pub chunks: Vec<ChunkInfo>,
    pub total_size: u64,
}

pub struct ChunkInfo {
    pub hash: ContentHash,
    pub offset: u64,
    pub size: u32,
}
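The splitting itself is straightforward. This sketch (with the per-chunk hash omitted, and a hypothetical chunk helper that is not part of the crate's API) shows how manifest offsets and sizes line up:

```rust
// Mirrors the default described above; MIN/MAX bounds are omitted here.
const CHUNK_SIZE: usize = 256 * 1024; // 256 KiB

// Simplified ChunkInfo: the per-chunk content hash is left out so the
// sketch stays self-contained.
struct ChunkInfo {
    offset: u64,
    size: u32,
}

// Split a byte slice into fixed-size chunks and record manifest entries.
fn chunk(data: &[u8], chunk_size: usize) -> Vec<ChunkInfo> {
    data.chunks(chunk_size)
        .enumerate()
        .map(|(i, c)| ChunkInfo {
            offset: (i * chunk_size) as u64,
            size: c.len() as u32,
        })
        .collect()
}

fn main() {
    let data = vec![0u8; 600 * 1024]; // a 600 KiB file
    let chunks = chunk(&data, CHUNK_SIZE);
    assert_eq!(chunks.len(), 3); // 256 + 256 + 88 KiB
    assert_eq!(chunks[1].offset, 256 * 1024);
    assert_eq!(chunks[2].size, 88 * 1024); // final chunk may be short
}
```

Only the last chunk may be shorter than CHUNK_SIZE, so offsets are always exact multiples of the chunk size.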

Merkle Trees & Proofs

Each storage commitment generates a Merkle tree from chunk hashes. The root is committed on-chain in the StorageRegistry contract.

// Build Merkle tree from chunks
let tree = MerkleTree::from_leaves(&chunk_hashes);
let root = tree.root();

// Generate proof for a specific chunk
let proof = tree.proof(chunk_index);

// Verify proof (done on-chain via StorageRegistry)
assert!(proof.verify(root, leaf_hash));
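To make the proof mechanics concrete, here is a toy Merkle tree over std's DefaultHasher; the crate's MerkleTree presumably hashes with SHA-256, and levels, proof, and verify are illustrative names rather than its API. Odd-width levels duplicate their last node.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

type H = u64;

fn hash_pair(a: H, b: H) -> H {
    let mut s = DefaultHasher::new();
    (a, b).hash(&mut s);
    s.finish()
}

// Build all tree levels bottom-up, duplicating the last node when a
// level has odd width.
fn levels(leaves: &[H]) -> Vec<Vec<H>> {
    let mut lv = vec![leaves.to_vec()];
    while lv.last().unwrap().len() > 1 {
        let prev = lv.last().unwrap();
        let next: Vec<H> = prev
            .chunks(2)
            .map(|p| hash_pair(p[0], *p.get(1).unwrap_or(&p[0])))
            .collect();
        lv.push(next);
    }
    lv
}

// A proof is the sibling hash at each level plus a flag for whether the
// sibling sits on the right.
fn proof(leaves: &[H], mut idx: usize) -> Vec<(H, bool)> {
    let mut path = Vec::new();
    for level in levels(leaves).iter().take_while(|l| l.len() > 1) {
        let sib = idx ^ 1;
        let sib_hash = *level.get(sib).unwrap_or(&level[idx]);
        path.push((sib_hash, sib > idx));
        idx /= 2;
    }
    path
}

fn verify(root: H, mut acc: H, path: &[(H, bool)]) -> bool {
    for &(sib, sib_right) in path {
        acc = if sib_right { hash_pair(acc, sib) } else { hash_pair(sib, acc) };
    }
    acc == root
}

fn main() {
    // Toy leaf hashes standing in for chunk hashes.
    let leaves: Vec<H> = (0..5u64).map(|i| hash_pair(i, i)).collect();
    let root = levels(&leaves).last().unwrap()[0];
    let p = proof(&leaves, 3);
    assert!(verify(root, leaves[3], &p));
}
```

The proof size is logarithmic in the chunk count, which is what makes on-chain verification affordable: the contract re-hashes one sibling per level instead of re-reading the data.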

Erasure Coding

Reed-Solomon encoding provides data redundancy: the encoder produces data shards plus parity shards, and the ReplicationManager distributes them across nodes so the original data can be reconstructed even if up to parity_shards shards are lost.

// Encode with Reed-Solomon
let encoder = ErasureEncoder::new(
    data_shards,    // e.g., 4
    parity_shards,  // e.g., 2 (tolerates 2 failures)
);

let shards: Vec<Shard> = encoder.encode(&data)?;

// Place shards across nodes
let placements = ReplicationManager::place_shards(
    &shards,
    &available_nodes,
    replication_factor,
)?;
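Reed-Solomon arithmetic over GF(2^8) is beyond a short sketch, but the single-parity special case reduces to XOR and shows the recovery idea: any one lost shard can be rebuilt from the parity shard and the survivors. The functions below are illustrative, not the ErasureEncoder API.

```rust
// Simplified illustration: one XOR parity shard recovers any single lost
// data shard. Real Reed-Solomon (as in ErasureEncoder) works over
// GF(2^8) and supports multiple parity shards.
fn xor_parity(shards: &[Vec<u8>]) -> Vec<u8> {
    let mut parity = vec![0u8; shards[0].len()];
    for s in shards {
        for (p, b) in parity.iter_mut().zip(s) {
            *p ^= b;
        }
    }
    parity
}

// Rebuild a missing shard by XORing the parity with every survivor.
fn recover(surviving: &[Vec<u8>], parity: &[u8]) -> Vec<u8> {
    let mut out = parity.to_vec();
    for s in surviving {
        for (o, b) in out.iter_mut().zip(s) {
            *o ^= b;
        }
    }
    out
}

fn main() {
    let data = vec![vec![1u8, 2, 3], vec![4, 5, 6], vec![7, 8, 9]];
    let parity = xor_parity(&data);
    // Lose shard 1; rebuild it from parity and the other two shards.
    let rebuilt = recover(&[data[0].clone(), data[2].clone()], &parity);
    assert_eq!(rebuilt, data[1]);
}
```

Reed-Solomon generalizes this: with parity_shards = 2, any two of the six shards in the example configuration above can be lost and the data still reconstructed.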

Challenge/Response

Storage providers must prove data possession via on-chain challenges:

  1. Challenger issues a challenge with a random seed via StorageRegistry.issueChallenge()
  2. Node computes the challenge response using the seed to select which chunks to prove
  3. Node submits Merkle proof via StorageRegistry.respondToChallenge()
  4. Contract verifies proof on-chain using MerkleProof.verify()
  5. Failed challenges trigger slashing via StakingManager.slash()

// Challenge response flow (node side)
let responder = ChallengeResponder::new(&storage_engine);

let response: ChallengeResponse = responder
    .respond(challenge_seed, commitment_id)
    .await?;

// Submit proof on-chain
storage_registry
    .respondToChallenge(
        challenge_id,
        response.proof,
        response.leaf,
    )
    .send().await?;
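Step 2's seed-driven chunk selection can be sketched as follows. The actual derivation inside ChallengeResponder is not shown in this document, so the xorshift-based selection below is purely illustrative; what matters is that node and contract derive the same indices from the same seed.

```rust
// Derive which chunk indices to prove from the challenge seed.
// Any deterministic map from (seed, chunk_count) to indices works, as
// long as both sides compute it identically.
fn challenged_indices(seed: u64, chunk_count: u64, samples: usize) -> Vec<u64> {
    let mut state = seed;
    (0..samples)
        .map(|_| {
            // xorshift64: a tiny deterministic PRNG; fine for index
            // selection because unpredictability comes from the seed.
            state ^= state << 13;
            state ^= state >> 7;
            state ^= state << 17;
            state % chunk_count
        })
        .collect()
}

fn main() {
    let a = challenged_indices(42, 1000, 3);
    let b = challenged_indices(42, 1000, 3);
    assert_eq!(a, b); // same seed, same chunks: the contract can re-derive them
    assert!(a.iter().all(|&i| i < 1000));
}
```

Because the seed is random and issued on-chain, a provider cannot predict which chunks will be challenged and so must retain all of them.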

Caching

The CachedStorageEngine wraps the base engine with an in-memory cache supporting LRU and LFU eviction policies:

let cache = ContentCache::new(CacheConfig {
    max_size_bytes: 512 * 1024 * 1024, // 512 MiB
    eviction_policy: EvictionPolicy::LRU,
});

let engine = CachedStorageEngine::new(base_engine, cache);
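For intuition about the eviction behavior, here is a minimal entry-counting LRU; the real ContentCache tracks bytes via max_size_bytes, and LruCache is an illustrative name rather than the crate's type.

```rust
use std::collections::{HashMap, VecDeque};

// Minimal LRU sketch: capacity is counted in entries, not bytes.
struct LruCache {
    cap: usize,
    map: HashMap<u64, Vec<u8>>,
    order: VecDeque<u64>, // front = least recently used
}

impl LruCache {
    fn new(cap: usize) -> Self {
        Self { cap, map: HashMap::new(), order: VecDeque::new() }
    }
    fn get(&mut self, key: u64) -> Option<&Vec<u8>> {
        if self.map.contains_key(&key) {
            self.touch(key); // a hit makes the entry most recently used
        }
        self.map.get(&key)
    }
    fn put(&mut self, key: u64, value: Vec<u8>) {
        if self.map.insert(key, value).is_none() && self.map.len() > self.cap {
            if let Some(old) = self.order.pop_front() {
                self.map.remove(&old); // evict the least recently used entry
            }
        }
        self.touch(key);
    }
    fn touch(&mut self, key: u64) {
        self.order.retain(|&k| k != key);
        self.order.push_back(key);
    }
}

fn main() {
    let mut cache = LruCache::new(2);
    cache.put(1, vec![1]);
    cache.put(2, vec![2]);
    cache.get(1);          // key 1 is now most recently used
    cache.put(3, vec![3]); // evicts key 2
    assert!(cache.get(2).is_none());
    assert!(cache.get(1).is_some());
    assert!(cache.get(3).is_some());
}
```

An LFU policy would instead track hit counts and evict the least frequently used entry, which favors stable hot content over recent one-off reads.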