g4titan

Using Merkle Patricia Tree to verify Transfer Events in Mirage's Escrow Contract

This blog explains the mathematics behind the Merkle Patricia Trie (MPT) proof used in verifying ERC-20 transfers in Mirage's Escrow contract.

Model

Let nibble be a 4-bit value which can represent any of the hexadecimal digits {0..f}. Now, let a key k represent a finite sequence of nibbles. Then we describe a node N as either a:

A node is RLP‑encoded as RLP(N) and its hash can be described as H(N) := keccak256(RLP(N)).

Child reference invariant: In Ethereum's Patricia trie, a child node can be referenced in two ways:

This optimization reduces storage overhead for small nodes while maintaining cryptographic integrity for larger ones. The 32-byte boundary is exact - if len(RLP(N_child)) == 32, the child is still referenced by hash H(N_child), not inlined.

// Handle child reference resolution
bytes32 hash;
if (nextHash.length == 32) {
    // Hash reference - load 32 bytes as hash
    assembly {
        hash := mload(add(nextHash, 32))
    }
} else {
    // Inline reference - hash the embedded data
    hash = keccak256(nextHash);
}

Hex‑Prefix (Compact) Encoding

A nibble path s = s0 s1 … s_{m−1} and a flag isLeaf ∈ {0,1} are encoded as bytes using a leading flag nibble. Let isOdd := m mod 2. The first (high) flag nibble is (isLeaf ? 2 : 0) + (isOdd ? 1 : 0).

Flag Examples:
Extension node, even path length: flag = `0`
Extension node, odd path length: flag = `1` 
Leaf node, even path length: flag = `2`
Leaf node, odd path length: flag = `3`

If isOdd = 1, the next half‑byte is s0 and the remaining bytes pack s1 … by pairs; if isOdd = 0, a zero half‑byte is inserted and the remaining bytes pack s0 … by pairs. Then decoding returns: (isLeaf, segment) where segment is the exact nibble sequence represented.

function extractKeyFromNode(bytes memory nodeKey) private pure returns (bytes memory) {
    if (nodeKey.length == 0) return nodeKey;

    uint8 firstByte = uint8(nodeKey[0]);
    bool isOdd = (firstByte & 0x10) != 0;  // Check odd flag

    bytes memory result;
    if (isOdd) {
        // Odd length: first nibble in low 4 bits of first byte
        result = new bytes(nodeKey.length * 2 - 1);
        result[0] = bytes1(firstByte & 0x0f);
        for (uint256 i = 1; i < nodeKey.length; i++) {
            result[i * 2 - 1] = bytes1(uint8(nodeKey[i]) >> 4);
            result[i * 2] = bytes1(uint8(nodeKey[i]) & 0x0f);
        }
    } else {
        // Even length: skip first byte's low nibble
        result = new bytes(nodeKey.length * 2 - 2);
        for (uint256 i = 1; i < nodeKey.length; i++) {
            result[(i - 1) * 2] = bytes1(uint8(nodeKey[i]) >> 4);
            result[(i - 1) * 2 + 1] = bytes1(uint8(nodeKey[i]) & 0x0f);
        }
    }
    return result;
}

Keys for the Receipts Trie

For the receipts trie, the key is not hashed. It is the nibble expansion of the RLP(minimal big‑endian bytes of txIndex):

k := nibbles( RLP( txIndex_bytes_minimal ) ).

Here txIndex_bytes_minimal is the canonical big‑endian representation with no leading zeros. nibbles(b0 b1 … bn) = { high(b0), low(b0), …, high(bn), low(bn) }.

Example -> For transaction index 300:
`txIndex = 300 = 0x012c`
`txIndex_bytes_minimal = [0x01, 0x2c]` (no leading zeros)
`RLP([0x01, 0x2c]) = [0x82, 0x01, 0x2c]` (RLP string encoding)
`nibbles([0x82, 0x01, 0x2c]) = [8, 2, 0, 1, 2, c]` (6 nibbles for MPT traversal)

RLP List Structure Parsing

Before processing any node, the verifier must correctly parse RLP list prefixes:

This prefix parsing is required for block headers, node lists, and log arrays.

function skipItem(bytes calldata data, uint256 offset) internal pure returns (uint256) {
    require(offset < data.length, "RLP offset out of bounds");
    uint8 prefix = uint8(data[offset]);

    if (prefix < 0x80) {
        return offset + 1;  // Single byte
    } else if (prefix < 0xb8) {
        return offset + 1 + (prefix - 0x80);  // Short string
    } else if (prefix < 0xc0) {
        // Long string
        uint256 lengthBytes = prefix - 0xb7;
        uint256 length = 0;
        for (uint256 i = 0; i < lengthBytes; i++) {
            length = (length << 8) | uint8(data[offset + 1 + i]);
        }
        return offset + 1 + lengthBytes + length;
    } else if (prefix < 0xf8) {
        return offset + 1 + (prefix - 0xc0);  // Short list
    } else {
        // Long list
        uint256 lengthBytes = prefix - 0xf7;
        uint256 length = 0;
        for (uint256 i = 0; i < lengthBytes; i++) {
            length = (length << 8) | uint8(data[offset + 1 + i]);
        }
        return offset + 1 + lengthBytes + length;
    }
}

Correctness Conditions (Inclusion Proof)

The verification process combines three layers of security:

  1. Blockchain anchoring: Prove the block header corresponds to a known canonical block
  2. Inclusion: Use MPT to prove the receipt exists in that block's receipts trie
  3. Semantic validation: Parse the receipt to verify it contains the expected Transfer event

This creates an unforgeable proof that a specific ERC-20 transfer occurred in a specific block.

Given:

The verifier accepts iff the following hold:

1. Root Binding

The first node N0 is consistent with R:

function extractReceiptsRoot(bytes calldata blockHeader) internal pure returns (bytes32) {
    uint256 offset = 0;
    
    // Skip RLP list prefix
    require(blockHeader[offset] >= 0xc0, "Invalid RLP list");
    if (blockHeader[offset] >= 0xf8) {
        offset += 1 + (uint8(blockHeader[offset]) - 0xf7);
    } else {
        offset += 1;
    }

    // Skip first 5 fields to get to receiptsRoot (index 5)
    for (uint256 i = 0; i < 5; i++) {
        offset = blockHeader.skipItem(offset);
    }

    // Extract receiptsRoot (32 bytes)
    require(blockHeader[offset] == 0xa0, "Invalid receiptsRoot RLP encoding");
    offset += 1;

    bytes32 receiptsRoot;
    assembly {
        receiptsRoot := calldataload(add(blockHeader.offset, offset))
    }
    return receiptsRoot;
}

2. Path Walk

Starting at N0 with remaining path k, repeat:

function processBranchNode(
    bytes memory node,
    uint256 nodeOffset,
    bytes memory key,
    uint256 keyOffset,
    bytes memory value
) private pure returns (bool success, uint256 newKeyOffset, bytes32 newHash) {
    if (keyOffset >= key.length * 2) {
        // Check value in branch node (at index 16)
        uint256 valueOffset = nodeOffset;
        for (uint256 i = 0; i < 16; i++) {
            valueOffset += node.getItemLength(valueOffset);
        }
        (bytes memory nodeValue,) = node.parseItem(valueOffset);
        if (keccak256(nodeValue) == keccak256(value)) {
            return (true, keyOffset, bytes32(0)); // Found value
        } else {
            return (false, 0, bytes32(0));
        }
    }

    // Navigate to next branch using nibble as index
    uint8 nibble = uint8(key[keyOffset / 2]);
    if (keyOffset % 2 == 0) {
        nibble = nibble >> 4;  // High nibble
    } else {
        nibble = nibble & 0x0f;  // Low nibble
    }

    // Get the branch at this nibble
    uint256 branchOffset = nodeOffset;
    for (uint256 i = 0; i < nibble; i++) {
        branchOffset += node.getItemLength(branchOffset);
    }

    (bytes memory nextHash,) = node.parseItem(branchOffset);
    if (nextHash.length == 0) {
        return (false, 0, bytes32(0)); // Empty branch
    }

    // Resolve child reference
    bytes32 hash;
    if (nextHash.length == 32) {
        assembly { hash := mload(add(nextHash, 32)) }
    } else {
        hash = keccak256(nextHash);
    }

    return (true, keyOffset + 1, hash);  // Advance by one nibble
}

3. Leaf Value Equality

On Leaf, interpret value as a byte string and require value == receipt (exact bytes equality).

If any step fails, the proof is rejected. Otherwise, by collision resistance of keccak256 and the uniqueness of RLP encoding, the receipt is uniquely determined under root R at key k.

Typed vs Legacy Receipts

function validateTransferInReceipt(
    bytes calldata receiptRlp,
    uint256 logIndex,
    address tokenContract,
    address fromAddress,
    address toAddress,
    uint256 expectedAmount
) internal pure returns (bool) {
    uint256 offset = 0;

    // Handle typed receipts (EIP-2718)
    if (receiptRlp.length > 0 && uint8(receiptRlp[0]) < 0x80) {
        offset += 1; // Skip typed receipt prefix
    }

    // Skip RLP list prefix and navigate to logs
    require(uint8(receiptRlp[offset]) >= 0xc0, "Invalid receipt RLP");
    if (uint8(receiptRlp[offset]) >= 0xf8) {
        offset += 1 + (uint8(receiptRlp[offset]) - 0xf7);
    } else {
        offset += 1;
    }

    // Skip status, cumulativeGasUsed, bloom to get to logs
    for (uint256 i = 0; i < 3; i++) {
        offset = receiptRlp.skipItem(offset);
    }
    
    // ... continue to validate specific log
}

How it works

TL;DR

Suppose we want to prove receipt R exists at transaction index 5:
- key k = nibbles of RLP(5) = nibbles([0x05]) = [0, 5]
- Start at root node N0
- If N0 is Branch: use nibble 0 to get child, move to that node, remaining = [5]
- If next node is Branch: use nibble 5 to get child, move there, remaining = []
- If final node is Leaf with value R: SUCCESS!

Setup:

Given:
- receiptsRoot R (32-byte hash from block header)
- proof nodes π = [N0, N1, N2, ..., Nm] (sequence of RLP-encoded nodes)
- key k (nibble sequence representing transaction index)
- receipt bytes r (the receipt we want to prove exists)

Step 1: Verify Root

currentNode := N0
assert keccak256(RLP(currentNode)) == receiptsRoot
remainingPath := k  // copy of full key path

Step 2: Traverse Trie (repeat until done)

while not at leaf:
    decode currentNode from RLP
    
    if currentNode is Branch (has 17 items):
        // Branch nodes: [child0, child1, ..., child15, optionalValue]
        
        if remainingPath is empty:
            // We're at the end of our path - check if this branch stores our value
            if currentNode.value == receipt:
                return SUCCESS
            else:
                return FAIL
        
        // Get next direction: use first nibble of remaining path
        direction := remainingPath.takeFirst()  // 0-15 (hex digit)
        remainingPath := remainingPath.dropFirst()
        
        childReference := currentNode.children[direction]
        
        if childReference is empty:
            return FAIL  // dead end
        
        // Move to child node
        if len(childReference) == 32:
            // It's a hash - find the node that hashes to this value
            currentNode := findNodeWhere(keccak256(RLP(node)) == childReference)
        else:
            // It's inline data - use it directly
            currentNode := RLP_decode(childReference)
    
    else:  // currentNode is Leaf or Extension (has 2 items)
        // KV nodes: [encodedPath, valueOrReference]
        
        (isLeaf, pathSegment) := hex_prefix_decode(currentNode.encodedPath)
        
        // Check if our remaining path starts with this node's path segment
        if not remainingPath.startsWith(pathSegment):
            return FAIL  // path mismatch
        
        // Consume the matching path segment
        remainingPath := remainingPath.dropPrefix(pathSegment)
        
        if isLeaf:
            // This is a leaf - we should be at the end of our path
            if remainingPath is not empty:
                return FAIL  // path too long
            
            if currentNode.value == receipt:
                return SUCCESS  // found it!
            else:
                return FAIL  // wrong value
        
        else:  // Extension node
            // Follow the reference to continue traversal
            childReference := currentNode.reference
            
            if len(childReference) == 32:
                currentNode := findNodeWhere(keccak256(RLP(node)) == childReference)
            else:
                currentNode := RLP_decode(childReference)

Why Anchoring to blockhash Works

Let H be the provided RLP header. We check keccak256(H) == blockhash(n) and that extractBlockNumber(H) == n. This guarantees that R = extractReceiptsRoot(H) comes from a canonical block within the EVM's 256‑block lookback. The MPT proof then shows receipt is committed under R. Together these imply the receipt existed in block n.

Why 256 blocks? The EVM's blockhash(n) opcode only provides access to the most recent 256 block hashes. Beyond this window, blockhash(n) returns 0, making verification impossible. This creates a natural "freshness" requirement for proofs.

The receipts root R is located at index 5 in the block header structure: [parentHash, sha3Uncles, miner, stateRoot, transactionsRoot, receiptsRoot, logsBloom, difficulty, number, ...]