fproof

command module
v0.0.0-...-d41face
Published: Mar 1, 2026 License: MIT Imports: 1 Imported by: 0

fproof

Notarize documents on the Ethereum blockchain.

fproof is a CLI tool that notarizes S3 objects on the Ethereum blockchain. For each object it stores a digital fingerprint that can no longer be modified and that serves as proof of the original document. If you need to prove that documents haven't been modified since they were first stored, fproof gives you that guarantee, backed by the immutability and transparency of blockchains.

Storage on blockchain is very expensive and transaction fees are volatile, so you can't store the documents on blockchain directly. Instead, fproof uses blockchain as a notary service, storing proofs of documents. Merkle trees compress all digital fingerprints in a batch into a single root hash, so each batch needs only one transaction however many documents it contains. The root hash is sent as a transaction to blockchain, while the Merkle proof of each individual hash is stored as Amazon S3 metadata with the document. That way, the proof can never be separated from the document itself: you can retrieve it by querying the object's metadata.

Verification can then be done by recomputing the branch of the Merkle tree. If the original document is still the same as during Merkle tree creation, the verification step results in the same root hash. With the root hash retrieved from the blockchain, fproof proves two things: the original document was part of the Merkle tree at its original creation, and the document existed when the root hash was stored on blockchain.

Based on the approach described in Notarize documents on the Ethereum Blockchain.

Installation

Requires Go 1.25+.

go install github.com/eerzho/fproof@latest

Configuration

Create a .fproof.yaml file or pass flags directly. Every config field can be overridden with a CLI flag.

s3:
  endpoint: https://s3.amazonaws.com
  access-key: YOUR_ACCESS_KEY
  secret-key: YOUR_SECRET_KEY
  bucket: your-bucket
  prefix: documents/
  region: us-east-1
  use-path-style: false
eth:
  rpc-url: https://mainnet.infura.io/v3/YOUR_PROJECT_ID
  private-key: YOUR_HEX_PRIVATE_KEY
  chain-id: 1

Flag                  Description                            Default
--config              Path to YAML config file               .fproof.yaml
--s3-endpoint         S3-compatible endpoint URL             -
--s3-access-key       S3 access key ID                       -
--s3-secret-key       S3 secret access key                   -
--s3-bucket           Target S3 bucket name                  -
--s3-prefix           S3 object key prefix                   -
--s3-region           S3 region                              us-east-1
--s3-use-path-style   Use path-style S3 addressing           false
--eth-rpc-url         Ethereum JSON-RPC endpoint             -
--eth-private-key     Hex-encoded private key for signing    -
--eth-chain-id        Ethereum chain ID                      -

Commands

fproof commit

Hash S3 objects and anchor Merkle roots to Ethereum.

fproof commit --config .fproof_example.yaml --prefix 100mb

The commit pipeline takes all objects from Amazon S3 and hashes them, aggregates the individual hashes into a Merkle tree, sends the root hash as a transaction to blockchain, and stores the Merkle proof of each individual hash as Amazon S3 metadata with the document:

  1. Gets a list of all objects in Amazon S3 with a specific prefix, grouped into chunks (--chunk-size, default 1000)
  2. For each object, retrieves it from Amazon S3 and generates its SHA-256 hash in parallel (--concurrency, default 5)
  3. Creates the Merkle tree as a pairwise hash tree — the hashes form the tree's leaves, then builds the tree bottom-up until one root hash remains
  4. Sends the root hash as a transaction to blockchain (zero-value self-transaction with root as calldata)
  5. Stores the Merkle proof of each individual hash as Amazon S3 metadata with the document (fproof-tx-id, fproof-root, fproof-path, fproof-siblings)
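
Steps 1 and 2 above can be sketched in Go as follows. This is a minimal illustration, not fproof's actual code: `chunk`, `hashAll`, and the `fetch` callback are hypothetical names, and the in-memory map stands in for S3 downloads. It shows how a key list might be split into chunks and how a counting semaphore can cap the number of parallel downloads.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sync"
)

// chunk splits keys into consecutive groups of at most size elements.
func chunk(keys []string, size int) [][]string {
	if size < 1 {
		size = 1
	}
	var out [][]string
	for len(keys) > 0 {
		n := size
		if n > len(keys) {
			n = len(keys)
		}
		out = append(out, keys[:n])
		keys = keys[n:]
	}
	return out
}

// hashAll fetches every key and returns the SHA-256 digests in input order,
// running at most `concurrency` fetches at the same time.
func hashAll(keys []string, concurrency int, fetch func(key string) ([]byte, error)) ([][32]byte, error) {
	if concurrency < 1 {
		concurrency = 1
	}
	hashes := make([][32]byte, len(keys))
	errs := make([]error, len(keys))
	sem := make(chan struct{}, concurrency) // counting semaphore
	var wg sync.WaitGroup
	for i, key := range keys {
		wg.Add(1)
		sem <- struct{}{} // blocks while `concurrency` fetches are in flight
		go func(i int, key string) {
			defer wg.Done()
			defer func() { <-sem }()
			body, err := fetch(key)
			if err != nil {
				errs[i] = err
				return
			}
			hashes[i] = sha256.Sum256(body)
		}(i, key)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return hashes, nil
}

func main() {
	// In-memory stand-in for the bucket; a real run would download each object.
	bucket := map[string][]byte{"100mb/a": []byte("A"), "100mb/b": []byte("B"), "100mb/c": []byte("C")}
	fetch := func(key string) ([]byte, error) { return bucket[key], nil }
	keys := []string{"100mb/a", "100mb/b", "100mb/c"}
	for n, group := range chunk(keys, 2) {
		hashes, err := hashAll(group, 5, fetch)
		if err != nil {
			panic(err)
		}
		fmt.Printf("chunk %d: %d leaf hashes\n", n, len(hashes))
	}
}
```

Each goroutine writes only to its own slice index, so the results keep the listing order without extra locking.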

Flag                Description                      Default
-p, --prefix        S3 key prefix filter             -
-c, --concurrency   Max parallel S3 operations       5
-s, --chunk-size    Objects per Merkle tree chunk    1000

fproof verify

Verify an S3 object's proof against the blockchain.

fproof verify --config .fproof_example.yaml --key 100mb/file_001.bin

Verification is a fairly simple computation. It only requires a sequence of hash operations, whose length is bounded by the height of the tree:

  1. Retrieves the object from Amazon S3 and generates its SHA-256 hash
  2. Retrieves the proof — the Merkle proof stored as Amazon S3 metadata with the document (fproof-tx-id, fproof-root, fproof-path, fproof-siblings)
  3. Recomputes the branch of the Merkle tree by doing the pairwise hashing from the leaf to the root using the proof hashes
  4. Retrieves the root hash from the blockchain transaction and checks if the calculated root matches the one retrieved from the blockchain

If the calculated root matches the one retrieved from the blockchain, it proves that the original document was part of the Merkle tree at its original creation and that the document existed when the root hash was stored on blockchain. If any step fails — the document was modified, metadata was tampered with, or the on-chain root doesn't match — verification fails.
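
The four verification steps can be sketched as below. The metadata encoding is an assumption made for illustration only: the source names the keys (fproof-tx-id, fproof-root, fproof-path, fproof-siblings) but not their format, so this sketch guesses hex strings, a comma-separated sibling list, and a path written as a string of '0' and '1' characters. `verifyObject`, `demoFixture`, and the `fetchCalldata` callback are hypothetical; in a real run the callback would read the transaction input from an Ethereum node.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// hashPair returns SHA-256(left ‖ right) using a fresh buffer.
func hashPair(left, right []byte) [32]byte {
	buf := make([]byte, 0, len(left)+len(right))
	buf = append(buf, left...)
	return sha256.Sum256(append(buf, right...))
}

// verifyObject recomputes the Merkle branch for body and compares the result
// with the calldata of the anchoring transaction.
func verifyObject(body []byte, meta map[string]string, fetchCalldata func(txID string) ([]byte, error)) error {
	v := sha256.Sum256(body) // step 1: hash the object

	// Step 2: read the proof from the (assumed) metadata encoding.
	path := meta["fproof-path"]
	var siblings []string
	if s := meta["fproof-siblings"]; s != "" {
		siblings = strings.Split(s, ",")
	}
	if len(path) != len(siblings) {
		return errors.New("path and siblings differ in length")
	}

	// Step 3: pairwise hashing from the leaf up to the root.
	for j, sHex := range siblings {
		s, err := hex.DecodeString(sHex)
		if err != nil || len(s) != sha256.Size {
			return fmt.Errorf("sibling %d is not a 32-byte hex string", j)
		}
		switch path[j] {
		case '0':
			v = hashPair(v[:], s)
		case '1':
			v = hashPair(s, v[:])
		default:
			return fmt.Errorf("invalid path bit at position %d", j)
		}
	}

	// Step 4: compare with the root stored on chain.
	onChain, err := fetchCalldata(meta["fproof-tx-id"])
	if err != nil {
		return err
	}
	if !bytes.Equal(onChain, v[:]) {
		return errors.New("computed root does not match on-chain root")
	}
	return nil
}

// demoFixture builds a two-leaf tree and a fake chain lookup for the demo.
func demoFixture() ([]byte, map[string]string, func(string) ([]byte, error)) {
	body, other := []byte("contract bytes"), []byte("invoice bytes")
	leaf, sib := sha256.Sum256(body), sha256.Sum256(other)
	root := hashPair(leaf[:], sib[:])
	meta := map[string]string{
		"fproof-tx-id":    "0xabc", // placeholder, not a real transaction hash
		"fproof-root":     hex.EncodeToString(root[:]),
		"fproof-path":     "0",
		"fproof-siblings": hex.EncodeToString(sib[:]),
	}
	fetch := func(txID string) ([]byte, error) {
		if txID != "0xabc" {
			return nil, errors.New("unknown transaction")
		}
		return root[:], nil
	}
	return body, meta, fetch
}

func main() {
	body, meta, fetch := demoFixture()
	fmt.Println("original:", verifyObject(body, meta, fetch))
	fmt.Println("modified:", verifyObject(append(body, '!'), meta, fetch))
}
```

The sketch trusts only the on-chain value: a tampered fproof-root entry in the metadata cannot make a modified document pass, because the comparison is against the calldata.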

Flag        Description                Default
-k, --key   S3 object key to verify    -

How Merkle Trees Work

Merkle trees are very useful for proving that a particular data point is part of a data structure. fproof places the document hashes in a Merkle tree, which aggregates many hashes (the leaves of the tree) into one so-called root hash. Bottom-up, the hashes are combined pairwise until only one hash remains, which forms the root of the tree.

Given n objects with hashes h₁, h₂, …, hₙ, the tree is constructed bottom-up:

        root = H(h₁₂ ‖ h₃₄)
           /              \
    h₁₂ = H(h₁ ‖ h₂)    h₃₄ = H(h₃ ‖ h₄)
       /      \              /      \
     h₁        h₂            h₃       h₄

Here H is SHA-256 and ‖ denotes concatenation.
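
A minimal Go sketch of this bottom-up construction follows. `hashPair` and `merkleRoot` are illustrative names, not fproof's API. One detail is an assumption: the source does not say how a level with an odd number of nodes is handled, so the sketch pairs the last node with itself.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// hashPair returns H(left ‖ right) with H = SHA-256.
func hashPair(left, right [32]byte) [32]byte {
	buf := make([]byte, 0, 64)
	buf = append(buf, left[:]...)
	buf = append(buf, right[:]...)
	return sha256.Sum256(buf)
}

// merkleRoot combines the leaf hashes pairwise, level by level, until a single
// hash remains. Assumption: an odd-sized level duplicates its last node.
func merkleRoot(leaves [][32]byte) ([32]byte, error) {
	if len(leaves) == 0 {
		return [32]byte{}, errors.New("no leaves")
	}
	level := append([][32]byte(nil), leaves...) // copy; the input stays untouched
	for len(level) > 1 {
		if len(level)%2 == 1 {
			level = append(level, level[len(level)-1])
		}
		next := make([][32]byte, 0, len(level)/2)
		for i := 0; i < len(level); i += 2 {
			next = append(next, hashPair(level[i], level[i+1]))
		}
		level = next
	}
	return level[0], nil
}

func main() {
	var leaves [][32]byte
	for _, doc := range []string{"doc-1", "doc-2", "doc-3", "doc-4"} {
		leaves = append(leaves, sha256.Sum256([]byte(doc)))
	}
	root, err := merkleRoot(leaves)
	if err != nil {
		panic(err)
	}
	fmt.Println("root:", hex.EncodeToString(root[:]))
}
```

For four leaves this computes exactly the tree drawn above: two pair hashes on the first level and one root hash on the second.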

We can verify the existence of a specific document. We need two additional data points: first, the so-called proof (sibling hashes) for an element — the hashes to do the pairwise hashing without recreating the entire tree each time. Second, the actual root hash, which we can retrieve from the blockchain.
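
One way to collect the sibling hashes for a single leaf is sketched below. `buildProof` is a hypothetical helper, not fproof's API. It records one direction bit and one sibling per level while the tree is built, again assuming that an odd-sized level pairs its last node with itself. A bit of 0 means the running hash is the left input at that level, 1 means it is the right input.

```go
package main

import (
	"crypto/sha256"
	"errors"
	"fmt"
)

// hashPair returns SHA-256(left ‖ right).
func hashPair(left, right [32]byte) [32]byte {
	return sha256.Sum256(append(left[:], right[:]...))
}

// buildProof returns the root together with the direction bits and sibling
// hashes needed to recompute the branch of leaves[index].
func buildProof(leaves [][32]byte, index int) (root [32]byte, path []byte, siblings [][32]byte, err error) {
	if index < 0 || index >= len(leaves) {
		return root, nil, nil, errors.New("index out of range")
	}
	level := append([][32]byte(nil), leaves...)
	for len(level) > 1 {
		if len(level)%2 == 1 {
			level = append(level, level[len(level)-1]) // assumed odd-level rule
		}
		path = append(path, byte(index&1))          // 0 = left child, 1 = right child
		siblings = append(siblings, level[index^1]) // neighbour in the same pair
		next := make([][32]byte, 0, len(level)/2)
		for i := 0; i < len(level); i += 2 {
			next = append(next, hashPair(level[i], level[i+1]))
		}
		level, index = next, index/2
	}
	return level[0], path, siblings, nil
}

func main() {
	var leaves [][32]byte
	for _, doc := range []string{"doc-1", "doc-2", "doc-3", "doc-4"} {
		leaves = append(leaves, sha256.Sum256([]byte(doc)))
	}
	_, path, siblings, err := buildProof(leaves, 2)
	if err != nil {
		panic(err)
	}
	fmt.Printf("path=%v, %d sibling hashes\n", path, len(siblings))
}
```

For the third of four leaves the proof holds two hashes: the fourth leaf, and the combined hash of the first two leaves.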

Formally: given a leaf hash hᵢ, a path direction vector p ∈ {0, 1}^d (where d = ⌈log₂(n)⌉), and sibling hashes s₁, s₂, …, s_d:

v₀ = hᵢ
vⱼ = H(vⱼ₋₁ ‖ sⱼ)  if pⱼ = 0
vⱼ = H(sⱼ ‖ vⱼ₋₁)  if pⱼ = 1

The proof is valid iff v_d = R (the root hash from blockchain).
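
The recurrence translates almost line for line into Go. The sketch below is illustrative (the `verifyProof` name and signature are not taken from fproof); it folds the sibling hashes into the running value v and compares the result with R.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// hashPair returns H(left ‖ right) with H = SHA-256.
func hashPair(left, right [32]byte) [32]byte {
	return sha256.Sum256(append(left[:], right[:]...))
}

// verifyProof applies v_j = H(v_{j-1} ‖ s_j) when p_j = 0 and
// v_j = H(s_j ‖ v_{j-1}) when p_j = 1, then checks v_d against the root.
func verifyProof(leaf [32]byte, path []byte, siblings [][32]byte, root [32]byte) bool {
	if len(path) != len(siblings) {
		return false
	}
	v := leaf // v_0 = h_i
	for j, s := range siblings {
		switch path[j] {
		case 0:
			v = hashPair(v, s)
		case 1:
			v = hashPair(s, v)
		default:
			return false
		}
	}
	return v == root
}

func main() {
	// Four-leaf tree built by hand so the proof for the third leaf is explicit.
	h1 := sha256.Sum256([]byte("doc-1"))
	h2 := sha256.Sum256([]byte("doc-2"))
	h3 := sha256.Sum256([]byte("doc-3"))
	h4 := sha256.Sum256([]byte("doc-4"))
	h12, h34 := hashPair(h1, h2), hashPair(h3, h4)
	root := hashPair(h12, h34)

	path := []byte{0, 1}           // h3 is a left child, then h34 is a right child
	siblings := [][32]byte{h4, h12} // s_1, s_2

	fmt.Println("original verifies:", verifyProof(h3, path, siblings, root))
	tampered := sha256.Sum256([]byte("doc-3 (edited)"))
	fmt.Println("tampered verifies:", verifyProof(tampered, path, siblings, root))
}
```

The loop runs once per tree level, so a proof for a chunk of 1,000 objects needs 10 hash operations.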

With this approach:

  • Compression: with x elements in the tree, verification needs only about log₂(x) hash operations. With 1,000,000 objects, that's ~20 hashes instead of 1,000,000. Each tree needs a single transaction, however many leaves it holds.
  • Tamper detection: changing even a single bit in any object produces a completely different root hash. If the original document is still the same as during Merkle tree creation, the verification step results in the same root hash. Forging a valid proof would require finding a SHA-256 collision (~2¹²⁸ operations), which is computationally infeasible.
  • Independent verification: anyone with the object, proof metadata, and access to the blockchain can independently verify integrity — no trusted third party required.

Why Store on Blockchain

Although blockchain can store data immutably, it's very restricted on the amount of data. Each byte stored on blockchain is fairly expensive. The high transaction fees and volatility in those fees lead to two insights: we can't store the documents on blockchain directly, and even storing proofs only for every document individually is too expensive. Instead, we have to compress many proofs into one transaction so that we can reduce the number of transactions drastically.

fproof solves this by storing only the root hash of each Merkle tree on blockchain. The root hash is sent as the calldata of a zero-value self-transaction (sending 0 ETH to your own address). This costs close to the minimum possible gas on Ethereum. Because fproof batches objects into Merkle trees (up to 1,000 per chunk by default), the per-object cost is negligible. Transaction cost remains manageable because each chunk needs one transaction carrying a fixed-size root hash, so the total scales with the number of chunks rather than with the number of documents.
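
As a rough worked example of this arithmetic, the sketch below estimates the gas for a transaction whose calldata is one 32-byte root. The constants are assumptions drawn from the Ethereum protocol rules as commonly documented, not from fproof: 21,000 base gas, 16 and 4 gas per non-zero and zero calldata byte (EIP-2028), and the calldata floor of 10 gas per token introduced by EIP-7623. Check them against the network you target. `calldataGas` is a hypothetical helper.

```go
package main

import "fmt"

const (
	baseTxGas        = 21000 // intrinsic cost of any transaction
	nonZeroByteGas   = 16    // per non-zero calldata byte (EIP-2028)
	zeroByteGas      = 4     // per zero calldata byte
	floorGasPerToken = 10    // calldata floor price per token (EIP-7623)
)

// calldataGas returns the intrinsic gas under the standard schedule and under
// the EIP-7623 floor; with no contract execution, the larger value applies.
func calldataGas(data []byte) (standard, floor uint64) {
	var zero, nonZero uint64
	for _, b := range data {
		if b == 0 {
			zero++
		} else {
			nonZero++
		}
	}
	standard = baseTxGas + zero*zeroByteGas + nonZero*nonZeroByteGas
	tokens := zero + 4*nonZero
	floor = baseTxGas + tokens*floorGasPerToken
	return standard, floor
}

func main() {
	root := make([]byte, 32)
	for i := range root {
		root[i] = 0xff // worst case: every byte of the root is non-zero
	}
	standard, floor := calldataGas(root)
	gas := standard
	if floor > gas {
		gas = floor
	}
	const chunkSize = 1000 // default --chunk-size
	fmt.Printf("standard=%d floor=%d charged=%d\n", standard, floor, gas)
	fmt.Printf("gas per object at chunk size %d: %.2f\n", chunkSize, float64(gas)/chunkSize)
}
```

Under these assumed constants a root with no zero bytes comes to 21,512 gas on the standard schedule and 22,280 under the floor, which is roughly 22 gas per object for a full chunk of 1,000.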

Due to the immutability and transparency of blockchains, they can be a useful tool for notarizing documents:

  • Immutability: blockchains can store values immutably so that they can be audited at a later point. Once a transaction is included in a block, the root hash stored in the transaction's calldata cannot be altered or deleted. Ethereum's consensus mechanism with thousands of independent validators makes retroactive changes practically impossible.
  • Timestamping: each block has a timestamp agreed upon by the network's validators. When fproof stores a root hash on blockchain, this serves as proof that the document existed when the root hash was stored on blockchain — a stronger guarantee than a traditional trusted third party because it doesn't rely on a single entity's honesty.
  • Public verifiability: anyone can independently query any transaction and read the calldata. Verification requires no special permissions or trust relationships. You can verify a proof using any Ethereum node, any block explorer, or fproof itself.
  • No smart contract required: fproof uses simple self-transactions rather than deploying a smart contract. This minimizes gas costs and eliminates smart contract risk. The data is stored in the transaction itself — permanently available through any Ethereum node.

Differences from the AWS Approach

The AWS approach described in Notarize documents on the Ethereum Blockchain deploys a smart contract with storeNewRootHash and verify functions. fproof takes a different path that is simpler, cheaper, and has a smaller attack surface.

No smart contract. The AWS solution stores the root hash by calling a smart contract function that emits an event. A smart contract is code on blockchain — it can contain bugs, it can be deployed as upgradeable (allowing the owner to change verification logic after the fact), and it introduces attack surface (reentrancy, access control errors, etc.). fproof stores the root hash directly in the transaction's calldata as raw bytes. There is no code on blockchain, nothing to exploit, nothing that can be upgraded. The data in a transaction is immutable by definition — not by the correctness of a contract, but by the protocol itself.

Off-chain verification. The AWS smart contract has an on-chain verify function that recomputes the Merkle branch inside the EVM. This is convenient (anyone can call it without installing software), but every verification costs gas and relies on the contract code being correct. fproof verifies entirely off-chain: it reads tx.Data from any Ethereum node, recomputes the branch locally, and compares the root. Verification is free, can be done offline once the transaction data is fetched, and doesn't depend on any deployed code.

Lower gas costs. A smart contract deployment costs hundreds of thousands of gas. Each storeNewRootHash call costs more than a simple transaction because of contract execution overhead and event emission. fproof uses a simple self-transaction with the root hash as calldata, which costs close to the minimum possible on Ethereum.

Local Development

Start MinIO and Anvil via Docker Compose:

task up

Build the binary:

task build

Run locally:

./fproof commit --config .fproof_example.yaml --prefix 100mb
./fproof verify --config .fproof_example.yaml --key 100mb/file_001.bin

Directories

Path Synopsis
internal
pkg
eth
mt
s3