Fuxing Loh

Jun 2, 2024

Bitcoin, Ethereum Layers, and then Chainfile: A 4-Step Plan to Simplify and Accelerate Blockchain Adoption

This is a write-up of Chainfile, a side project I’ve been developing over the past few months. It draws on my experience as a software engineer and reflects my passion for advancing blockchain technology and promoting its adoption.

While widespread blockchain adoption is becoming inevitable, the current state is far from the early pioneers’ vision. We are chipping away at the core tenets of decentralization, transparency, and security by relying on centralized services, opaque systems, and insecure practices, all in the name of convenience. Convenience has fueled the industry’s growth, but we are far from the utopia that was envisioned. Attempting to shoehorn convenience into the blockchain space without sacrificing the core tenets of the technology may seem impossible. Still, with the proper execution, resources, and benevolent actors, we might just get there.

Bitcoin’s early days

Let’s rewind the clock to the early days of Bitcoin when a single binary called bitcoind was all that was needed to participate in the network. It was simple, self-contained, and easy to grasp. Having access to the binary meant you could receive, send, and validate Bitcoin transactions. And that was it—you were part of the network. There was little mention of wallets, miners, indexers, explorers, or APIs; those came later.

That simple whitepaper, Bitcoin: A Peer-to-Peer Electronic Cash System, by its still-unknown inventor(s), is arguably the most important piece of literature behind this revolution.

Bitcoin was designed to be backward-compatible and still is to this day: your bitcoind from 2009 can still perform the same functions (or some of them) as it did back then, only without the optimizations and improvements made since. In my opinion, Bitcoin’s dogmatic nature makes it beautiful, and the community that has worked relentlessly to keep it that way is what makes it unique. Certain actors with their own agendas have often tried to steer the narrative for their own gain, and various forks have emerged. Still, the protocol has remained almost pure in its original form.

Simplicity = Democratization

The Bitcoin era was defined by its simplicity, which democratized access to the network. You can still, in some fashion, join in, validate the network, and participate in the consensus with just four commands.

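# Pre-built Linux x86_64 release; pick the archive that matches your platform.
wget https://bitcoincore.org/bin/bitcoin-core-27.0/bitcoin-27.0-x86_64-linux-gnu.tar.gz
tar -xzf bitcoin-27.0-x86_64-linux-gnu.tar.gz
./bitcoin-27.0/bin/bitcoind -daemon
./bitcoin-27.0/bin/bitcoin-cli sendtoaddress bc1... 1.234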

Even though 90% to 99% of the network is now run by custodians through third-party providers, the simplicity largely remains intact for developers. Using commodity hardware and a few more commands, you can rig together a setup to start accepting payments without engaging or entrusting any third party beyond the provided binary: zero trust and all that buzz.

Connectivity = Democratization

With increasing adoption, the network grew: the number of transactions increased, and more demanding setups naturally emerged, such as mining on specialized hardware, air-gapped wallets, light and lite wallets, and indexers to keep track of the ever-growing UTXO set. The once-simple Bitcoin node was no longer sufficient to cater to the growing needs of the participants; convenience and a competitive edge became the name of the game.

However, one thing that remained constant was the ever-growing need for connectivity. You can offload the mining, private keys, and UTXO tracking, but you can't offload the consensus. Every network action is still traced back to calling an RPC method on the Bitcoin node. You can cache, sign, and submit transactions or blocks separately, but the node is the only source of truth. Only the Bitcoin node can validate the transactions and blocks, and your balance is only as good as the node you trust.
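To make "calling an RPC method on the Bitcoin node" concrete, here is a minimal sketch of the JSON-RPC envelope that bitcoind accepts and returns. The helper names (`buildRpcRequest`, `unwrapRpcResponse`) are made up for illustration, the block height is a placeholder value, and the HTTP transport is left out so the sketch stays dependency-free.

```typescript
// Request envelope accepted by Bitcoin Core's JSON-RPC server.
interface RpcRequest {
  jsonrpc: "1.0";
  id: string;
  method: string;
  params: unknown[];
}

// Response envelope: exactly one of `result` or `error` is non-null.
interface RpcResponse<T> {
  result: T | null;
  error: { code: number; message: string } | null;
  id: string;
}

// Hypothetical helper: build the body for a method such as "getblockcount".
function buildRpcRequest(method: string, params: unknown[] = [], id = "1"): RpcRequest {
  return { jsonrpc: "1.0", id, method, params };
}

// Hypothetical helper: return the result, or throw if the node reported an error.
function unwrapRpcResponse<T>(response: RpcResponse<T>): T {
  if (response.error !== null) {
    throw new Error(`RPC error ${response.error.code}: ${response.error.message}`);
  }
  return response.result as T;
}

// The serialized body would be POSTed to the node's RPC port with HTTP Basic auth.
const rpcBody = JSON.stringify(buildRpcRequest("getblockcount"));

// A response shaped like the node's reply (the height is a placeholder).
const blockHeight = unwrapRpcResponse<number>({ result: 850000, error: null, id: "1" });
```

Wallets, indexers, and explorers all end up wrapping this same envelope, which is why the node remains the single source of truth however much is cached or offloaded around it.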

Naturally, a single-node setup was no longer sufficient during this adoption phase. HA, RTO, RPO, and all the other fun DevOps acronyms emerged. The blockchain infrastructure team emerged to manage the growing complexity of the network(s) and build customized and opinionated setups to cater to their organization’s growing needs.

Simultaneously, innovations are pushing the boundaries of blockchain, changing how we run nodes. Emerging networks increase the number of nodes we have to run while juggling the maintenance of different protocols and their hard fork schedules. Or, you could just outsource it to a third-party provider. Some did, but some can't due to compliance, security, jurisdictional, and other reasons. (More on this later, including why a third-party provider isn't the best solution for the future of blockchain.) As much as we hate it, laws and regulations still govern how we operate, and the toughest ones are within the finance sector, which blockchain is disrupting.

The Ethereum boom

Ethereum’s introduction of smart contracts brought a transformative wave of innovation to the blockchain space, significantly increasing the complexity of running blockchain nodes. Unlike Bitcoin, which primarily functions as a decentralized ledger, Ethereum transformed the blockchain into a decentralized computer capable of executing complex code. This shift enabled the creation of decentralized applications (dApps) that required nodes to handle more than just simple transactions; they now had to process and validate smart contract operations.

The increased functionality came with substantial challenges. Running an Ethereum node became far more complex due to the need to handle faster block times, higher transaction volumes, and more intricate (or massive) state management. The data requirements for nodes skyrocketed from gigabytes to terabytes, necessitating advanced infrastructure and robust storage solutions. Additionally, maintaining an Ethereum node required continuous updates and configurations to keep up with the network’s rapid evolution and increasing demands.

Organizations, particularly exchanges, faced a flood of requests to integrate Ethereum and its associated technologies. This pressure often forced them to either neglect these innovations or struggle to keep up with the frequent upgrades and new developments. Managing nodes became a significant operational challenge, involving setting up sidecar services to keep track of the network state and transactions.

Despite these hurdles, the potential of Ethereum (and Ethereum-like networks) has driven many organizations to invest heavily in their infrastructure to remain competitive. Others throw in the towel and outsource the entire node management to third-party providers, killing the protocol’s decentralization dream.

The Shift from PoW to PoS

2022 marked a significant milestone for Ethereum with the transition from Proof of Work (PoW) to Proof of Stake (PoS). This event, known as “the Merge,” signified a fundamental change in how the Ethereum network achieves consensus. PoW, which relies on computational power to solve complex mathematical problems, was replaced by PoS, where validators are chosen based on the amount of cryptocurrency they hold and are willing to stake as collateral.

The Merge was monumental. It promised to reduce Ethereum’s energy consumption by over 99%, addressing a major criticism of blockchain technology. It also aimed to improve scalability and security, making the network more efficient and accessible. However, this transition introduced new complexities. Running a PoS node required different configurations and a deeper understanding of a two-client system (consensus and execution).

Unfortunately, this increased complexity has impacted the network’s democratization. Running a PoS node effectively now requires more sophisticated setups, demanding greater technical knowledge and resources. As a result, many individuals and smaller entities find it challenging to become validators independently. This barrier has led to a reliance on centralized services to stake their ETH, as these services can manage the technical and operational demands on behalf of users. Consequently, the decentralization ethos of Ethereum faces challenges as the network sees a concentration of staked ETH within a few large service providers, reducing overall decentralization.

The advent of more innovations, such as sharding and rollups, will undoubtedly add new dimensions to the complexity of running a node. These advancements, while necessary for scaling and enhancing the network, will require even more sophisticated infrastructure and tools, further increasing the operational challenges for organizations and individuals alike. This ongoing evolution underscores the need for solutions that can lower the barrier to entry for validators, preserving the decentralized nature of the Ethereum network.

Destined for centralization?

Currently, node providers are the go-to solution for managing the intricate process of running a blockchain node. They cater to enterprise requirements, ensuring high availability and redundancy. However, this reliance on a few providers raises concerns about centralization and the potential risks it brings, such as single points of failure and a preference for specific clients, which reduces the diversity of the network.

With most blockchain protocols (except those of Bitcoin and Ethereum) today, users rely on RPC services provided by the foundations or core teams as the primary means of interacting with the network. This reliance on centralized services undermines the decentralized ethos of blockchain technology, as it essentially becomes a centralized service if the foundation controls the nodes.

To mitigate these concerns and maintain decentralization, the community must prioritize developing tools and frameworks that simplify node operation. Lowering the barrier to entry involves offering better documentation, user-friendly interfaces, and robust support for third-party providers. The blockchain ecosystem can reduce dependency on centralized services by making it easier for diverse participants to run their own nodes. Or, paradoxically, lowering the barrier to entry can create many more points of centralization and, by spreading control across them, achieve more decentralization.

Chainfile: the 4-step plan

Here, I lay out a 4-step ecosystem plan to restore the simplicity of running a blockchain node and return democratized access to the blockchain, whatever the scale, complexity, tenancy, or protocol, and so accelerate the adoption of blockchain technology.

Simply put, Chainfile is an open-source ecosystem that defines, tests, deploys, and scales blockchain nodes on container orchestration platforms. While utilizing containers is necessary to take advantage of innovations in container orchestration, it is a detail that upstream consumers do not have to worry about.

The 4-step ecosystem plan of Chainfile:

  1. Define once using a simple, non-Turing-complete JSON schema that is easy to compose, maintain, and upgrade.
{
  "$schema": "https://chainfile.org/schema.json",
  "caip2": "bip122:000000000019d6689c085ae165831e93",
  "name": "Bitcoin Mainnet",
  "containers": {
    //...
  }
}
  2. Test locally using containers to ensure your blockchain application integrates well with the rest of your stack.
import definition from '@chainfile/hardhat/localhost.json';

const testcontainers = await ChainfileTestcontainers.start(definition);
  3. Deploy anywhere using Docker Compose, Kubernetes, or any other container-orchestration platform.
chainfile-docker synth @chainfile/bitcoin/bitcoind.json
docker compose up
  4. Scale effortlessly with planet-scale cloud constructs that are future-proof to run on decentralized computing when ready.
import { ChainfileChart } from 'chainfile-cdk8s';
import mainnet from '@chainfile/geth-lighthouse/mainnet.json';

class App extends Stack {
  constructor(scope: App, id: string, props?: StackProps) {
    super(scope, id, props);

    new ChainfileChart(this, 'geth-lighthouse', {
      definition: mainnet,
      spec: { replicas: 1 },
    });
  }
}
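To give a feel for the deploy step, here is a rough sketch of what a "synth" pass could do: walk a definition and emit one Docker Compose service per container. This is not the chainfile-docker implementation. The `synthCompose` function, the trimmed-down `Definition` type, and the one-to-one port mapping are all assumptions made for illustration; only the Compose fields (`image`, `ports`, `environment`) are standard.

```typescript
// Trimmed-down definition, limited to fields that appear in this article.
interface ContainerDef {
  image: string;
  endpoints?: Record<string, { port: number }>;
  environment?: Record<string, string>;
}

interface Definition {
  name: string;
  containers: Record<string, ContainerDef>;
}

// Minimal shape of a Docker Compose service.
interface ComposeService {
  image: string;
  ports: string[];
  environment: Record<string, string>;
}

// Hypothetical synth step: map every container onto one Compose service.
function synthCompose(definition: Definition): { services: Record<string, ComposeService> } {
  const services: Record<string, ComposeService> = {};
  for (const [name, container] of Object.entries(definition.containers)) {
    // Assumption: each endpoint is published on the same host port.
    const ports = Object.values(container.endpoints ?? {}).map((e) => `${e.port}:${e.port}`);
    services[name] = {
      image: container.image,
      ports,
      environment: container.environment ?? {},
    };
  }
  return { services };
}

const composeFile = synthCompose({
  name: "Hardhat",
  containers: {
    hardhat: {
      image: "ghcr.io/fuxingloh/hardhat-container:2.22.1",
      endpoints: { rpc: { port: 8545 } },
    },
  },
});
```

Because the definition is plain data, the same walk could just as well target a Kubernetes manifest, which is what keeps the container details out of the way of upstream consumers.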

Chainfile: The anatomy

The starting point of any blockchain in Chainfile is the definition. It is a simple JSON schema that defines the characteristics of running a blockchain node. Making it non-Turing-complete forces the definition to be self-contained, easy to compose, maintain, and improve upon.

The definition acts as a blueprint for the rest of the ecosystem to reliably extend its abstraction from. Think NPM or chainlist, but for node orchestration. It is a place where you can find, share, and improve upon the definitions of the blockchain nodes you want to run. Pull them in to test, deploy, and scale your blockchain application.
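Each definition in this article identifies its network with a `caip2` field, such as `eip155:31337` or `bip122:000000000019d6689c085ae165831e93`. A CAIP-2 chain ID has the form `namespace:reference`. The sketch below validates and splits one using the character classes from the CAIP-2 specification; the `parseCaip2` helper is a hypothetical name, not part of Chainfile.

```typescript
interface ChainId {
  namespace: string; // e.g. "eip155" or "bip122"
  reference: string; // e.g. "31337" or a truncated genesis block hash
}

// CAIP-2 grammar: namespace is 3-8 chars of [-a-z0-9], reference is 1-32 chars of [-_a-zA-Z0-9].
const CAIP2_PATTERN = /^([-a-z0-9]{3,8}):([-_a-zA-Z0-9]{1,32})$/;

// Hypothetical helper: split a CAIP-2 chain ID, rejecting malformed input.
function parseCaip2(value: string): ChainId {
  const match = CAIP2_PATTERN.exec(value);
  if (match === null) {
    throw new Error(`Not a valid CAIP-2 chain ID: ${value}`);
  }
  return { namespace: match[1], reference: match[2] };
}

const hardhatChain = parseCaip2("eip155:31337");
const bitcoinChain = parseCaip2("bip122:000000000019d6689c085ae165831e93");
```

A protocol-neutral identifier like this is what lets one registry hold definitions for unrelated networks side by side.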

Below is an example of a bare-bones definition featuring a Hardhat node for local orchestration, which is useful for running integration tests.

{
  "$schema": "https://chainfile.org/schema.json",
  "id": "eip155:31337/hardhat",
  "caip2": "eip155:31337",
  "name": "Hardhat",
  "containers": {
    "hardhat": {
      "image": "ghcr.io/fuxingloh/hardhat-container:2.22.1",
      "source": "https://github.com/fuxingloh/hardhat-container",
      "resources": {
        "cpu": 0.25,
        "memory": 256
      },
      "endpoints": {
        "rpc": {
          "port": 8545,
          "protocol": "HTTP JSON-RPC 2.0",
          "probes": {
            "readiness": {
              "params": [],
              "method": "eth_blockNumber",
              "match": {
                "result": {
                  "type": "string"
                }
              }
            }
          }
        }
      }
    }
  }
}
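The `readiness` probe above pairs an RPC call with a `match` block that looks like a small JSON Schema fragment. As a minimal sketch of how such a probe could be evaluated, the matcher below supports only the keywords that appear in this article (`type`, `properties`, `required`). It assumes that each key under `match` names a field of the RPC response; Chainfile itself may well use a full JSON Schema validator instead.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Subset of JSON Schema keywords used by the probes in this article.
interface MatchSchema {
  type?: "string" | "number" | "boolean" | "object" | "array" | "null";
  properties?: Record<string, MatchSchema>;
  required?: string[];
}

function typeOf(value: Json): string {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  return typeof value;
}

// Check a single value against the schema subset.
function matches(value: Json | undefined, schema: MatchSchema): boolean {
  if (value === undefined) return false;
  if (schema.type !== undefined && typeOf(value) !== schema.type) return false;
  if (typeOf(value) === "object") {
    const record = value as { [key: string]: Json };
    for (const key of schema.required ?? []) {
      if (!(key in record)) return false;
    }
    for (const [key, child] of Object.entries(schema.properties ?? {})) {
      if (key in record && !matches(record[key], child)) return false;
    }
  }
  return true;
}

// Assumption: the node is ready once every field named in `match` satisfies its schema.
function isReady(response: { [key: string]: Json }, match: Record<string, MatchSchema>): boolean {
  return Object.entries(match).every(([field, schema]) => matches(response[field], schema));
}

const hardhatMatch: Record<string, MatchSchema> = { result: { type: "string" } };
const hardhatReady = isReady({ jsonrpc: "2.0", id: 1, result: "0x0" }, hardhatMatch);
```

An orchestrator would poll the endpoint with the probe's `method` and `params` until `isReady` returns true, and an error response without a `result` field simply counts as not ready yet.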

Here is a more complex definition for Bitcoin Mainnet using bitcoind. Examples of multi-containers can be found in the Chainfile repository.

{
  "$schema": "https://chainfile.org/schema.json",
  "caip2": "bip122:000000000019d6689c085ae165831e93",
  "name": "Bitcoin Mainnet",
  "params": {
    "rpc_user": {
      "description": "Username for Basic HTTP authentication to the RPC server.",
      "secret": true,
      "default": {
        "random": {
          "bytes": 16,
          "encoding": "hex"
        }
      }
    },
    "rpc_password": {
      "description": "Password for Basic HTTP authentication to the RPC server.",
      "secret": true,
      "default": {
        "random": {
          "bytes": 16,
          "encoding": "hex"
        }
      }
    }
  },
  "volumes": {
    "data": {
      "type": "persistent",
      "size": "600Gi",
      "expansion": {
        "startFrom": "2024-01-01",
        "monthlyRate": "20Gi"
      }
    }
  },
  "containers": {
    "bitcoind": {
      "image": "docker.io/kylemanna/bitcoind",
      "tag": "latest",
      "source": "https://github.com/kylemanna/docker-bitcoind",
      "endpoints": {
        "p2p": {
          "port": 8333
        },
        "rpc": {
          "port": 8332,
          "protocol": "HTTP JSON-RPC 2.0",
          "authorization": {
            "type": "HttpBasic",
            "username": {
              "$param": "rpc_user"
            },
            "password": {
              "$param": "rpc_password"
            }
          },
          "probes": {
            "readiness": {
              "method": "getblockchaininfo",
              "params": [],
              "match": {
                "result": {
                  "type": "object",
                  "properties": {
                    "blocks": {
                      "type": "number"
                    }
                  },
                  "required": ["blocks"]
                }
              }
            }
          }
        }
      },
      "resources": {
        "cpu": 1,
        "memory": 2048
      },
      "environment": {
        "DISABLEWALLET": "1",
        "RPCUSER": {
          "$param": "rpc_user"
        },
        "RPCPASSWORD": {
          "$param": "rpc_password"
        }
      },
      "mounts": [
        {
          "volume": "data",
          "mountPath": "/bitcoin/.bitcoin"
        }
      ]
    }
  }
}
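The Bitcoin definition declares `params` with random defaults and then refers to them through `{"$param": ...}` placeholders. The sketch below shows one way that resolution could work: generate a value for each param, then substitute the placeholders in `environment`. The function names and the override mechanism are assumptions, not Chainfile's API, and the source of randomness is injected so that callers can supply a cryptographically secure one.

```typescript
interface ParamDef {
  description?: string;
  secret?: boolean;
  default?: { random: { bytes: number; encoding: "hex" } };
}

type EnvValue = string | { $param: string };

// Injected randomness: should be cryptographically secure in real use.
type RandomBytes = (length: number) => Uint8Array;

function toHex(bytes: Uint8Array): string {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

// Hypothetical: pick an explicit override, else generate the random default.
function resolveParams(
  params: Record<string, ParamDef>,
  overrides: Record<string, string>,
  random: RandomBytes,
): Record<string, string> {
  const values: Record<string, string> = {};
  for (const [name, def] of Object.entries(params)) {
    const override = overrides[name];
    if (override !== undefined) {
      values[name] = override;
    } else if (def.default !== undefined) {
      values[name] = toHex(random(def.default.random.bytes));
    } else {
      throw new Error(`Missing value for param: ${name}`);
    }
  }
  return values;
}

// Replace every {"$param": name} placeholder with its resolved value.
function resolveEnvironment(
  environment: Record<string, EnvValue>,
  values: Record<string, string>,
): Record<string, string> {
  const resolved: Record<string, string> = {};
  for (const [key, value] of Object.entries(environment)) {
    if (typeof value === "string") {
      resolved[key] = value;
      continue;
    }
    const param = values[value.$param];
    if (param === undefined) throw new Error(`Unknown param: ${value.$param}`);
    resolved[key] = param;
  }
  return resolved;
}
```

In Node.js, `randomBytes` from `node:crypto` can be passed directly as the `random` argument, since the `Buffer` it returns is a `Uint8Array`. Resolving each param once and reusing the value matters here: the RPC endpoint's `authorization` block and the container's `RPCUSER`/`RPCPASSWORD` variables must end up with the same credentials.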

Testing with Chainfile

In my opinion, the most critical missing piece in the blockchain ecosystem is the ability to test. The inherent complexity of blockchain systems, combined with cryptography and consensus mechanisms, makes setting up a test environment a significant challenge. Coordinating forks, syncs, reverts, and state changes across multiple nodes is a nightmare. Bundle that with cross-protocol contracts, and you have a convenient excuse not to test at all.

Chainfile shines a light on this dark corner of the blockchain ecosystem. Regardless of the protocol, you have a familiar interface to download testing nodes (localhost, 1337, regtest) for your integration testing. This gives us a standardized tool to expand reliably without having to understand the underlying intricacies of the protocol.

With this lowered barrier to simulating the network and contract interactions, testing becomes less of a chore and more of a developer’s first step to understanding how things work and integrate. Education is the key to adoption; Chainfile makes it easy to debug, introspect, and repeatably run your scenarios in a controlled environment. This immediate feedback from tests enables newcomers to quickly understand, identify, and resolve issues before going into production. That is essential in a domain where errors are irreversible and can have significant financial implications.

Testcontainers with Hardhat Localhost:

import definition from '@chainfile/hardhat/localhost.json';

it('should rpc(eth_blockNumber)', async () => {
  const testcontainers = await ChainfileTestcontainers.start(definition);
  const hardhat = testcontainers.getContainer('hardhat');
  const response = await hardhat.rpc({ method: 'eth_blockNumber' });

  expect(await response.json()).toMatchObject({
    result: '0x0',
  });
});

Testcontainers with Solana Test Validator:

import definition from '@chainfile/solana/solana-test-validator.json';

it('should rpc(getBlockHeight)', async () => {
  const testcontainers = await ChainfileTestcontainers.start(definition);
  const validator = testcontainers.getContainer('solana-test-validator');
  const response = await validator.rpc({ method: 'getBlockHeight' });

  expect(await response.json()).toMatchObject({
    result: 0,
  });
});

Testcontainers with Ethereum Testnet:

Run your tests against Ethereum Testnet with a mounted volume for faster sync.

import testnet from '@chainfile/erigon/testnet.json';

it('should rpc(eth_blockNumber)', async () => {
  const testcontainers = await ChainfileTestcontainers.start(testnet);
  const erigon = testcontainers.getContainer('erigon');
  // Ethereum clients expose eth_blockNumber; getblockcount is a Bitcoin RPC method.
  const response = await erigon.rpc({ method: 'eth_blockNumber' });

  expect(await response.json()).toMatchObject({
    // The block number comes back as a hex-encoded string, e.g. '0x0'.
    result: expect.any(String),
  });
});

Testcontainers with Bitcoin Mainnet:

You can even run mainnet nodes for integration testing, but why would you?

import mainnet from '@chainfile/bitcoincore/mainnet.json';

it('should rpc(getblockcount)', async () => {
  const testcontainers = await ChainfileTestcontainers.start(mainnet);
  const bitcoind = testcontainers.getContainer('bitcoind');
  const response = await bitcoind.rpc({ method: 'getblockcount' });

  expect(await response.json()).toMatchObject({
    result: 0,
  });
});

You can even test a Hashed Timelock Contract (HTLC) between Bitcoin and Ethereum, which is incredibly hard to model and test due to multiple factors such as block time, network latency, and game theory.

import bitcoinDefinition from '@chainfile/bitcoincore/regtest.json';
import hardhatDefinition from '@chainfile/hardhat/localhost.json';

it('should htlc between bitcoin and ethereum', async () => {
  // The imports are named *Definition so the container handles below do not shadow them.
  const btc = await ChainfileTestcontainers.start(bitcoinDefinition);
  const eth = await ChainfileTestcontainers.start(hardhatDefinition);
  const bitcoind = btc.getContainer('bitcoind');
  const hardhat = eth.getContainer('hardhat');
  // Test HTLC...
});

Why is it so important to test this? Because coordinating between trustless systems is another level of complexity altogether. Take this simple scenario I created:

  1. Jane: Set up a dApp on Ethereum that allows users to perform an atomic swap of ETH to BTC without a third party.
  2. John: Create an offer on ETH using the smart contract created by Jane (ETH -> BTC swap) - eth_sendTransaction
  3. Alex: Accept the offer on ETH, locking up John's ETH - eth_sendTransaction
  4. Alex: Submit the HTLC on BTC that expires in 12 blocks - btc_sendtoaddress
  5. John: Claim the BTC, thereby revealing the secret Alex needs to claim the ETH - btc_sendrawtransaction
  6. Alex: Claim the ETH using the secret revealed by the claim transaction - eth_sendTransaction

What if?

  • John doesn't claim the BTC? Can Alex claim his BTC back?
  • Alex doesn't claim the ETH? Can John claim his ETH back?
  • John claims the BTC, revealing the secret, but Alex is hit by a power outage.
  • John claims the BTC, but Alex is hit by a DDoS attack, making it incredibly difficult to claim the ETH.
  • What if you can mock all these scenarios in a test environment?
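The what-if cases above can be explored with a small in-memory model before any real node is involved. The sketch below is a deliberately simplified hashlock-plus-timelock state machine, not Chainfile code and not a real Bitcoin script or Solidity contract. The `Htlc` class, the block heights, and the secret are all made up for illustration.

```typescript
import { createHash } from "node:crypto";

const sha256 = (data: string): string => createHash("sha256").update(data).digest("hex");

type HtlcState = "locked" | "claimed" | "refunded";

// Simplified HTLC: funds unlock with the secret before expiry, or refund after it.
class Htlc {
  state: HtlcState = "locked";
  readonly hashlock: string;
  readonly expiryHeight: number;

  constructor(hashlock: string, expiryHeight: number) {
    this.hashlock = hashlock;
    this.expiryHeight = expiryHeight;
  }

  // The recipient claims by revealing the preimage of the hashlock.
  // Simplification: claims are rejected after expiry, whereas on Bitcoin a
  // late claim and a refund can race once the timelock has passed.
  claim(secret: string, height: number): void {
    if (this.state !== "locked") throw new Error(`HTLC already ${this.state}`);
    if (height >= this.expiryHeight) throw new Error("HTLC expired");
    if (sha256(secret) !== this.hashlock) throw new Error("Wrong secret");
    this.state = "claimed";
  }

  // The sender takes the funds back once the timelock has passed.
  refund(height: number): void {
    if (this.state !== "locked") throw new Error(`HTLC already ${this.state}`);
    if (height < this.expiryHeight) throw new Error("Timelock not reached");
    this.state = "refunded";
  }
}

const swapSecret = "john-secret"; // known only to John at first
const swapHashlock = sha256(swapSecret);

// Happy path: John claims the BTC, then Alex reuses the revealed secret on ETH.
const btcHtlc = new Htlc(swapHashlock, 112); // expires 12 blocks after height 100
const ethLock = new Htlc(swapHashlock, 500); // arbitrary expiry on the other chain
btcHtlc.claim(swapSecret, 105);
ethLock.claim(swapSecret, 300);

// What if John never claims? Alex refunds his BTC once the timelock passes.
const abandonedHtlc = new Htlc(swapHashlock, 112);
abandonedHtlc.refund(112);
```

The power-outage and DDoS cases amount to asking what happens when `ethLock.claim` is only attempted at a height at or beyond its expiry. Running the same scenarios against real regtest and Hardhat containers then checks whether the actual contracts behave the way this model says they should.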

How do you get started with Chainfile?

While Chainfile is still in its early days, the 4-step plan is already in place. Get started by checking out the repository at github.com/fuxingloh/chainfile and taking it for a spin.