How to Access Pre-Chain Data
Because public blockchains are just that, public, anyone can scroll back in time and find every transaction ever made. This is typically done with a block explorer, or you can spin up your own node, download the entire history of the blockchain’s public ledger, and eventually start processing new transactions yourself. Public blockchains (and their transaction history) are accessible to every network participant.
This, however, is NOT true for pre-chain data flowing through the pending transaction pools, also known as the mempool. These pre-chain events include key data points such as how long a transaction has been pending and whether it has been rejected, evicted, sped up, canceled, failed, dropped, stuck, or confirmed.
The pre-chain layer consists of ephemeral data that is only temporarily held by a node until the transaction is either confirmed on-chain or evicted from your view. Once evicted, any record of that data ever existing is gone. The only way to access this data again is if it has been archived.
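To make this ephemerality concrete, here is a minimal Python sketch. It is a toy model, not Blocknative's infrastructure or any real client's eviction policy. It assumes a pool with a fixed capacity that evicts its lowest-gas-price transaction when full; `ToyMempool`, `capacity`, and `archive` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMempool:
    """Toy pending pool (hypothetical): holds at most `capacity` transactions."""
    capacity: int
    pending: dict = field(default_factory=dict)  # tx hash -> gas price
    archive: list = field(default_factory=list)  # append-only event log

    def add(self, tx_hash: str, gas_price: int) -> None:
        self.pending[tx_hash] = gas_price
        self.archive.append((tx_hash, "pending"))
        if len(self.pending) > self.capacity:
            # Assumed policy: evict the cheapest transaction when over capacity.
            evicted = min(self.pending, key=self.pending.get)
            del self.pending[evicted]
            self.archive.append((evicted, "evicted"))


pool = ToyMempool(capacity=2)
pool.add("0xaaa", gas_price=30)
pool.add("0xbbb", gas_price=10)
pool.add("0xccc", gas_price=50)  # pushes the pool over capacity

# The node's live view no longer knows about 0xbbb; only the archive does.
still_in_pool = "0xbbb" in pool.pending
still_in_archive = ("0xbbb", "evicted") in pool.archive
```

Without the `archive` list, nothing in `pool.pending` would show that `0xbbb` was ever seen.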
At Blocknative, we recognized that this fleeting pre-chain data has a lot of rich, beneficial information that could be harnessed by both builders and traders. For this reason, our global pre-chain streaming infrastructure has been recording and archiving mempool data for several years. In this blog, we will cover what a Pre-Chain Archive is, how to access it, and how traders and Web3 developers can harness the data for backtesting their strategies, training their models, or improving their protocols.
Many Pre-Chain Transactions Do Not Make It On-Chain
Amazingly, many transactions that appear pre-chain do not actually make it on-chain. For example, in March 2022, Blocknative found that roughly 16% of transactions sent to Ethereum's pending transaction pools (nearly 1 out of every 6 transactions) did not make it on-chain. In the following month, April 2022, even with the drop in trading volume, about 12% of transactions (a little less than 1 out of every 8 transactions) still did not make it on-chain!
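The arithmetic behind these "1 out of every N" figures can be sketched in a few lines of Python. The transaction hashes below are made-up toy data, not Blocknative's numbers, and the function names are hypothetical; only the 16% and 12% rates come from the figures above.

```python
def drop_rate(seen_pre_chain: set, confirmed_on_chain: set) -> float:
    """Share of transactions seen pre-chain that never confirmed on-chain."""
    never_confirmed = seen_pre_chain - confirmed_on_chain
    return len(never_confirmed) / len(seen_pre_chain)


def one_in_n(rate: float) -> float:
    """Convert a fractional rate into the N of '1 out of every N'."""
    return 1 / rate


# Toy data: 25 transactions seen pre-chain, 21 of them confirmed.
seen = {f"0x{i:02x}" for i in range(25)}
confirmed = {f"0x{i:02x}" for i in range(21)}
toy_rate = drop_rate(seen, confirmed)  # 4 of 25 never confirmed

march_n = one_in_n(0.16)  # about 6.25, so nearly 1 in 6
april_n = one_in_n(0.12)  # about 8.33, so a little less than 1 in 8
```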
There are many reasons a transaction may not make it on-chain. These range from issues with a sender’s gas price, nonce value, or ETH balance, to a speed-up or cancel transaction replacing the original, or simply overall mempool congestion. In isolation, these events seem innocent enough, but when you dig in and look for patterns of activity, you discover a whole world constantly battling for blockspace.
The pre-chain is often a highly adversarial period in time, and how the battles within it are won and lost plays a critical role in both the development and testing of dynamic trading strategies and the improvement of network security. However, because the pre-chain activity itself (rejections, evictions, replacements, and drops) is never recorded on-chain, most of the key data related to these battles is inaccessible to those just looking at public, on-chain data.
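As an illustration of the speed-up and cancel cases, the Python sketch below classifies a candidate transaction against an original. It relies on a common Ethereum convention: a replacement reuses the sender and nonce of the original with a higher gas price, and a cancel is a zero-value transfer back to the sender. The `PendingTx` fields and the `classify_replacement` function are hypothetical names, and the sketch ignores the minimum price bump that real nodes typically require.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PendingTx:
    """Hypothetical minimal view of a pending transaction."""
    sender: str
    to: str
    nonce: int
    value_wei: int
    gas_price_wei: int


def classify_replacement(original: PendingTx, candidate: PendingTx) -> Optional[str]:
    """Return 'cancel', 'speedup', or None if candidate does not replace original."""
    same_slot = (candidate.sender == original.sender
                 and candidate.nonce == original.nonce)
    if not same_slot or candidate.gas_price_wei <= original.gas_price_wei:
        return None
    # Assumed convention: a cancel is a zero-value transfer back to the sender.
    if candidate.to == candidate.sender and candidate.value_wei == 0:
        return "cancel"
    return "speedup"


original = PendingTx("0xalice", "0xdex", nonce=7, value_wei=10**18, gas_price_wei=20)
speedup = PendingTx("0xalice", "0xdex", nonce=7, value_wei=10**18, gas_price_wei=40)
cancel = PendingTx("0xalice", "0xalice", nonce=7, value_wei=0, gas_price_wei=40)
unrelated = PendingTx("0xalice", "0xdex", nonce=8, value_wei=10**18, gas_price_wei=40)

labels = [classify_replacement(original, tx) for tx in (speedup, cancel, unrelated)]
```

In either replacement case, the original transaction never lands on-chain, so only a pre-chain record shows that it existed.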
Using Archived Pre-Chain Data: the Blocknative Advantage
Since 2019, Blocknative has collected years' worth of data on Ethereum mainnet, equating to over 100 billion rows of events, along with over a year’s worth of data on BNB Chain, equating to 16+ billion rows of events. In late 2021, we started archiving data on both Polygon and Fantom, which collectively already have over 5 billion rows of events.
To ensure we see and monitor transactions from inception through to confirmation (or failure), Blocknative runs a robust infrastructure stack. On Ethereum, this includes numerous client types, including Geth, OpenEthereum, and Nethermind nodes in a variety of configurations.
Additionally, we run both stock and custom implementations to get the most comprehensive view and finest resolution of the pre-chain layer possible. This allows us to capture a variety of different event types and data points in our archive, including all pending, rejection, eviction, speedup, cancel, failed, dropped, stuck, and confirmed events.
Our infrastructure is globally distributed and includes regional data on where and when each event was first seen. Finally, our timestamps are at millisecond resolution and synced via the atomic clock service from our cloud provider (a significant improvement over the Network Time Protocol).
Access Our Pre-Chain Archive Today
Using our Pre-Chain Archive data, we at Blocknative have worked with builders and traders to better understand the inner workings of the dark forest.
- Builders can analyze their own protocol to understand gas prices, time to confirmation, time pending, failed transactions, and MEV-related transactions, or replay attacks and adversarial conditions to test infrastructure upgrades.
- Researchers can do a variety of deep dives on topics like understanding the intricacies of the gas market, analyzing the P2P network, or uncovering the motivations of different bot activities.
- And traders can analyze user behavior during periods of high volatility, create and discover novel long-tail MEV strategies, or implement machine learning models for their trading strategies.
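As one example of the analyses listed above, time to confirmation could be computed from archived events along the following lines. This is a Python sketch under stated assumptions: the row layout `(tx_hash, status, timestamp_ms)` and the sample rows are hypothetical and do not reflect the actual archive schema. Only the status names (pending, confirmed, dropped) and the millisecond timestamp resolution come from the description above.

```python
from statistics import median

# Hypothetical archive rows: (tx_hash, status, timestamp_ms).
events = [
    ("0xaaa", "pending", 1_000),
    ("0xaaa", "confirmed", 13_500),
    ("0xbbb", "pending", 2_000),
    ("0xbbb", "dropped", 9_000),
    ("0xccc", "pending", 4_000),
    ("0xccc", "confirmed", 10_000),
]


def time_to_confirmation_ms(rows):
    """Map each confirmed tx hash to (confirmation time - first pending time)."""
    first_seen = {}
    confirmed_at = {}
    for tx_hash, status, ts in rows:
        if status == "pending":
            # Keep the earliest sighting if a tx was seen more than once.
            first_seen[tx_hash] = min(ts, first_seen.get(tx_hash, ts))
        elif status == "confirmed":
            confirmed_at[tx_hash] = ts
    return {h: confirmed_at[h] - first_seen[h]
            for h in confirmed_at if h in first_seen}


durations = time_to_confirmation_ms(events)
median_ms = median(durations.values())
```

The dropped transaction `0xbbb` is absent from `durations`, which is the kind of case that on-chain data alone cannot reveal.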
To see the full schema for our data set, you can download the PDF here.
If you want to learn more about how you can use our Pre-Chain Archive data set, please reach out to our sales team and we will be happy to assist you.