Geth Source Code Series: Transaction Design and Implementation

Author: Ray

This article is the fourth in the Geth source code analysis series. It goes beneath the surface workflow and examines every stage of the transaction lifecycle: signature verification; the dual-pool architecture of the transaction pool (the design essentials of BlobPool and LegacyPool); the dynamic fee mechanism (EIP-1559) and the load balancing it enables (the elastic adjustment of the BaseFee); the P2P propagation optimizations used for broadcasting and packaging; and the underlying state changes during EVM execution (dirty-state tracking in StateDB).

We will analyze the key implementations in the source code layer by layer, combining the latest features of the Pectra upgrade (such as SetCodeTxType transactions):

How does the TxData interface unify five types of transactions?

How does LazyTransaction in the transaction pool balance memory safety and efficiency?

How do the consensus layer and execution layer collaboratively build blocks through engineAPI?

Why do local transactions enjoy packaging priority?

Whether you want to understand the eviction mechanism of the transaction pool against DDoS attacks or explore how ApplyTransaction drives the state machine, this article will provide you with a complete framework for a source-level perspective on Ethereum's transaction engine.

01 Introduction to Transactions

The Ethereum execution layer can be seen as a transaction-driven state machine, where transactions are the only way to modify the state. Transactions can only be initiated by EOAs and will include a signature from the private key. After execution, the state of the Ethereum network is updated. The simplest transaction in the Ethereum network is the transfer of ETH from one account to another.

It should be noted that as EOAs can support contract capabilities through the EIP-7702 upgrade, the concepts of contract accounts and EOAs will gradually blur. However, in the current version, it is still considered that only EOAs controlled by private keys can initiate transactions.

Ethereum supports different types of transactions. When the Ethereum mainnet was first launched, it supported only one type. As Ethereum continued to upgrade, further types were gradually added. The mainstream type today is the dynamic-fee transaction introduced by EIP-1559, which is what most users submit. EIP-4844 introduced a cheaper data storage solution for Layer 2 and other off-chain scaling solutions, while the latest Pectra upgrade introduced, through EIP-7702, a transaction form that can extend EOAs into contracts.

As Ethereum develops, it may support other transaction types in the future, but the overall processing flow of transactions will not change significantly; they will still need to go through the process of transaction submission → transaction verification → entering the transaction pool → transaction propagation → packaging into blocks.

02 Evolution of Transaction Structure

Since the launch of the Ethereum mainnet, the transaction structure has undergone four major changes, laying a foundation for security and scalability, allowing Ethereum to add transaction types at a low cost.

Preventing Cross-Chain Replay Attacks

The initial transaction structure is shown below; the transaction fields are placed side by side in a single RLP-encoded list:

  • RLP([nonce, gasPrice, gasLimit, to, value, data, v, r, s])

The biggest problem with this structure is that it is not associated with a chain; transactions generated on the mainnet could be executed on other chains at will. Therefore, EIP-155 embedded the chainId (e.g., mainnet ID=1) in the signature v value to isolate transactions across different chains, ensuring that transactions on each chain cannot be replayed on other chains.
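The way EIP-155 folds the chain ID into v can be illustrated with a few lines of Go. This is a minimal sketch of the arithmetic only (v = chainId × 2 + 35 + recovery bit); the function names are made up for this example and are not Geth's.

```go
package main

import "fmt"

// eip155V returns the signature v value defined by EIP-155:
// v = chainID*2 + 35 + recoveryID, where recoveryID is 0 or 1.
func eip155V(chainID, recoveryID uint64) uint64 {
	return chainID*2 + 35 + recoveryID
}

// chainIDFromV recovers the chain ID from v. Values below 35
// (the pre-EIP-155 values 27 and 28 in particular) carry no chain ID,
// which is reported as ok == false.
func chainIDFromV(v uint64) (chainID uint64, ok bool) {
	if v < 35 {
		return 0, false
	}
	return (v - 35) / 2, true
}

func main() {
	v := eip155V(1, 0) // mainnet, recovery bit 0
	id, ok := chainIDFromV(v)
	fmt.Println(v, id, ok)

	_, ok = chainIDFromV(27) // unprotected legacy signature
	fmt.Println(ok)
}
```

Because the chain ID is part of the signed data, a signature produced for one chain does not verify against the same payload on a chain with a different ID.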

Involved EIP:

  • EIP-155

Standardization of Transaction Expansion

As Ethereum developed, the initial transaction format could no longer meet the needs of certain scenarios, necessitating the addition of new transaction types. However, if transaction types are added arbitrarily, it may lead to management complexity and a lack of standardization. EIP-2718 defined the format for subsequent transactions, primarily defining the TransactionType || TransactionPayload structure, where:

  • TransactionType defines the class of transaction types, allowing for up to 128 types, which is sufficient for new transaction types.

  • TransactionPayload defines the data format of the transaction, currently using RLP for data encoding, but it may be upgraded to SSZ or other encodings in the future.
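The TransactionType || TransactionPayload envelope can be told apart from a legacy transaction by looking at the first byte alone. The sketch below assumes only what EIP-2718 specifies (a legacy transaction is an RLP list and therefore starts at 0xc0 or above, while a type byte lies in 0x00..0x7f); splitEnvelope is a hypothetical helper, not a Geth function.

```go
package main

import (
	"errors"
	"fmt"
)

// splitEnvelope inspects the first byte of an encoded transaction.
// A first byte >= 0xc0 starts an RLP list, i.e. a legacy transaction,
// and the whole input is returned as the payload. A first byte in
// 0x00..0x7f is a transaction type followed by the typed payload.
func splitEnvelope(enc []byte) (typed bool, txType byte, payload []byte, err error) {
	if len(enc) == 0 {
		return false, 0, nil, errors.New("empty transaction")
	}
	switch b := enc[0]; {
	case b >= 0xc0:
		return false, 0, enc, nil
	case b <= 0x7f:
		return true, b, enc[1:], nil
	default:
		return false, 0, nil, errors.New("first byte is neither a type nor an RLP list prefix")
	}
}

func main() {
	typed, txType, payload, err := splitEnvelope([]byte{0x02, 0xc0})
	fmt.Println(typed, txType, len(payload), err)

	typed, _, payload, err = splitEnvelope([]byte{0xf8, 0x01})
	fmt.Println(typed, len(payload), err)
}
```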

This upgrade was completed in the Berlin upgrade. In addition to EIP-2718, EIP-2930 introduced the Access List transaction type, which allows users to pre-declare the contracts and storage they need to access in the transaction, reducing gas consumption during transaction execution.

Involved EIPs:

  • EIP-2718

  • EIP-2930

Transformation of Ethereum's Economic Model

In the London upgrade, EIP-1559 introduced the Base Fee mechanism, which burns part of every fee, slowing the net growth of the ETH supply and, when enough fees are burned, making it deflationary. Nodes participating in staking can also earn additional income through tips (maxPriorityFeePerGas). EIP-1559 transactions inherit the Access List mechanism and are currently the most common transaction type. Furthermore, with the Paris upgrade (the Merge), Ethereum transitioned from PoW to PoS, rendering the previous mining economic model obsolete and beginning the staking era.

Additionally, EIP-1559 introduced a target mechanism that can dynamically adjust the Base Fee, effectively introducing load balancing capabilities to Ethereum. The target value is half of the block Gas Limit; if it exceeds this value, the Base Fee will continue to rise, allowing many transactions to avoid congestion periods, thereby alleviating overall chain congestion and improving user experience.

Involved EIP:

  • EIP-1559

Addition of Various Extended Transactions

After EIP-2718 and EIP-1559 defined the standards for extended transactions and the economic model, new transaction types have been gradually added. In the last two upgrades, EIP-4844 and EIP-7702 were introduced. The former added the Blob transaction type, providing an ideal storage solution for off-chain scaling, with large space and low cost, and also features an economic model and load mechanism similar to EIP-1559. EIP-7702 allows EOAs to be transformed into smart contract accounts controlled by private keys, preparing for the large-scale adoption of account abstraction in the future.

Involved EIPs:

  • EIP-4844

  • EIP-7702

03 Transaction Module Architecture

As the input to the Ethereum state machine, almost all main processes revolve around transactions. Before entering the transaction pool, transactions need to be verified for format and signature information. After entering the transaction pool, they need to be propagated among different nodes, then selected by block-producing nodes from the transaction pool, executed on the EVM, and modify the state database, finally being packaged into blocks and transmitted between the execution layer and the consensus layer.

For block-producing nodes and non-block-producing nodes, the transaction processing flow will differ slightly. Block-producing nodes are responsible for selecting transactions from the transaction pool, packaging them into blocks, and updating the local state database. Non-block-producing nodes only need to re-execute the transactions in the latest synchronized block to update their local state.

Types of Transactions

Currently, Ethereum supports a total of five types of transactions. The main structures of these transactions are similar, and different transactions are distinguished by the transaction type field, with specific purposes achieved through some extended fields.

  • LegacyTxType: The basic format inherited from the genesis block, using the first-price auction model (users manually set gasPrice). After the EIP-155 upgrade, it defaults to embedding chainId to prevent cross-chain replay attacks. Its usage on the Ethereum mainnet is now relatively low, and Ethereum currently remains compatible with this transaction type, which will gradually be phased out.

  • AccessListTxType: Pre-declares ("warms up") the addresses and storage slots a transaction will access, which can reduce gas costs. This feature has been inherited by subsequent transaction types, and transactions that use this type directly are relatively few.

  • DynamicFeeTxType: A transaction type that updates Ethereum's economic model, introducing Base Fee and target mechanisms, inheriting the Access List feature. This is currently the most mainstream transaction type.

  • BlobTxType: A transaction type specifically for off-chain scaling, allowing transactions to carry large amounts of low-cost data through a blob structure, reducing the cost of off-chain scaling solutions. It inherits the Access List and Dynamic Fee features, and the blob in the transaction has a separate billing mechanism similar to EIP-1559.

  • SetCodeTxType: Allows EOAs to convert into contract accounts (which can also revoke contract capabilities through transactions) and execute the corresponding contract code in the EOA, inheriting the Access List and Dynamic Fee features.

Transaction Lifecycle

Once a transaction is packaged into a block, it has completed the modification of state data, and it can be understood that the transaction's lifecycle has ended. During this process, a transaction will go through four phases:

  • Transaction Verification: Transactions submitted by EOAs undergo a series of basic verifications before being added to the transaction pool.

  • Transaction Broadcasting: Transactions newly submitted to the transaction pool are broadcast to the transaction pools of other nodes.

  • Transaction Execution: Block-producing nodes select transactions from the transaction pool for execution.

  • Transaction Packaging: Transactions are packaged into blocks in a certain order (first distinguishing whether they are local transactions, then sorted by gas fee size), ignoring those that fail verification.

Transaction Pool

The transaction pool is a temporary storage place for transactions. Before being packaged, transactions are stored in the transaction pool. Transactions in the transaction pool are synchronized to other nodes and also synchronize transactions from the transaction pools of other nodes. User-submitted transactions first enter the transaction pool, then trigger the consensus process through the consensus layer, driving transaction execution and packaging into blocks.

Currently, there are two types of implementations for the transaction pool:

  • Blob Transaction Pool (Blob TxPool)

  • Other Transaction Pool (Legacy TxPool)

Since the data carried by Blob transactions is processed differently from that of other transactions, a separate transaction pool is used for handling them. Although the types of other transactions are inconsistent, the synchronization and packaging processes between different nodes are fundamentally the same, so they are processed in the same transaction pool. Transactions in the pool are submitted by the owners of external EOAs, and there are two ways to submit transactions to the transaction pool:

  • SendTransaction

  • SendRawTransaction

With SendTransaction, the client sends an unsigned transaction object, and the node signs it using the private key it manages for the transaction's sender address. SendRawTransaction, on the other hand, requires the transaction to be signed in advance; the signed transaction is then submitted to the node. This method is more commonly used, and wallets such as MetaMask and Rabby rely on it.

Taking SendRawTransaction as an example, after the node starts, it will launch an API module to handle various external API requests, with SendRawTransaction being one of the APIs. The source code can be found in internal/ethapi/api.go:

    func (api *TransactionAPI) SendRawTransaction(ctx context.Context, input hexutil.Bytes) (common.Hash, error) {
        tx := new(types.Transaction)
        if err := tx.UnmarshalBinary(input); err != nil {
            return common.Hash{}, err
        }
        return SubmitTransaction(ctx, api.b, tx)
    }

04 Core Data Structures

For the transaction module, the core data structures consist of two parts: one part represents the transaction data structure itself, and the other part represents the transaction pool structure for temporarily storing transactions. Since transactions need to propagate between different nodes in the transaction pool, the implementation in the transaction pool relies on the underlying p2p protocol.

Transaction Structure

The Transaction type from core/types/transaction.go is used to uniformly represent all transaction types:

    type Transaction struct {
        inner TxData    // The actual transaction data is stored here
        time  time.Time
        //....
    }

TxData is an interface type that defines the accessor methods every transaction type must implement. Older types such as LegacyTxType do not have all of these fields, so their implementations either substitute an existing field (for example, gasPrice stands in for both fee caps) or return an empty value:

    type TxData interface {
        txType() byte // Transaction type
        copy() TxData // Create a deep copy of the transaction data

        chainID() *big.Int      // Chain ID, used to distinguish different Ethereum networks
        accessList() AccessList // Pre-declared access list, used to optimize gas consumption (introduced in EIP-2930)
        data() []byte           // Input data for the transaction, used for contract calls or creation
        gas() uint64            // Gas limit, the maximum gas that can be consumed by the transaction
        gasPrice() *big.Int     // Price per unit of gas (for Legacy transactions)
        gasTipCap() *big.Int    // Tip cap (for EIP-1559 transactions)
        gasFeeCap() *big.Int    // Total fee cap (for EIP-1559 transactions)
        value() *big.Int        // Amount of ETH sent in the transaction
        nonce() uint64          // Transaction sequence number, used to prevent replay attacks
        to() *common.Address    // Recipient address, nil if it's a contract creation

        rawSignatureValues() (v, r, s *big.Int)      // Raw signature values (v, r, s)
        setSignatureValues(chainID, v, r, s *big.Int) // Set signature values

        effectiveGasPrice(dst *big.Int, baseFee *big.Int) *big.Int // Calculate the actual gas price (considering baseFee)

        encode(*bytes.Buffer) error // Encode the transaction into a byte stream
        decode([]byte) error        // Decode the transaction from a byte stream

        sigHash(*big.Int) common.Hash // Hash of the transaction to be signed
    }

In addition to the details that each transaction must have, each newly added transaction has its own extension parts.

In Blob transactions:

  • BlobFeeCap: The maximum fee the sender is willing to pay per unit of blob gas, similar to maxFeePerGas, but used only for pricing blob data.

  • BlobHashes: An array storing the versioned hashes of the blobs' commitments, which are stored in the execution layer to prove the integrity and authenticity of the Blob data.

  • Sidecar: Contains the actual blob data and its proofs. This data is not stored in the execution layer; it is kept by the consensus layer for a period of time and is not included in the transaction's canonical encoding.

In SetCode transactions:

  • AuthList: A list of signed authorizations. Each entry is signed by an EOA and names the contract code that the account delegates to, which is how EOAs gain the capabilities of smart contracts.

All transaction types need to implement TxData, and the differentiated processing for each type of transaction will be implemented internally within the transaction type. The benefit of this interface-oriented approach is that it allows for the easy addition of new transaction types without modifying the current transaction processing flow.
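The interface-oriented design can be illustrated with a deliberately tiny stand-in. In the sketch below, feeData, legacyTx, dynamicFeeTx, maxGasCost and the two constructors are hypothetical names invented for this example; only the idea (a legacy transaction answers both fee-cap queries with its single gasPrice, so callers never branch on the type) reflects the design described above.

```go
package main

import (
	"fmt"
	"math/big"
)

// feeData is a hypothetical, trimmed-down stand-in for TxData.
type feeData interface {
	txType() byte
	gasTipCap() *big.Int
	gasFeeCap() *big.Int
}

// legacyTx has a single gasPrice, so both caps fall back to it.
type legacyTx struct{ GasPrice *big.Int }

func (tx *legacyTx) txType() byte        { return 0x00 }
func (tx *legacyTx) gasTipCap() *big.Int { return tx.GasPrice }
func (tx *legacyTx) gasFeeCap() *big.Int { return tx.GasPrice }

// dynamicFeeTx carries the two separate EIP-1559 caps.
type dynamicFeeTx struct{ GasTipCap, GasFeeCap *big.Int }

func (tx *dynamicFeeTx) txType() byte        { return 0x02 }
func (tx *dynamicFeeTx) gasTipCap() *big.Int { return tx.GasTipCap }
func (tx *dynamicFeeTx) gasFeeCap() *big.Int { return tx.GasFeeCap }

// maxGasCost works for any type through the interface: feeCap * gasLimit.
func maxGasCost(tx feeData, gasLimit uint64) *big.Int {
	return new(big.Int).Mul(tx.gasFeeCap(), new(big.Int).SetUint64(gasLimit))
}

// Hypothetical constructors taking plain integers (values in gwei).
func newLegacy(gasPrice int64) feeData { return &legacyTx{GasPrice: big.NewInt(gasPrice)} }
func newDynamic(tip, feeCap int64) feeData {
	return &dynamicFeeTx{GasTipCap: big.NewInt(tip), GasFeeCap: big.NewInt(feeCap)}
}

func main() {
	for _, tx := range []feeData{newLegacy(10), newDynamic(2, 30)} {
		fmt.Printf("type=%#x tip=%s maxCost=%s gwei\n",
			tx.txType(), tx.gasTipCap().String(), maxGasCost(tx, 21000).String())
	}
}
```

Adding a further transaction type in this sketch means adding one more struct with the three methods; maxGasCost and its callers stay untouched.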

Transaction Pool Structure

Similar to the transaction structure, the transaction pool also adopts the same design pattern, using TxPool from core/txpool/txpool.go to uniformly manage the transaction pool, where SubPool is an interface that each specific implementation of the transaction pool must fulfill:

    type TxPool struct {
        subpools []SubPool // Specific implementations of the transaction pool
        chain    BlockChain
        // ...
    }

    type LegacyPool struct {
        config      Config                      // Transaction pool parameter configuration
        chainconfig *params.ChainConfig         // Blockchain parameter configuration
        chain       BlockChain                  // Blockchain interface
        gasTip      atomic.Pointer[uint256.Int] // Current accepted minimum gas tip
        txFeed      event.Feed                  // Transaction event publishing and subscription system
        signer      types.Signer                // Transaction signature validator
        pending     map[common.Address]*list    // Currently processable transactions
        queue       map[common.Address]*list    // Temporarily unprocessable transactions
        //...
    }

    type BlobPool struct {
        config  Config                           // Transaction pool parameter configuration
        reserve txpool.AddressReserver           // Reserves addresses so an account is tracked by one subpool only
        store   billy.Database                   // Persistent data storage for transaction metadata and blob data
        stored  uint64                           // Size of the transaction data currently kept on disk
        limbo   *limbo                           // Persistent storage for blobs of not-yet-finalized transactions
        signer  types.Signer                     // Transaction signer used for sender recovery
        chain   BlockChain                       // Blockchain interface
        index   map[common.Address][]*blobTxMeta // Blob transactions grouped by account, sorted by nonce
        spent   map[common.Address]*uint256.Int  // Expenditure tracking per account
        //...
    }

Currently, the two transaction pools that implement SubPool are:

  • BlobPool: Used to manage Blob transactions.

  • LegacyPool: Used to manage all transactions other than Blob transactions.

The reason Blob transactions need to be managed separately from other transactions is that Blob transactions may carry a large amount of blob data, while other transactions can be managed and synchronized directly in memory. The blob data in Blob transactions requires persistent storage, so the same management approach as other transactions cannot be used.

05 Fee Mechanism

Because the EVM is Turing-complete, Ethereum cannot determine in advance whether a program will terminate (the halting problem), so it uses the Gas mechanism to bound execution and prevent certain malicious attacks. Gas also serves as the transaction fee paid by users. These were the two original purposes of Gas.

After years of development, Gas, in addition to the above two purposes, has become an important component of Ethereum's economic model, controlling the issuance of ETH and helping Ethereum achieve deflation. It can even dynamically adjust the traffic of the Ethereum network, enhancing user experience.

Ethereum's fee mechanism uses Gas to achieve various functions, including maintaining network security and balancing the economic model.

Gas

In Ethereum, every operation executed on the EVM during transaction processing requires the consumption of Gas, such as using memory, reading data, writing data, etc. Some operations consume more Gas than others; for example, transferring ETH requires 21,000 Gas. In each transaction, a maximum amount of Gas that can be consumed must be set. If the Gas runs out, the transaction execution ends, and the Gas consumed during this process is not refunded. This mechanism helps address the halting problem in Ethereum.

The size of blocks in Ethereum is also limited by Gas rather than by a byte size. The total Gas consumed by all transactions within a block cannot exceed the block's Gas limit. Gas itself is only a unit of measurement for EVM execution; the fee for the Gas a transaction consumes is paid in ETH. Gas prices are typically quoted in Gwei, where 1 Ether = 10^9 Gwei.

In the current Ethereum network, the size limit for a block is 36M Gas. However, there is significant community support for increasing the block Gas limit to 60M, which is considered a reasonable choice. This increase would enhance the network's capacity without compromising its security, and it is currently being tested on the testnet. Some in the community also argue that solely using Gas limits to control block size is unreasonable and that byte size limits should be introduced. These discussions are ongoing within the community.

EIP-1559

After the introduction of the EIP-1559 mechanism, the previous GasPrice was split into Base Fee and Priority Fee (maxPriorityFeePerGas). The Base Fee is entirely burned to control the growth rate of ETH in Ethereum, while the Priority Fee is given to the validators of the block. Users can set maxFeePerGas in their transactions to ensure that the final payment is capped.

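For a transaction to be included in a block, its maxFeePerGas must be at least the block's Base Fee; a transaction that does not meet this condition is invalid for that block and can only wait in the pool, without being charged. The tip actually paid is the smaller of maxPriorityFeePerGas and maxFeePerGas − Base Fee, so setting maxFeePerGas ≥ Base Fee + Priority Fee ensures the validator receives the full tip. The actual cost incurred by the user is (Base Fee + Priority Fee paid) × Gas Used, and any excess that was reserved up front is refunded to the address that initiated the transaction.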
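The relationship between maxFeePerGas, the Priority Fee and the Base Fee follows the rule in the EIP-1559 specification: the tip is capped by whatever room is left under maxFeePerGas once the Base Fee is covered. The Go sketch below shows only that rule; effectiveGasPrice here is a hypothetical standalone function using uint64 values in gwei, not Geth's method of the same name.

```go
package main

import (
	"errors"
	"fmt"
)

// effectiveGasPrice returns the per-gas price a sender pays under EIP-1559.
// The transaction is not includable when feeCap < baseFee. Otherwise the
// tip is min(tipCap, feeCap-baseFee) and the price is baseFee + tip.
func effectiveGasPrice(feeCap, tipCap, baseFee uint64) (uint64, error) {
	if feeCap < baseFee {
		return 0, errors.New("max fee per gas is below the base fee")
	}
	tip := tipCap
	if room := feeCap - baseFee; room < tip {
		tip = room
	}
	return baseFee + tip, nil
}

func main() {
	const gasUsed = 21000 // plain ETH transfer

	price, err := effectiveGasPrice(50, 2, 30) // values in gwei
	fmt.Println(price, price*gasUsed, err)

	price, err = effectiveGasPrice(31, 2, 30) // tip is squeezed to 1 gwei
	fmt.Println(price, err)

	_, err = effectiveGasPrice(29, 2, 30) // cannot be included at this base fee
	fmt.Println(err)
}
```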

The Base Fee is dynamically adjusted based on the actual Gas usage in the block. Half of the maximum Gas limit of the block is referred to as the target. If the actual usage of the previous block exceeds the target, the Base Fee for the current block will increase. Conversely, if the previous block's Gas usage is below the target, the Base Fee will decrease; otherwise, it remains unchanged.
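The adjustment rule can be written out in a few lines. The sketch below follows the formula in the EIP-1559 specification (target = gas limit / 2, and a maximum change of 1/8, i.e. 12.5%, per block); nextBaseFee and scaledDelta are names chosen for this example, and the function assumes a parent gas limit of at least 2 so that the target is non-zero.

```go
package main

import (
	"fmt"
	"math/big"
)

const (
	elasticityMultiplier     = 2 // target = gasLimit / 2
	baseFeeChangeDenominator = 8 // at most a 1/8 (12.5%) change per block
)

// scaledDelta computes baseFee * gasDelta / target / 8 with big.Int
// so the intermediate product cannot overflow.
func scaledDelta(baseFee, gasDelta, target uint64) uint64 {
	x := new(big.Int).Mul(new(big.Int).SetUint64(baseFee), new(big.Int).SetUint64(gasDelta))
	x.Div(x, new(big.Int).SetUint64(target))
	x.Div(x, big.NewInt(baseFeeChangeDenominator))
	return x.Uint64()
}

// nextBaseFee derives the base fee (in wei) of the next block from its parent.
// Assumes parentGasLimit >= 2.
func nextBaseFee(parentBaseFee, parentGasUsed, parentGasLimit uint64) uint64 {
	target := parentGasLimit / elasticityMultiplier
	switch {
	case parentGasUsed == target:
		return parentBaseFee
	case parentGasUsed > target:
		delta := scaledDelta(parentBaseFee, parentGasUsed-target, target)
		if delta == 0 {
			delta = 1 // an increase is always at least 1 wei
		}
		return parentBaseFee + delta
	default:
		return parentBaseFee - scaledDelta(parentBaseFee, target-parentGasUsed, target)
	}
}

func main() {
	const gwei = 1_000_000_000
	fmt.Println(nextBaseFee(100*gwei, 30_000_000, 30_000_000)) // full block
	fmt.Println(nextBaseFee(100*gwei, 15_000_000, 30_000_000)) // exactly on target
	fmt.Println(nextBaseFee(100*gwei, 0, 30_000_000))          // empty block
}
```

A completely full block raises the base fee by 12.5% and an empty block lowers it by 12.5%, which is what pushes sustained demand back towards the target.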

Blob Transaction Fee Mechanism

The fee settlement for Blob transactions is divided into two parts: the execution gas is priced with the EIP-1559 Base Fee like any other transaction, while the blob data is priced by an independent Blob Fee mechanism. The blob fee tracks a target number of blobs per block (half of the maximum when EIP-4844 launched, 3 of 6; the Pectra upgrade raised the target and maximum to 6 and 9 through EIP-7691) and rises or falls with actual blob usage. There is no separate Priority Fee for blobs, because a Blob transaction already sets the ordinary Priority Fee to encourage faster packaging.
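The independent blob fee is derived from a running "excess blob gas" counter using the integer exponential approximation given as pseudocode in EIP-4844. The Go sketch below ports that pseudocode; the update fraction is passed in by the caller because it is fork-dependent (3338477 is the value from the original EIP-4844 text), and blobBaseFee is a name chosen for this example.

```go
package main

import (
	"fmt"
	"math/big"
)

// fakeExponential approximates factor * e^(numerator/denominator)
// using integer arithmetic only (a Taylor series), as in EIP-4844.
func fakeExponential(factor, numerator, denominator *big.Int) *big.Int {
	output := new(big.Int)
	accum := new(big.Int).Mul(factor, denominator)
	for i := int64(1); accum.Sign() > 0; i++ {
		output.Add(output, accum)
		accum.Mul(accum, numerator)
		accum.Div(accum, new(big.Int).Mul(denominator, big.NewInt(i)))
	}
	return output.Div(output, denominator)
}

// blobBaseFee returns the fee per unit of blob gas in wei, with a
// minimum of 1 wei when there is no excess blob gas.
func blobBaseFee(excessBlobGas, updateFraction uint64) *big.Int {
	return fakeExponential(
		big.NewInt(1),
		new(big.Int).SetUint64(excessBlobGas),
		new(big.Int).SetUint64(updateFraction),
	)
}

func main() {
	const updateFraction = 3338477 // value from the original EIP-4844 text
	for _, excess := range []uint64{0, updateFraction, 10 * updateFraction} {
		fmt.Println(excess, blobBaseFee(excess, updateFraction).String())
	}
}
```

The fee grows exponentially while blocks keep exceeding the blob target and decays again once usage falls below it, mirroring the load-balancing role of the Base Fee.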

06 Source Code Analysis of Transaction Processing Flow

Having detailed the design and implementation of the transaction mechanism in Ethereum, we will now analyze the code to provide a detailed overview of how transactions are specifically implemented in Geth, including the processing flow throughout the transaction's lifecycle.

Transaction Submission

Whether submitting a transaction via SendTransaction or SendRawTransaction, the SubmitTransaction function in internal/ethapi/api.go is called to submit the transaction to the transaction pool.

In this function, two basic checks are performed on the transaction: one checks whether the gas fee is reasonable, and the other checks whether the transaction complies with the EIP-155 standard. EIP-155 addresses the cross-chain transaction replay issue by introducing the chainID parameter in the transaction signature. This check ensures that when the node is configured with EIP155Required enabled, all transactions submitted to the transaction pool must meet this standard.

After completing the checks, the transaction is submitted to the transaction pool, with the addition logic implemented in SendTx in eth/api_backend.go.

In the transaction pool, the Filter method is used to match the transaction to the corresponding subpool. Currently, there are two implementations: a blob transaction is placed in BlobPool; any other transaction goes into LegacyPool.
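The routing idea can be sketched as follows. This is not the Geth code: tx, subPool, memPool and route are hypothetical, heavily reduced stand-ins, and the accept rules are assumptions made for the example. What it shows is the pattern of asking each subpool's Filter in turn and handing the transaction to the first one that accepts it.

```go
package main

import (
	"errors"
	"fmt"
)

const blobTxType = 0x03 // type byte of blob transactions

// tx is a hypothetical, minimal transaction record.
type tx struct {
	Type byte
	Hash string
}

// subPool is a hypothetical cut-down version of the SubPool interface.
type subPool interface {
	Filter(t tx) bool
	Add(t tx) error
}

// memPool is an in-memory subpool that accepts whatever its rule allows.
type memPool struct {
	name   string
	accept func(tx) bool
	txs    []tx
}

func (p *memPool) Filter(t tx) bool { return p.accept(t) }
func (p *memPool) Add(t tx) error   { p.txs = append(p.txs, t); return nil }

// route hands the transaction to the first subpool whose Filter accepts it.
func route(pools []subPool, t tx) error {
	for _, p := range pools {
		if p.Filter(t) {
			return p.Add(t)
		}
	}
	return errors.New("transaction type not supported by any subpool")
}

func main() {
	blob := &memPool{name: "blob", accept: func(t tx) bool { return t.Type == blobTxType }}
	legacy := &memPool{name: "legacy", accept: func(t tx) bool { return t.Type != blobTxType && t.Type <= 0x04 }}
	pools := []subPool{blob, legacy}

	for _, t := range []tx{{Type: 0x03, Hash: "0xaa"}, {Type: 0x02, Hash: "0xbb"}, {Type: 0x7f, Hash: "0xcc"}} {
		fmt.Println(t.Hash, route(pools, t))
	}
	fmt.Println(blob.name, len(blob.txs), legacy.name, len(legacy.txs))
}
```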

At this point, the transaction submitted by the EOA has been placed in the transaction pool, and it will begin to propagate within the pool, entering the subsequent transaction packaging and execution process.

If a new transaction is sent before the original transaction is packaged, and it sets a new gasPrice and gasLimit, the original transaction is looked up in the pool, re-signed with the new gasPrice and gasLimit, and resubmitted to the transaction pool, where it replaces the original. This method can also be used to cancel transactions that the user no longer wishes to execute.

    func (api *TransactionAPI) Resend(ctx context.Context, sendArgs TransactionArgs, gasPrice *hexutil.Big, gasLimit *hexutil.Uint64) (common.Hash, error) {
        if sendArgs.Nonce == nil {
            return common.Hash{}, errors.New("missing transaction nonce in transaction spec")
        }
        if err := sendArgs.setDefaults(ctx, api.b, false); err != nil {
            return common.Hash{}, err
        }
        matchTx := sendArgs.ToTransaction(types.LegacyTxType)

        // Before replacing the old transaction, ensure the _new_ transaction fee is reasonable.
        price := matchTx.GasPrice()
        if gasPrice != nil {
            price = gasPrice.ToInt()
        }
        gas := matchTx.Gas()
        if gasLimit != nil {
            gas = uint64(*gasLimit)
        }
        if err := checkTxFee(price, gas, api.b.RPCTxFeeCap()); err != nil {
            return common.Hash{}, err
        }
        // Iterate the pending list for replacement
        pending, err := api.b.GetPoolTransactions()
        if err != nil {
            return common.Hash{}, err
        }
        for _, p := range pending {
            wantSigHash := api.signer.Hash(matchTx)
            pFrom, err := types.Sender(api.signer, p)
            if err == nil && pFrom == sendArgs.from() && api.signer.Hash(p) == wantSigHash {
                // Match. Re-sign and send the transaction.
                if gasPrice != nil && (*big.Int)(gasPrice).Sign() != 0 {
                    sendArgs.GasPrice = gasPrice
                }
                if gasLimit != nil && *gasLimit != 0 {
                    sendArgs.Gas = gasLimit
                }
                signedTx, err := api.sign(sendArgs.from(), sendArgs.ToTransaction(types.LegacyTxType))
                if err != nil {
                    return common.Hash{}, err
                }
                if err = api.b.SendTx(ctx, signedTx); err != nil {
                    return common.Hash{}, err
                }
                return signedTx.Hash(), nil
            }
        }
        return common.Hash{}, fmt.Errorf("transaction %#x not found", matchTx.Hash())
    }

Transaction Broadcasting

After a node receives a transaction submitted by an EOA, it needs to propagate it across the network. The txpool (core/txpool/txpool.go) provides the SubscribeTransactions method, which allows subscribing to new-transaction events in the pool. TxPool forwards the subscription to every subpool, and the Blob transaction pool and Legacy transaction pool implement it differently:

    func (p *TxPool) SubscribeTransactions(ch chan<- core.NewTxsEvent, reorgs bool) event.Subscription {
        subs := make([]event.Subscription, len(p.subpools))
        for i, subpool := range p.subpools {
            subs[i] = subpool.SubscribeTransactions(ch, reorgs)
        }
        return p.subs.Track(event.JoinSubscriptions(subs...))
    }

The BlobPool distinguishes between two sources of events:

  • discoverFeed: contains only newly discovered transactions

  • insertFeed: contains all transactions, including those re-entering the pool due to reorganization

    func (p *BlobPool) SubscribeTransactions(ch chan<- core.NewTxsEvent, reorgs bool) event.Subscription {
        if reorgs {
            return p.insertFeed.Subscribe(ch)
        } else {
            return p.discoverFeed.Subscribe(ch)
        }
    }

The LegacyPool does not differentiate between new transactions and reorganized transactions; it uses a single txFeed to send all transaction events.

    func (pool *LegacyPool) SubscribeTransactions(ch chan<- core.NewTxsEvent, reorgs bool) event.Subscription {
        return pool.txFeed.Subscribe(ch)
    }
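The difference between the two subscription styles can be illustrated with a toy feed. Everything in the sketch below (feed, newTxsEvent, twoFeedPool, announce) is a hypothetical simplification written for this article's explanation; it is not Geth's event.Feed, is not safe for concurrent use, and its sends block when a subscriber's channel is full.

```go
package main

import "fmt"

// newTxsEvent is a hypothetical stand-in for core.NewTxsEvent.
type newTxsEvent struct{ Hashes []string }

// feed is a much simplified stand-in for an event feed.
type feed struct{ subs []chan<- newTxsEvent }

func (f *feed) Subscribe(ch chan<- newTxsEvent) { f.subs = append(f.subs, ch) }

func (f *feed) Send(ev newTxsEvent) {
	for _, ch := range f.subs {
		ch <- ev
	}
}

// twoFeedPool mimics the two-feed split described for the blob pool.
type twoFeedPool struct{ discoverFeed, insertFeed feed }

// subscribe picks the feed according to the reorgs flag.
func (p *twoFeedPool) subscribe(ch chan<- newTxsEvent, reorgs bool) {
	if reorgs {
		p.insertFeed.Subscribe(ch)
		return
	}
	p.discoverFeed.Subscribe(ch)
}

// announce publishes an event: insertFeed sees every insertion, while
// discoverFeed only sees transactions that are genuinely new.
func (p *twoFeedPool) announce(ev newTxsEvent, fromReorg bool) {
	p.insertFeed.Send(ev)
	if !fromReorg {
		p.discoverFeed.Send(ev)
	}
}

func main() {
	all := make(chan newTxsEvent, 4)   // wants reorg re-insertions too
	fresh := make(chan newTxsEvent, 4) // wants new transactions only
	p := &twoFeedPool{}
	p.subscribe(all, true)
	p.subscribe(fresh, false)

	p.announce(newTxsEvent{Hashes: []string{"0x01"}}, false) // newly discovered
	p.announce(newTxsEvent{Hashes: []string{"0x01"}}, true)  // re-inserted after a reorg
	fmt.Println(len(all), len(fresh))
}
```

A pool with a single feed, as described for LegacyPool, would deliver both events to both subscribers regardless of the flag.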

In summary, SubscribeTransactions decouples the transaction pool from other components through an event mechanism. This subscription can be used by multiple modules, such as transaction broadcasting, transaction packaging, and external RPC, all of which need to listen to this process and respond accordingly.

At the same time, the p2p module (eth/handler.go) continuously listens for new transaction events. If a new transaction is received, it will send a broadcast to propagate the transaction:

    // eth/handler.go will broadcast the transaction over the p2p network after a new transaction is generated
    func (h *handler) txBroadcastLoop() {
        defer h.wg.Done()
        for {
            select {
            case event := <-h.txsCh: // Listening for new transaction information here
                h.BroadcastTransactions(event.Txs)
            case <-h.txsSub.Err():
                return
            }
        }
    }

When broadcasting transactions, they are first classified. A blob transaction, or a transaction that exceeds a certain size, cannot be propagated in full; ordinary transactions are marked as directly propagable. The node then looks for peers that do not yet have the transaction, and for each such peer a boolean records whether the full transaction may be sent to it directly (true) or only its hash. This process is implemented in the BroadcastTransactions method.

Once the classification of transactions is completed based on the above principles, transactions that can be propagated directly are sent out, while blob transactions or large transactions only broadcast their hashes, retrieving the full transaction data when needed.
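The classification step can be sketched as below. The tx record and classify function are hypothetical, and the 4096-byte threshold is an assumption used for illustration only; consult eth/handler.go for the constant Geth actually applies.

```go
package main

import "fmt"

const (
	blobTxType = 0x03 // type byte of blob transactions
	// maxBroadcastSize is an assumed size threshold, for illustration only.
	maxBroadcastSize = 4096
)

// tx is a hypothetical, minimal transaction record.
type tx struct {
	Hash string
	Type byte
	Size uint64 // encoded size in bytes
}

// classify splits transactions into those that may be sent in full and
// those that are only announced by hash (blob or oversized transactions).
func classify(txs []tx) (full []tx, announce []string) {
	for _, t := range txs {
		if t.Type == blobTxType || t.Size > maxBroadcastSize {
			announce = append(announce, t.Hash)
			continue
		}
		full = append(full, t)
	}
	return full, announce
}

func main() {
	full, announce := classify([]tx{
		{Hash: "0xaa", Type: 0x02, Size: 120},    // ordinary transfer
		{Hash: "0xbb", Type: 0x03, Size: 300},    // blob transaction
		{Hash: "0xcc", Type: 0x02, Size: 60_000}, // large contract deployment
	})
	fmt.Println(len(full), announce)
}
```

A peer that receives only a hash can decide for itself whether to fetch the full transaction, which keeps large payloads from being pushed to nodes that do not need them.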

Transactions that are announced by hash only are queued on a dedicated field of the peer object until the announcement is sent.

New transactions will be broadcasted through the p2p module, and the node will also receive new transactions from the p2p network. When initializing the Ethereum instance in eth/backend.go, the p2p module is initialized, adding the transaction pool interface. Once the p2p module is running, it will parse transaction requests from p2p messages and add them to the transaction pool.

Specifically, when instantiating the handler, the method for obtaining transactions from other nodes is specified. It will use the TxFetcher in eth/fetcher to retrieve remote transactions. The TxFetcher will use the fetchTx method here to obtain remote transactions, which actually calls the RequestTxs method implemented in the eth/protocols/eth protocol to fetch transactions:

    // eth/backend.go New function
    if eth.handler, err = newHandler(&handlerConfig{
        NodeID:         eth.p2pServer.Self().ID(),
        Database:       chainDb,
        Chain:          eth.blockchain,
        TxPool:         eth.txPool,
        Network:        networkID,
        Sync:           config.SyncMode,
        BloomCache:     uint64(cacheLimit),
        EventMux:       eth.eventMux,
        RequiredBlocks: config.RequiredBlocks,
    }); err != nil {
        return nil, err
    }

    // eth/handler.go newHandler function, registering the process to obtain new transactions
    fetchTx := func(peer string, hashes []common.Hash) error {
        p := h.peers.peer(peer)
        if p == nil {
            return errors.New("unknown peer")
        }
        return p.RequestTxs(hashes) // Request transactions from other nodes
    }
    addTxs := func(txs []*types.Transaction) []error {
        return h.txpool.Add(txs, false) // Add transactions to the transaction pool
    }
    h.txFetcher = fetcher.NewTxFetcher(h.txpool.Has, addTxs, fetchTx, h.removePeer)

    // eth/handler_eth.go Handle method, after receiving new transactions, they will be added to the transaction pool
    for _, tx := range *packet {
        if tx.Type() == types.BlobTxType {
            return errors.New("disallowed broadcast blob transaction")
        }
    }
    return h.txFetcher.Enqueue(peer.ID(), *packet, false)

    // eth/fetcher/tx_fetcher.go's Handle method will call the registered addTxs to add the transactions to the pool
    for j, err := range f.addTxs(batch) {
        //....
    }

The RequestTxs method sends a GetPooledTransactionsMsg message and then receives a PooledTransactionsMsg response from the other node. The backend's Handle method processes this response by calling the txFetcher's Enqueue method, which ultimately calls addTxs to add the transactions obtained from other nodes to the transaction pool.

The transaction pool also has a lazy loading design, implemented through core/txpool/subpool.go's LazyTransaction. This lazy loading mechanism reduces memory usage and improves transaction processing efficiency. It stores key metadata of transactions and only loads the full transaction data when truly needed, playing an important role when Ethereum processes a large number of transactions. This design is particularly suitable for scenarios like transaction pools and block packaging, where most transactions may ultimately not be included in a block, thus not requiring the full loading of all transaction data.

    type LazyTransaction struct {
        Pool LazyResolver       // Transaction resolver to pull the real transaction up
        Hash common.Hash        // Transaction hash to pull up if needed
        Tx   *types.Transaction // Transaction if already resolved

        Time      time.Time    // Time when the transaction was first seen
        GasFeeCap *uint256.Int // Maximum fee per gas the transaction may consume
        GasTipCap *uint256.Int // Maximum miner tip per gas the transaction can pay

        Gas     uint64 // Amount of gas required by the transaction
        BlobGas uint64 // Amount of blob gas required by the transaction
    }

    func (ltx *LazyTransaction) Resolve() *types.Transaction {
        if ltx.Tx != nil {
            return ltx.Tx
        }
        return ltx.Pool.Get(ltx.Hash)
    }

Additionally, since Ethereum is a permissionless network, nodes may receive some malicious requests from the network. In extreme cases, nodes may face DDoS attacks, so nodes use a series of methods to prevent malicious attacks from the network:

  • Basic transaction validation

  • Node resource limitations

  • Transaction eviction mechanism

  • p2p network layer protection

Taking the LegacyPool as an example (the BlobPool also has similar mechanisms), before a transaction is added to the transaction pool, it first undergoes basic validation. In the ValidateTransaction method in core/txpool/validation.go, basic validation is performed on the transaction, including whether the transaction type, size, gas, etc., meet the requirements. If they do not meet the requirements, the transaction will be rejected.

The transaction size is regulated using Slots, defined in core/txpool/legacypool/legacypool.go:

    const (
        txSlotSize = 32 * 1024
        txMaxSize  = 4 * txSlotSize // 128KB
    )
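As a small worked example of these constants, the number of slots a transaction occupies is a ceiling division of its encoded size by txSlotSize. The sketch below takes a plain byte size rather than a transaction object, and withinSizeLimit is a hypothetical helper name.

```go
package main

import "fmt"

const (
	txSlotSize = 32 * 1024
	txMaxSize  = 4 * txSlotSize // 128KB
)

// numSlots returns how many slots a transaction of the given size occupies
// (ceiling division by the slot size).
func numSlots(size uint64) int {
	return int((size + txSlotSize - 1) / txSlotSize)
}

// withinSizeLimit reports whether a transaction of the given size fits the 4-slot cap.
func withinSizeLimit(size uint64) bool {
	return size <= txMaxSize
}

func main() {
	for _, size := range []uint64{200, 32 * 1024, 32*1024 + 1, txMaxSize, txMaxSize + 1} {
		fmt.Println(size, numSlots(size), withinSizeLimit(size))
	}
}
```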

Each transaction cannot exceed 4 Slots, and there are maximum Slot limits for each account and the entire node. For accounts, once the limit is reached, no new transactions can be submitted. For nodes, once the limit is reached, old transactions need to be evicted. The truncatePending method in core/txpool/legacypool/legacypool.go will fairly evict transactions to prevent a single account from occupying too many resources in the transaction pool:

    type Config struct {
        AccountSlots uint64
        GlobalSlots  uint64
    }
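The idea of fair eviction can be sketched as follows. This is a simplified model and not geth's truncatePending: the truncate function is hypothetical, the pool is a map from account name to pending nonces, and each transaction is assumed to occupy one slot. It repeatedly trims the heaviest account until the pool fits into GlobalSlots, and never trims an account that is at or below AccountSlots.

```go
package main

import "fmt"

// Config keeps only the two limits discussed in the text.
type Config struct {
	AccountSlots uint64
	GlobalSlots  uint64
}

// truncate drops the highest-nonce transaction of the heaviest account until the
// pool fits into GlobalSlots or no account exceeds AccountSlots.
func truncate(pending map[string][]uint64, cfg Config) {
	total := uint64(0)
	for _, list := range pending {
		total += uint64(len(list))
	}
	for total > cfg.GlobalSlots {
		// Find the account with the most pending transactions (ties broken by name).
		heaviest := ""
		for addr, list := range pending {
			if heaviest == "" || len(list) > len(pending[heaviest]) ||
				(len(list) == len(pending[heaviest]) && addr < heaviest) {
				heaviest = addr
			}
		}
		if heaviest == "" || uint64(len(pending[heaviest])) <= cfg.AccountSlots {
			return // nobody is over their per-account allowance
		}
		list := pending[heaviest]
		pending[heaviest] = list[:len(list)-1]
		total--
	}
}

func main() {
	pending := map[string][]uint64{
		"alice": {0, 1, 2, 3, 4, 5},
		"bob":   {0, 1},
	}
	truncate(pending, Config{AccountSlots: 2, GlobalSlots: 5})
	fmt.Println(len(pending["alice"]), len(pending["bob"]))
}
```

Dropping from the tail keeps each account's remaining nonces contiguous, so the transactions that survive stay executable.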

At the network layer, for blob transactions or transactions that exceed a certain size, the transaction content will not be propagated directly over the network; only the transaction hash will be propagated, thus avoiding excessive data transmission over the network that could lead to DDoS attacks.
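The propagation rule above amounts to a simple predicate. In the sketch below, the Tx struct and the announceOnly function are hypothetical, and the 4096-byte threshold is an assumption chosen for illustration, since the text only says "a certain size". The blob transaction type value 0x03 comes from EIP-4844.

```go
package main

import "fmt"

const (
	BlobTxType       = 0x03 // transaction type of blob transactions (EIP-4844)
	maxBroadcastSize = 4096 // assumed threshold in bytes, for illustration only
)

// Tx carries only the two properties the rule depends on.
type Tx struct {
	Type uint8
	Size uint64
}

// announceOnly reports whether only the hash should be sent to peers,
// leaving it to them to request the full transaction if they want it.
func announceOnly(tx Tx) bool {
	return tx.Type == BlobTxType || tx.Size > maxBroadcastSize
}

func main() {
	fmt.Println(announceOnly(Tx{Type: 0x02, Size: 200}))   // small dynamic-fee tx
	fmt.Println(announceOnly(Tx{Type: 0x02, Size: 50000})) // large tx
	fmt.Println(announceOnly(Tx{Type: BlobTxType, Size: 200}))
}
```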

Transaction Packaging

After a transaction is submitted to the transaction pool, it propagates among nodes in the Ethereum network. When the validator attached to a node is selected as the block proposer, it relies on that node's consensus layer and execution layer to construct the block.

The validator first triggers the block construction process from the consensus layer. After receiving the block construction request, the consensus layer calls the engineAPI of the execution layer to construct the block; the implementation of engineAPI is in eth/catalyst/api.go. The consensus layer first calls the ForkchoiceUpdated API to send the block construction request. There are multiple versions of ForkchoiceUpdated, and the version called depends on the current network version. The call returns a PayloadID, which is then passed to the corresponding version of the GetPayload API to obtain the block construction result.

Regardless of which version of ForkchoiceUpdated is called, it ultimately calls the forkchoiceUpdated method to construct the block.

In the forkchoiceUpdated method, the current state of the execution layer is validated first. If the execution layer is still synchronizing blocks, or if the finalized block does not meet expectations, the method returns an error message to the consensus layer directly, indicating that block construction has failed.

After the execution layer's state has been validated, the BuildPayload method in miner/miner.go is called to construct the block. The actual construction is done by the generateWork method (miner/worker.go), which is invoked from the payload-building code in miner/payload_building.go. Note that an empty payload is generated first, and its PayloadID is returned to the consensus layer. At the same time, a goroutine is started to carry out the real block packaging. This goroutine keeps looking for higher-value transactions in the transaction pool, and it updates the payload after each repackaging.
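The "empty payload first, better payload later" behaviour can be modelled with a small mutex-protected holder. This is a sketch under simplifying assumptions: the Payload and Block types and the Update and Resolve methods are hypothetical, "value" is reduced to a single Fees number, and the background builder iterates over a fixed list of candidate blocks instead of querying a transaction pool.

```go
package main

import (
	"fmt"
	"sync"
)

// Block is a toy block summary: how many transactions it has and what it earns.
type Block struct {
	Txs  int
	Fees uint64
}

// Payload holds the empty block that is available immediately and the best
// full block produced so far by the background builder.
type Payload struct {
	mu    sync.Mutex
	empty Block
	full  *Block
}

func NewPayload() *Payload { return &Payload{} }

// Update is called after each rebuild; only a more profitable block replaces the current one.
func (p *Payload) Update(b Block) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.full == nil || b.Fees > p.full.Fees {
		p.full = &b
	}
}

// Resolve returns the best block built so far, or the empty block if none exists yet.
func (p *Payload) Resolve() Block {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.full != nil {
		return *p.full
	}
	return p.empty
}

func main() {
	p := NewPayload()
	fmt.Println(p.Resolve().Txs) // the empty payload can be served right away

	var wg sync.WaitGroup
	wg.Add(1)
	go func() { // background builder: keeps offering rebuilt blocks
		defer wg.Done()
		for _, b := range []Block{{Txs: 10, Fees: 5}, {Txs: 8, Fees: 9}, {Txs: 12, Fees: 7}} {
			p.Update(b)
		}
	}()
	wg.Wait()
	fmt.Println(p.Resolve().Fees)
}
```

Returning an identifier immediately and improving the result in the background means the consensus layer never has to wait on the pool, and a later GetPayload call simply picks up whatever the best version is at that moment.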

Transaction packaging is done by the fillTransactions method in miner/worker.go, which essentially calls the Pending method of the txpool to obtain the transactions to be packaged.

Before the slot ends, the consensus layer calls the GetPayload API to obtain the final packaged block. If the submitted transaction is included in this block, it is executed by the EVM and changes the state database. If it is not included this time, it waits for the next round of packaging.

Transaction Execution

During packaging, the transactions are also executed in the EVM to obtain the state changes that result from the block's transactions. This also happens in the generateWork function, which first prepares the execution environment for the current block, mainly by obtaining the latest block and the latest state database.

Here, the state represents the state database.

This forms a structure of StateDB → stateObjects → stateAccount, representing the complete state database, the collection of account objects, and individual account objects, respectively. In the StateObject structure, dirtyStorage holds the state changed by the current transaction, pendingStorage holds the state changed earlier in the current block, and originStorage holds the original state. From newest to oldest, the three layers are therefore dirtyStorage → pendingStorage → originStorage. A detailed analysis of storage can be found in the previous discussions about storage.
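The newest-to-oldest ordering of the three layers can be illustrated with a reduced model. The stateObject below is a hypothetical simplification that keeps only the three maps; GetState, SetState and finalise are toy versions of the corresponding ideas and do not reproduce geth's journaling or trie access.

```go
package main

import "fmt"

// Hash is a simplified stand-in for common.Hash.
type Hash string

// stateObject keeps three storage layers, from newest to oldest.
type stateObject struct {
	dirtyStorage   map[Hash]Hash // changed by the current transaction
	pendingStorage map[Hash]Hash // changed by earlier transactions in the current block
	originStorage  map[Hash]Hash // original values
}

func newStateObject(origin map[Hash]Hash) *stateObject {
	return &stateObject{
		dirtyStorage:   make(map[Hash]Hash),
		pendingStorage: make(map[Hash]Hash),
		originStorage:  origin,
	}
}

// GetState checks the newest layer first and falls back to older ones.
func (s *stateObject) GetState(key Hash) Hash {
	if v, ok := s.dirtyStorage[key]; ok {
		return v
	}
	if v, ok := s.pendingStorage[key]; ok {
		return v
	}
	return s.originStorage[key]
}

// SetState records a write made by the current transaction.
func (s *stateObject) SetState(key, value Hash) {
	s.dirtyStorage[key] = value
}

// finalise moves the current transaction's writes into the block-level layer.
func (s *stateObject) finalise() {
	for k, v := range s.dirtyStorage {
		s.pendingStorage[k] = v
	}
	s.dirtyStorage = make(map[Hash]Hash)
}

func main() {
	obj := newStateObject(map[Hash]Hash{"slot0": "old"})
	obj.SetState("slot0", "tx1")
	obj.finalise() // transaction 1 ends
	obj.SetState("slot0", "tx2")
	fmt.Println(obj.GetState("slot0"), obj.pendingStorage["slot0"], obj.originStorage["slot0"])
}
```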

In the New method of eth/backend.go, the configuration of the transaction pool is loaded at startup, which includes a Locals configuration. The addresses in this configuration are treated as local addresses, and transactions submitted from these local addresses will be prioritized for processing.

After the execution environment has been prepared, transactions can be executed. First, all transactions to be packaged are retrieved, and local transactions are separated from normal ones. Then local and normal transactions are packaged separately, each ordered by fee from high to low. The execution itself is carried out in the commitTransactions method in miner/worker.go:
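The ordering rule described here (local transactions first, each group sorted by fee from high to low) can be sketched as below. The Tx struct and the packOrder function are hypothetical, and the sketch deliberately ignores the per-account nonce ordering that a real pool must also respect.

```go
package main

import (
	"fmt"
	"sort"
)

// Tx keeps only the sender and the tip used for ordering.
type Tx struct {
	From string
	Tip  uint64
}

// packOrder returns local transactions first, then the rest,
// with each group sorted by tip in descending order.
func packOrder(txs []Tx, locals map[string]bool) []Tx {
	var local, remote []Tx
	for _, tx := range txs {
		if locals[tx.From] {
			local = append(local, tx)
		} else {
			remote = append(remote, tx)
		}
	}
	byTipDesc := func(list []Tx) {
		sort.SliceStable(list, func(i, j int) bool { return list[i].Tip > list[j].Tip })
	}
	byTipDesc(local)
	byTipDesc(remote)
	return append(local, remote...)
}

func main() {
	txs := []Tx{{"a", 5}, {"b", 9}, {"me", 1}, {"me2", 3}}
	order := packOrder(txs, map[string]bool{"me": true, "me2": true})
	for _, tx := range order {
		fmt.Println(tx.From, tx.Tip)
	}
}
```

Note that a local transaction with a low tip still comes before a remote one with a high tip, which is exactly the priority the Locals configuration grants.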

Ultimately, the ApplyTransaction function is called, which executes the transaction in the EVM and modifies the state database:

    func ApplyTransaction(evm *vm.EVM, gp *GasPool, statedb *state.StateDB, header *types.Header, tx *types.Transaction, usedGas *uint64) (*types.Receipt, error) {
        msg, err := TransactionToMessage(tx, types.MakeSigner(evm.ChainConfig(), header.Number, header.Time), header.BaseFee)
        if err != nil {
            return nil, err
        }
        // Create a new context to be used in the EVM environment
        return ApplyTransactionWithEVM(msg, gp, statedb, header.Number, header.Hash(), tx, usedGas, evm)
    }
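To make the "transaction drives the state machine" idea concrete, here is a deliberately tiny state transition for a plain transfer. It is not what ApplyTransaction does internally: there is no EVM, no gas metering, and no receipt, the fee is a flat amount that is simply deducted, and the Account, State, Tx types and the apply function are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Account and State form a toy state database keyed by address.
type Account struct {
	Nonce   uint64
	Balance uint64
}

type State map[string]*Account

// Tx is a plain value transfer with a flat fee.
type Tx struct {
	From, To          string
	Nonce, Value, Fee uint64
}

// apply validates the transaction against the sender's account and, if valid,
// mutates the state: debit sender, credit recipient, bump the nonce.
func apply(state State, tx Tx) error {
	sender, ok := state[tx.From]
	if !ok {
		return errors.New("unknown sender")
	}
	if sender.Nonce != tx.Nonce {
		return errors.New("nonce mismatch")
	}
	cost := tx.Value + tx.Fee
	if cost < tx.Value {
		return errors.New("cost overflow")
	}
	if sender.Balance < cost {
		return errors.New("insufficient funds")
	}
	sender.Balance -= cost
	sender.Nonce++
	if _, ok := state[tx.To]; !ok {
		state[tx.To] = &Account{}
	}
	state[tx.To].Balance += tx.Value
	return nil
}

func main() {
	state := State{"alice": {Nonce: 0, Balance: 100}}
	err := apply(state, Tx{From: "alice", To: "bob", Nonce: 0, Value: 30, Fee: 1})
	fmt.Println(err, state["alice"].Balance, state["bob"].Balance)
}
```

A failed check leaves the state untouched, which is the property that lets a block builder skip an invalid transaction and move on to the next one.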

Transaction Validation

The situations discussed above are all about the process of transactions being packaged into blocks. In most cases, nodes only validate blocks that have already been packaged, rather than packaging blocks themselves.

After the consensus layer synchronizes to a block, it passes the newly synchronized block to the execution layer through the engine API, using the engine_NewPayload series of methods. These methods ultimately call the newPayload method, where the consensus layer's payload is assembled into a block.

Then it checks whether this block already exists. If it does, it returns directly with a VALID status, since the block has already been processed.

If the execution layer is still synchronizing, it cannot accept new blocks for the time being.

If these checks pass, the block is inserted into the blockchain. Note that inserting a block does not set the chain head, because choosing the chain head means choosing between forks, and that decision belongs to the consensus layer.

The consensus layer calls the forkChoiceUpdated API, which invokes the SetCanonical method in core/blockchain.go to set the chain head.

Another situation that can change the chain head is a block reorganization. A reorganization executes the reorg method in core/blockchain.go, which also sets the current chain head.
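The separation between importing a block and choosing the head can be sketched with a toy chain. Everything here is a hypothetical model: the Chain type and its methods only mimic the roles of newPayload and forkChoiceUpdated, blocks are not executed or validated, and the VALID and SYNCING strings are borrowed from the Engine API status names.

```go
package main

import "fmt"

// Status mirrors two of the Engine API payload status values.
type Status string

const (
	Valid   Status = "VALID"
	Syncing Status = "SYNCING"
)

// Block is reduced to its identity and parent link.
type Block struct {
	Hash, Parent string
	Number       uint64
}

// Chain stores known blocks and, separately, which one is the head.
type Chain struct {
	blocks  map[string]Block
	head    string
	syncing bool
}

func NewChain(genesis Block) *Chain {
	return &Chain{blocks: map[string]Block{genesis.Hash: genesis}, head: genesis.Hash}
}

func (c *Chain) Head() string         { return c.head }
func (c *Chain) Has(hash string) bool { _, ok := c.blocks[hash]; return ok }

// NewPayload imports a block without touching the head.
func (c *Chain) NewPayload(b Block) Status {
	if c.Has(b.Hash) {
		return Valid // already known, nothing to do
	}
	if c.syncing {
		return Syncing // cannot import while still syncing
	}
	c.blocks[b.Hash] = b // stored, but the head is unchanged
	return Valid
}

// ForkchoiceUpdated lets the consensus layer pick the head among known blocks.
func (c *Chain) ForkchoiceUpdated(head string) Status {
	if !c.Has(head) {
		return Syncing
	}
	c.head = head
	return Valid
}

func main() {
	c := NewChain(Block{Hash: "g"})
	fmt.Println(c.NewPayload(Block{Hash: "b1", Parent: "g", Number: 1}), c.Head())
	fmt.Println(c.ForkchoiceUpdated("b1"), c.Head())
}
```

Keeping the two steps apart means two competing blocks at the same height can both be stored, and only the fork choice decides which one the chain follows.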

Returning to the block execution process, the InsertBlockWithoutSetHead method in core/blockchain.go calls the insertChain method. This method performs a series of condition checks, and once they pass, the block is processed.

The logic of the Process method is straightforward. As in the packaging process, transactions are executed one by one in the EVM and the state database is modified. The difference from packaging is that the transactions in the new block are simply replayed, with no need to fetch them from the transaction pool.

07 Summary

Transactions are the only way to drive state changes in Ethereum. The processing of transactions in Ethereum goes through multiple stages. Transactions must first be validated, then submitted to the transaction pool, propagated through the p2p network among different nodes, packaged into blocks by block proposers, and finally synchronized by other nodes, executing the transactions in the block locally and synchronizing state changes.

With the continuous development of the Ethereum protocol, it has evolved from initially supporting only one type of transaction to currently supporting five types of transactions. These different types of transactions allow Ethereum to adapt to different roles, serving as a platform for DApps, as well as a settlement layer for Layer 2 or other off-chain scaling solutions. The recently added EIP-7702 has prepared Ethereum technically for large-scale adoption.

08 References

[1]https://ethereum.org/zh/developers/docs/transactions/

[2]https://hackmd.io/@danielrachi/engineapi_

[3]https://github.com/ethereum/go-ethereum/commit/c8a9a9c0917dd57d077a79044e65dbbdd421458b

[4]https://pumpthegas.org/

[5]https://github.com/ethereum/EIPs/pull/9698
