Understanding Ethereum Transaction Data Encoding and Structure

Writing data to the Ethereum blockchain is done through a transaction; reading data from it is done through a call. Unlike a traditional database write, data sent to Ethereum must first be encoded as bytes, which are conventionally written as a hexadecimal string. Contracts store values in basic types such as bytes32 and address. Reading text back therefore means decoding those hexadecimal bytes into UTF-8 characters to recover meaningful information.
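
As a small illustration of that round trip, the Python sketch below turns a UTF-8 string into a hexadecimal string and back. The helper names text_to_hex and hex_to_text are made up for this example, and real contract data would additionally be padded to 32-byte words.

```python
def text_to_hex(text: str) -> str:
    """Encode text as UTF-8 and render the bytes as a 0x-prefixed hex string."""
    return "0x" + text.encode("utf-8").hex()


def hex_to_text(hex_string: str) -> str:
    """Reverse text_to_hex: drop the 0x prefix and decode the bytes as UTF-8."""
    digits = hex_string[2:] if hex_string.startswith("0x") else hex_string
    return bytes.fromhex(digits).decode("utf-8")


encoded = text_to_hex("Alice")
print(encoded)               # 0x416c696365
print(hex_to_text(encoded))  # Alice
```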

What Does a Typical Transaction Look Like?

A common transaction type is a simple ETH transfer. For example, if Alice sends ETH to Bob on the Ethereum network, the transaction carries three key fields: from (the sender), to (the recipient), and value (the amount transferred). The structure for such a transaction is straightforward.

However, transferring ERC-20 tokens via a smart contract requires constructing a more complex transaction. In this case, the from field remains the sender, but the to field becomes the contract address of the ERC-20 token. The specific transfer details, namely the recipient and the amount, are contained within the inputData field.

The Role of inputData and the EVM

The core of this transaction lies in the inputData. The Ethereum Virtual Machine (EVM), which executes smart contract code, only processes raw bytes. The inputData field is the hexadecimal representation of the bytes passed to the contract, which is far easier for a person to read and analyze than raw binary.

This data typically starts with a MethodID, which identifies the contract method being called. It is the first 4 bytes of the Keccak-256 hash of the function signature, for example transfer(address,uint256) for the method transfer(address _to, uint256 _value). The data that follows holds the values of the function's arguments: the recipient's address and the amount of tokens to transfer.

The data of a standard ERC-20 transfer is therefore structured as: MethodID + recipient address + transfer amount.
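
That three-part layout can be assembled by hand, as in the following sketch. It is a minimal illustration rather than a replacement for a library: build_transfer_data is a hypothetical helper, the ALICE and TOKEN_CONTRACT addresses are placeholders invented for the example, and the inputData key simply follows the field name used in this article (libraries often call it data or input).

```python
TRANSFER_METHOD_ID = "a9059cbb"  # MethodID of transfer(address,uint256)


def build_transfer_data(recipient: str, amount: int) -> str:
    """Return MethodID + left-padded recipient + left-padded amount as hex."""
    address_hex = recipient[2:] if recipient.startswith("0x") else recipient
    if len(bytes.fromhex(address_hex)) != 20:
        raise ValueError("an address must be exactly 20 bytes")
    if not 0 <= amount < 2**256:
        raise ValueError("amount must fit in a uint256")
    recipient_word = address_hex.lower().rjust(64, "0")  # 40 chars -> 64 chars
    amount_word = format(amount, "064x")                 # uint256 as 64 hex chars
    return "0x" + TRANSFER_METHOD_ID + recipient_word + amount_word


# Placeholder addresses, for illustration only.
ALICE = "0x" + "aa" * 20
TOKEN_CONTRACT = "0x" + "cc" * 20
RECIPIENT = "0xd0292fc87a77ced208207ec92c3c6549565d84dd"

transaction = {
    "from": ALICE,
    "to": TOKEN_CONTRACT,  # the token contract, not the recipient
    "value": 0,            # no ETH moves in a token transfer
    "inputData": build_transfer_data(RECIPIENT, 10**18),
}
print(transaction["inputData"])
```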

Example transaction data for an ERC-20 transfer:

0xa9059cbb
000000000000000000000000d0292fc87a77ced208207ec92c3c6549565d84dd
0000000000000000000000000000000000000000000000000de0b6b3a7640000
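
To see what these three lines contain, the sketch below splits the example data back into its parts. The function name decode_transfer_data is an assumption made for this illustration; it only handles this exact two-argument layout.

```python
EXAMPLE_DATA = "".join([
    "0xa9059cbb",                                           # MethodID
    "0" * 24 + "d0292fc87a77ced208207ec92c3c6549565d84dd",  # recipient word
    "0" * 48 + "0de0b6b3a7640000",                          # amount word
])


def decode_transfer_data(data: str):
    """Split transfer data into (method_id, recipient, amount)."""
    raw = bytes.fromhex(data[2:] if data.startswith("0x") else data)
    if len(raw) != 4 + 32 + 32:
        raise ValueError("expected a 4-byte MethodID and two 32-byte words")
    method_id = "0x" + raw[:4].hex()
    recipient = "0x" + raw[16:36].hex()         # last 20 bytes of the first word
    amount = int.from_bytes(raw[36:68], "big")  # second word as an integer
    return method_id, recipient, amount


method_id, recipient, amount = decode_transfer_data(EXAMPLE_DATA)
print(method_id)  # 0xa9059cbb
print(recipient)  # 0xd0292fc87a77ced208207ec92c3c6549565d84dd
print(amount)     # 1000000000000000000
```

If the token uses 18 decimals, which is common but not guaranteed, that amount corresponds to exactly 1 token.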

How is Transaction Data Structured?

The fundamental unit of Ethereum transaction data is a 32-byte word, written as 64 hexadecimal characters. An address, for example, is only 40 characters long (excluding the 0x prefix), so it is padded with zeros on the left to reach the required 64 characters.

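Why left-padding?
Under the contract ABI encoding rules, number-like types (uint, int, bool, address) are left-padded with zeros to 64 characters, while byte sequences (the fixed-size bytes1 to bytes32 types, and the contents of bytes and string) are right-padded.

Common static types include: uint, bool, address, and bytes1 through bytes32.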
Common dynamic types include: bytes, string, address[], bytes32[].
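
The padding rules for single-word values can be sketched in a few lines of Python. The helper names below are hypothetical, and the sketch assumes that number-like values (uint, bool, address) are padded on the left while fixed-size byte values such as bytes32 are padded on the right.

```python
WORD = 32  # bytes per ABI word


def encode_uint(value: int) -> str:
    """uint256: big-endian, left-padded with zeros."""
    if not 0 <= value < 2**256:
        raise ValueError("value must fit in a uint256")
    return value.to_bytes(WORD, "big").hex()


def encode_bool(flag: bool) -> str:
    """bool: encoded like the uint 0 or 1."""
    return encode_uint(1 if flag else 0)


def encode_address(address: str) -> str:
    """address: 20 bytes, left-padded to 32."""
    raw = bytes.fromhex(address[2:] if address.startswith("0x") else address)
    if len(raw) != 20:
        raise ValueError("an address must be exactly 20 bytes")
    return raw.rjust(WORD, b"\x00").hex()


def encode_fixed_bytes(value: bytes) -> str:
    """bytes1..bytes32: right-padded with zeros."""
    if len(value) > WORD:
        raise ValueError("at most 32 bytes fit in one word")
    return value.ljust(WORD, b"\x00").hex()


print(encode_uint(9))
print(encode_bool(True))
print(encode_address("0x26d59ca6798626bf3bcee3a61be57b7bf157290e"))
print(encode_fixed_bytes(b"Bob"))
```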

Encoding Complex Transactions with Dynamic Parameters

Constructing transaction data for simple transfers is relatively easy. However, smart contract interactions can involve complex methods with parameters like static arrays, dynamic arrays, and nested data, making the encoding process more involved.

Consider a function with a mix of basic and dynamic types:
analysisHex(bytes name, bool b, uint[] data, address addr, bytes32[] testData)

The MethodID, calculated with a tool such as Remix or with a script, might be 0x4b6112f8.

If we want to call this function with the values: "Alice", true, [9,8,7,6], "0x26d59ca6798626bf3bcee3a61be57b7bf157290e", ["张三","Bob","老王"], the encoded transaction data becomes a long hexadecimal string.

The complexity arises because dynamic types, whose lengths are unknown upfront, require "placeholders" in the data structure. The actual values for these dynamic parameters are appended after the initial set of parameters, and the placeholders point to the starting position of these values within the overall data payload.

Breakdown of the Encoded Data
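
One way to inspect the layout is to generate it word by word. The sketch below does this for the example call, leaving out the 4-byte MethodID. It is an illustration under stated assumptions: the breakdown function and its labels are invented here, the three strings in testData are assumed to be stored as UTF-8 bytes right-padded to 32 bytes, and offsets are counted in bytes from the start of the parameter block.

```python
WORD = 32


def uint_word(value: int) -> bytes:
    return value.to_bytes(WORD, "big")


def pad_right(raw: bytes) -> bytes:
    """Pad raw bytes with zeros up to the next multiple of 32."""
    return raw + b"\x00" * (-len(raw) % WORD)


def breakdown():
    name = "Alice".encode("utf-8")
    flag = True
    data = [9, 8, 7, 6]
    addr = bytes.fromhex("26d59ca6798626bf3bcee3a61be57b7bf157290e")
    test_data = [s.encode("utf-8") for s in ("张三", "Bob", "老王")]

    # Each dynamic value is stored as a length word followed by its contents.
    tails = {
        "name": uint_word(len(name)) + pad_right(name),
        "data": uint_word(len(data)) + b"".join(uint_word(v) for v in data),
        "testData": uint_word(len(test_data))
        + b"".join(s.ljust(WORD, b"\x00") for s in test_data),
    }

    # The head has one word per parameter; the dynamic values start right after it.
    offsets, position = {}, 5 * WORD
    for key, blob in tails.items():
        offsets[key] = position
        position += len(blob)

    rows = [
        ("offset of name", uint_word(offsets["name"])),
        ("b", uint_word(int(flag))),
        ("offset of data", uint_word(offsets["data"])),
        ("addr", addr.rjust(WORD, b"\x00")),
        ("offset of testData", uint_word(offsets["testData"])),
    ]
    for key, blob in tails.items():
        for i in range(0, len(blob), WORD):
            label = f"{key} length" if i == 0 else f"{key} contents"
            rows.append((label, blob[i:i + WORD]))
    return [(n * WORD, label, chunk.hex()) for n, (label, chunk) in enumerate(rows)]


for position, label, word_hex in breakdown():
    print(f"{position:#05x}  {label:<20} {word_hex}")
```

The first five rows form the fixed part of the data; the three offset words in it point at the rows where the length of each dynamic value is stored.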

Simplifying with Static Arrays

If the function used static arrays instead, the encoding simplifies significantly. For example:
analysisHex(bytes32 name, bool b, uint[4] data, address addr, bytes32[3] testData)

The MethodID changes, and the encoded data becomes a direct sequence: the MethodID followed by the encoded values of each parameter, in order, with no need for placeholders. Each value is simply padded according to its type (on the left for uint, bool, and address; on the right for bytes32), and the elements of a static array are written one after another in place.
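
For comparison, the static variant can be encoded as a plain concatenation. The sketch below is a simplified illustration, with encode_static_args as a hypothetical helper; it omits the MethodID and assumes the address is given with a 0x prefix.

```python
WORD = 32


def encode_static_args(name: bytes, flag: bool, data, addr: str, test_data) -> str:
    """Encode (bytes32, bool, uint[4], address, bytes32[3]) as ten 32-byte words."""
    if len(data) != 4 or len(test_data) != 3:
        raise ValueError("static arrays have a fixed number of elements")
    words = [name.ljust(WORD, b"\x00")]                         # bytes32: right-padded
    words.append(int(flag).to_bytes(WORD, "big"))               # bool: left-padded
    words += [value.to_bytes(WORD, "big") for value in data]    # uint[4]: four words
    words.append(bytes.fromhex(addr[2:]).rjust(WORD, b"\x00"))  # address: left-padded
    words += [item.ljust(WORD, b"\x00") for item in test_data]  # bytes32[3]: three words
    if any(len(word) != WORD for word in words):
        raise ValueError("every value must fit in one 32-byte word")
    return b"".join(words).hex()


encoded = encode_static_args(
    "Alice".encode("utf-8"),
    True,
    [9, 8, 7, 6],
    "0x26d59ca6798626bf3bcee3a61be57b7bf157290e",
    [s.encode("utf-8") for s in ("张三", "Bob", "老王")],
)
print(len(encoded) // 64)  # 10 words, no offsets and no length words
```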

Using Libraries for Encoding

Manually constructing these data strings is complex and error-prone. Libraries like web3j handle this encoding automatically. Their core functions work by:

  1. Building the function signature.
  2. Calculating the size of the fixed part of the data payload, which determines where the dynamic data section will begin.
  3. Iterating through each parameter:

    • For static types, it encodes and appends the value directly.
    • For dynamic types, it appends a placeholder (the offset to the value's position) and adds the actual encoded value to a separate dynamic data section.
  4. Finally, it appends the dynamic data section to the end of the encoded parameters.
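
The steps above can be sketched as a small Python encoder. This is not web3j's code, only an approximation of the same idea under several assumptions: the MethodID is passed in rather than computed, only a handful of types are supported, and every static parameter is assumed to occupy exactly one 32-byte word (so static arrays and arrays of dynamic elements are out of scope). The names encode_call, encode_static, and encode_dynamic are invented for the example.

```python
WORD = 32


def _uint(value: int) -> bytes:
    return value.to_bytes(WORD, "big")


def is_dynamic(type_name: str) -> bool:
    return type_name in ("bytes", "string") or type_name.endswith("[]")


def encode_static(type_name: str, value) -> bytes:
    if type_name == "uint256":
        return _uint(value)
    if type_name == "bool":
        return _uint(1 if value else 0)
    if type_name == "address":
        raw = bytes.fromhex(value[2:] if value.startswith("0x") else value)
        if len(raw) != 20:
            raise ValueError("an address must be exactly 20 bytes")
        return raw.rjust(WORD, b"\x00")
    if type_name == "bytes32":
        if len(value) > WORD:
            raise ValueError("bytes32 holds at most 32 bytes")
        return value.ljust(WORD, b"\x00")
    raise ValueError(f"unsupported static type: {type_name}")


def encode_dynamic(type_name: str, value) -> bytes:
    if type_name in ("bytes", "string"):
        raw = value.encode("utf-8") if isinstance(value, str) else value
        return _uint(len(raw)) + raw + b"\x00" * (-len(raw) % WORD)
    if type_name.endswith("[]"):
        element_type = type_name[:-2]
        return _uint(len(value)) + b"".join(encode_static(element_type, v) for v in value)
    raise ValueError(f"unsupported dynamic type: {type_name}")


def encode_call(method_id: str, params) -> str:
    """params is a list of (type_name, value) pairs."""
    head_size = WORD * len(params)  # size of the fixed part of the payload
    head, tail = b"", b""
    for type_name, value in params:
        if is_dynamic(type_name):
            head += _uint(head_size + len(tail))      # placeholder: offset of the value
            tail += encode_dynamic(type_name, value)  # value goes to the dynamic section
        else:
            head += encode_static(type_name, value)   # static value is written in place
    return method_id + (head + tail).hex()            # dynamic section appended last


print(encode_call("0xa9059cbb", [
    ("address", "0xd0292fc87a77ced208207ec92c3c6549565d84dd"),
    ("uint256", 10**18),
]))
```

Called with the transfer parameters shown earlier, this reproduces the example ERC-20 data; with dynamic parameters, the offset words appear in the fixed part and the values follow it.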

Constructing the transaction data is just the first step. On its own, this raw data is unsigned and not tied to any account. The sender must sign the transaction with their private key. Once signed, the transaction is broadcast to the network, enters the mempool, and waits until a validator includes it in a block, after which it is confirmed.

Frequently Asked Questions

What is the difference between a transaction and a call?

A transaction writes data to the blockchain, modifies its state, and requires gas fees. A call is a read-only operation that queries data from the blockchain; it is executed by a node without being included in a block, so it does not change state or cost a gas fee.

Why is transaction data encoded in hexadecimal?

The Ethereum Virtual Machine (EVM) is designed to process bytecode directly. Hexadecimal encoding is a human-readable representation of this binary bytecode, making it easier for developers to debug and analyze transactions.

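What is a function signature (MethodID)?

The function signature is the function's name followed by its parameter types, with no spaces or parameter names (e.g., transfer(address,uint256)). The MethodID, also called the function selector, is the first 4 bytes (8 hex characters) of the Keccak-256 hash of that signature. It tells the contract which function to run. Because it is only 4 bytes long, two different signatures can occasionally share the same MethodID.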
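
To make the calculation concrete, here is a self-contained Python sketch that derives a MethodID from a signature. Python's standard hashlib offers SHA-3, which uses different padding from the original Keccak-256 that Ethereum relies on, so the sketch includes a compact Keccak-256 written for illustration. The names keccak256 and method_id are assumptions of this example; production code would normally use a maintained library instead.

```python
MASK = (1 << 64) - 1


def _rol(value: int, shift: int) -> int:
    """Rotate a 64-bit lane left by shift bits."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK


def _keccak_f(lanes):
    """Keccak-f[1600] permutation on a 5x5 matrix of 64-bit lanes."""
    lfsr = 1
    for _ in range(24):
        # theta
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol(current, (t + 1) * (t + 2) // 2)
        # chi
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: round constant bits come from a small LFSR
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def keccak256(data: bytes) -> bytes:
    rate = 136  # bytes absorbed per block (1088-bit rate, 512-bit capacity)
    # Original Keccak padding: 0x01, zeros, then 0x80 on the last byte.
    padded = bytearray(data) + b"\x01" + b"\x00" * (-(len(data) + 1) % rate)
    padded[-1] |= 0x80
    lanes = [[0] * 5 for _ in range(5)]
    for start in range(0, len(padded), rate):
        block = padded[start:start + rate]
        for i in range(rate // 8):
            lanes[i % 5][i // 5] ^= int.from_bytes(block[8 * i:8 * i + 8], "little")
        lanes = _keccak_f(lanes)
    return b"".join(lanes[i][0].to_bytes(8, "little") for i in range(4))


def method_id(signature: str) -> str:
    """First 4 bytes of the Keccak-256 hash of the function signature."""
    return "0x" + keccak256(signature.encode("ascii"))[:4].hex()


print(method_id("transfer(address,uint256)"))  # 0xa9059cbb
```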

How are dynamic arrays handled in transaction encoding?

Dynamic arrays require a two-part encoding. First, a placeholder is inserted in the main parameter list: an offset, counted in bytes from the start of the parameter block, that indicates where the array data starts. Then the actual array data, consisting of the array length followed by each element, is appended after the main parameter list.

Can I construct transaction data without a library?

While technically possible, manually encoding transactions, especially those with dynamic parameters, is highly complex and prone to errors. It is strongly recommended to use established libraries like web3j, ethers.js, or web3.py for reliability and security.

What happens after the transaction data is constructed?

The constructed data is unsigned. The sender must sign it with their private key to create a valid transaction. This signed transaction is then broadcast to the Ethereum network to be processed and included in a block.