walnuthq/ethdebug-stats

Ethdebug Statistics Tool for Solidity Compilers

A tool for evaluating the quality of Ethdebug-format debug information generated by Solidity compilers, inspired by LLVM's llvm-dwarfdump --statistics.

Purpose

This tool analyzes Ethdebug JSON files generated by the Solidity compiler and provides comprehensive statistics about debug information quality. It helps compiler developers track improvements in debug info generation and ensures debugging tools have sufficient information.

Motivation

The motivation comes from LLVM's ecosystem, which includes tools that track the quality of the generated debug info (DWARF), so that any progress or regression is clear.

Installation

# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install in development mode
pip install -e .

Usage

# JSON output (default)
ethdebug-stats path/to/ethdebug.json

# Human-readable text output
ethdebug-stats path/to/ethdebug.json --format text

# Save to file
ethdebug-stats path/to/ethdebug.json -o stats.json

Understanding the Output

File Metadata

  • environment: Whether this is "create" (constructor) or "runtime" (deployed) bytecode
  • contract_name: Name of the analyzed contract

Debug Info Quality Metrics

Source Mapping Coverage

The most important metrics for debug information quality:

  • #instructions: Total number of EVM instructions in the bytecode
  • #instructions_with_source: Instructions that can be mapped back to source code
  • #instructions_without_source: Instructions with no source mapping (cannot be debugged at source level)
  • source_coverage_percent: Percentage of instructions with source mapping (higher is better)

Example: 100% coverage means every bytecode instruction can be traced back to Solidity source code.
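As a minimal sketch of how the percentage relates to the two counts above: the function below is illustrative and not taken from the tool's source. The rounding to one decimal place and the handling of an empty instruction list are assumptions.

```python
def source_coverage_percent(with_source: int, total: int) -> float:
    """Percentage of instructions that map back to source code."""
    if total == 0:
        # Assumption: an empty program is reported as 0% instead of raising.
        return 0.0
    # Assumption: one decimal place, matching the 100.0 shown in the example output.
    return round(100.0 * with_source / total, 1)


print(source_coverage_percent(132, 132))  # 100.0
print(source_coverage_percent(99, 132))   # 75.0
```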

Source Information

Details about source code references:

  • #unique_source_files: Number of different source files referenced
  • #estimated_functions: Estimated number of functions (currently simplified)

Context Completeness

Measures the richness of debug context available:

  • #instructions_with_context: Instructions with any debug context
  • #instructions_with_code_context: Instructions with source code mapping (currently the only context Solidity provides)
  • #instructions_with_variables: Instructions with variable debug info (future enhancement)
  • #instructions_with_frame: Instructions with call frame info (future enhancement)

Variable Debug Info (Future)

When Solidity adds variable debugging support:

  • #variables: Total variables with debug info
  • #variables_with_location: Variables with storage/memory/stack location info
  • #variables_with_type: Variables with type information
  • #state_variables: Contract state variables (storage)
  • #local_variables: Local function variables
  • #function_parameters: Function parameters

Coverage Distribution

Similar to DWARF statistics, shows the distribution of debug coverage:

  • 0%: Instructions with no debug info at all
  • 100%: Instructions with complete debug info
  • Other buckets ([10-20%), etc.) for partial coverage (future use)
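The bucket labels can be illustrated with a small helper that maps a per-instruction coverage percentage to its label. This is a sketch only: `coverage_bucket` is a hypothetical name, and the labels are copied from the keys of `coverage_distribution` in the JSON output. How the tool computes a partial percentage for an instruction is not specified here.

```python
def coverage_bucket(percent: float) -> str:
    """Map a coverage percentage to a coverage_distribution label."""
    if percent <= 0:
        return "0%"
    if percent >= 100:
        return "100%"
    if percent < 10:
        # The first partial bucket is open on both ends: (0-10%).
        return "(0-10%)"
    # Remaining buckets are half-open: [10-20%), [20-30%), ..., [90-100%).
    lower = int(percent // 10) * 10
    return f"[{lower}-{lower + 10}%)"


print(coverage_bucket(0))     # 0%
print(coverage_bucket(15.0))  # [10-20%)
print(coverage_bucket(100))   # 100%
```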

Example Output

$ ethdebug-stats /tmp/TestContract_ethdebug.json --format text
Ethdebug Statistics for /tmp/TestContract_ethdebug.json
============================================================
Environment: create
Contract: TestContract

DEBUG INFO QUALITY METRICS:
------------------------------

Source Mapping Coverage:
  Total instructions: 132
  With source mapping: 132
  Without source mapping: 0
  Coverage: 100.0%

Source Information:
  Unique source files: 1
  Estimated functions: 1

Context Completeness:
  Instructions with context: 132
  With code context: 132
  With variables: 0
  With frame info: 0

Coverage Distribution:
  100%: 132 instructions

Or, with the default JSON output:

$ ethdebug-stats /tmp/TestContract_ethdebug.json
{
  "version": 1,
  "file": "/tmp/TestContract_ethdebug.json",
  "format": "ethdebug",
  "environment": "create",
  "contract_name": "TestContract",
  "#instructions": 132,
  "#instructions_with_source": 132,
  "#instructions_without_source": 0,
  "source_coverage_percent": 100.0,
  "#unique_source_files": 1,
  "#instructions_with_context": 132,
  "#instructions_with_code_context": 132,
  "#instructions_with_variables": 0,
  "#instructions_with_frame": 0,
  "#variables": 0,
  "#variables_with_location": 0,
  "#variables_with_type": 0,
  "#state_variables": 0,
  "#local_variables": 0,
  "#function_parameters": 0,
  "#estimated_functions": 1,
  "coverage_distribution": {
    "0%": 0,
    "(0-10%)": 0,
    "[10-20%)": 0,
    "[20-30%)": 0,
    "[30-40%)": 0,
    "[40-50%)": 0,
    "[50-60%)": 0,
    "[60-70%)": 0,
    "[70-80%)": 0,
    "[80-90%)": 0,
    "[90-100%)": 0,
    "100%": 132
  }
}
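Because the default output is plain JSON, it can be consumed directly from a script. The sketch below sanity-checks a report using the key names shown above. `check_report` is a hypothetical helper, and the second invariant (that the distribution buckets sum to the instruction count) is an assumption inferred from the example, not a documented guarantee.

```python
def check_report(stats: dict) -> list:
    """Return a list of human-readable problems found in a stats report."""
    problems = []
    total = stats["#instructions"]
    with_src = stats["#instructions_with_source"]
    without_src = stats["#instructions_without_source"]
    if with_src + without_src != total:
        problems.append("with/without source counts do not add up to #instructions")
    # Assumption: every instruction lands in exactly one distribution bucket.
    if sum(stats["coverage_distribution"].values()) != total:
        problems.append("coverage_distribution does not sum to #instructions")
    return problems


# Abbreviated copy of the example report above.
example = {
    "#instructions": 132,
    "#instructions_with_source": 132,
    "#instructions_without_source": 0,
    "coverage_distribution": {"0%": 0, "100%": 132},
}
print(check_report(example))  # []
```

For a real report, load the file first with `json.load` and pass the resulting dictionary to the function.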

Interpreting Results

Good Debug Info

  • source_coverage_percent = 100%
  • All instructions have source mappings
  • Multiple source files tracked (for complex contracts)

Poor Debug Info

  • Low source_coverage_percent
  • Many instructions in the 0% coverage bucket
  • Instructions without context

Current Limitations

Solidity currently only generates:

  • Source code mappings (code context)

Not yet supported:

  • Variable location/type information
  • Call frame information
  • Function-level debugging data

Use Cases

  1. Compiler Development: Track debug info quality improvements across compiler versions
  2. Build Verification: Ensure contracts are compiled with debug symbols
  3. Debugging Tool Development: Verify sufficient debug info for tool requirements
  4. Quality Metrics: Monitor debug info coverage in CI/CD pipelines
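For the compiler-tracking and CI/CD use cases, one possible approach is to save a report per compiler version (for example with `-o stats.json`) and compare the coverage figures. The sketch below is not part of the tool: `load_stats`, `coverage_regressed`, and the `tolerance` parameter are hypothetical names, and only the `source_coverage_percent` key comes from the documented output.

```python
import json


def load_stats(path: str) -> dict:
    """Load a report previously written by `ethdebug-stats ... -o <path>`."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def coverage_regressed(baseline: dict, current: dict, tolerance: float = 0.0) -> bool:
    """True if source coverage dropped by more than `tolerance` percentage points."""
    drop = baseline["source_coverage_percent"] - current["source_coverage_percent"]
    return drop > tolerance


old = {"source_coverage_percent": 100.0}
new = {"source_coverage_percent": 97.5}
print(coverage_regressed(old, new))                 # True
print(coverage_regressed(old, new, tolerance=5.0))  # False
```

In a pipeline, a script could call `load_stats` on the baseline and current reports and exit with a non-zero status when `coverage_regressed` returns `True`.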

Comparison with DWARF Statistics

As mentioned above, this tool is inspired by LLVM's llvm-dwarfdump --statistics, which measures:

  • Variable location coverage
  • Function debug info completeness
  • Line number information

Similarly, ethdebug-stats measures:

  • Source mapping coverage
  • Context completeness
  • Future: variable debug info quality

Future Enhancements

  • Variable location coverage (when Solidity adds support)
  • Function-level statistics with names
  • Multi-file batch processing
  • Comparison between compiler versions
  • HTML report generation
  • Integration with CI/CD pipelines
