Python scripts for ETL (extract, transform, and load) jobs for Ethereum blocks, transactions, ERC20/ERC721 tokens, transfers, receipts, logs, contracts, internal transactions, DEX trades, and more. Data is available in the Dex.Guru Data Warehouse: https://warehouse.dex.guru


Guru Network Ethereum ETL

Ethereum ETL lets you convert blockchain data into convenient formats like CSVs and relational databases.

Do you just want to query Ethereum data right away? Use the public dataset in Guru Warehouse.

Quickstart

Copy .env.sample, rename it to .env, and set CHAIN_ID and PROVIDER_URL.

Run with docker-compose as a streamer

By default, Ethereum ETL exports entities to ClickHouse. You can change the destination in the .env file with the OUTPUT environment variable.
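A minimal .env sketch, assuming Ethereum mainnet and an Infura endpoint (both values are placeholders; see .env.sample for the full list of variables):

CHAIN_ID=1
PROVIDER_URL=https://mainnet.infura.io/v3/<your-infura-key>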

  1. Install Docker: https://docs.docker.com/get-docker/

  2. Install Docker Compose: https://docs.docker.com/compose/install/

  3. Run the following command to start ClickHouse:

    docker-compose up -d clickhouse
  4. After ClickHouse has initialized, run the following command to create the database (its name is specified in the .env file):

    docker-compose up init-ch-db
  5. You can specify entities to export in the .env file. See supported entities here.

  6. Run the following command to start the streamer (see the log check after this list):

    docker-compose up indexer
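Once the indexer is up, you can follow its logs with standard Docker Compose tooling to confirm that blocks are being processed (indexer is the service name used in step 6):

docker-compose logs -f indexer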

Running individual commands

Install Ethereum ETL:

pip3 install -r requirements.txt
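If you prefer an isolated environment, create and activate a virtual environment first (standard Python tooling, not specific to this repo) and run the install command above inside it:

python3 -m venv venv
source venv/bin/activate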

If you want to use ClickHouse as a destination, make sure to apply the migrations first:

CLICKHOUSE_URL=clickhouse+http://default:@localhost:8123/ethereum alembic upgrade head 
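The connection string follows the SQLAlchemy URL format, clickhouse+http://user:password@host:port/database. For a ClickHouse server that is not on localhost, the same command would look something like this (hypothetical host and credentials):

CLICKHOUSE_URL=clickhouse+http://default:secret@clickhouse.example.com:8123/ethereum alembic upgrade head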

Export blocks and transactions (Schema, Reference):

ethereumetl export_blocks_and_transactions --start-block 0 --end-block 500000 \
--blocks-output blocks.csv --transactions-output transactions.csv \
--provider-uri https://mainnet.infura.io/v3/${INFURA_API_KEY}
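As a quick sanity check on the exported files, assuming the command above completed (plain shell utilities, not part of ethereum-etl):

head -n 3 blocks.csv
wc -l transactions.csv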

Export ERC20 and ERC721 transfers (Schema, Reference):

ethereumetl export_token_transfers --start-block 0 --end-block 500000 \
--provider-uri file://$HOME/Library/Ethereum/geth.ipc --output token_transfers.csv

Export traces (Schema, Reference):

ethereumetl export_traces --start-block 0 --end-block 500000 \
--provider-uri file://$HOME/Library/Ethereum/parity.ipc --output traces.csv

Stream blocks, transactions, logs, and token transfers continually to the console (Reference):

ethereumetl stream --start-block 500000 -e block,transaction,log,token_transfer --log-file log.txt \
--provider-uri https://mainnet.infura.io/v3/${INFURA_API_KEY}
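Upstream ethereum-etl prints each streamed entity as a newline-delimited JSON object on stdout when no other output is configured (logging goes to the --log-file). Assuming this fork keeps that behaviour, you can pipe the stream through jq to peek at it:

ethereumetl stream --start-block 500000 -e block,transaction --log-file log.txt \
--provider-uri https://mainnet.infura.io/v3/${INFURA_API_KEY} | jq -c .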

Find other commands here.

Supported export destinations are listed here.

Linters/formatters

Install

pip install black ruff mypy

Run

TL;DR all-at-once run and fix:

ruff check --fix . && black . && mypy .

Or one-by-one:

  • Check and auto-fix with: ruff check --fix .
  • Check typing: mypy .
  • Auto-format all: black .


Running Tests

export ETHEREUM_ETL_RUN_SLOW_TESTS=True
export PROVIDER_URL=<your_provider_url>
pytest -vv
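To iterate on a subset of tests, pytest's standard -k expression filter works as usual (the pattern below is only an illustration, not a specific test name in this repo):

pytest -vv -k "export_blocks"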

Projects using Ethereum ETL

  • Google - Public BigQuery Ethereum datasets
  • Nansen - Analytics platform for Ethereum
