An async HTTP(S) ClickHouse client for Python 3.6+ supporting type conversion in both directions, streaming, lazy decoding on SELECT queries, and a fully typed interface.
```
> pip install aiochclient
```
Or to install with extra requirements for speedup:

```
> pip install aiochclient[speedups]
```

Installing with [speedups] adds the following:

- cChardet
- aiodns for aiohttp speedup
- ciso8601 for ultra-fast datetime parsing while decoding data from ClickHouse

Additionally, the installation process attempts to use Cython for a speed boost (roughly 30% faster).
aiochclient needs an `aiohttp.ClientSession` to connect to ClickHouse:
```python
import datetime as dt  # used by the INSERT examples below

from aiochclient import ChClient
from aiohttp import ClientSession


async def main():
    async with ClientSession() as s:
        client = ChClient(s)
        assert await client.is_alive()  # returns True if connection is Ok
        await client.execute(
            "CREATE TABLE t (a UInt8, b Tuple(Date, Nullable(Float32))) ENGINE = Memory"
        )
```
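The query snippets below all run inside `main()`. To drive the coroutine itself, use an event loop; a minimal sketch (`asyncio.run` needs Python 3.7+; on 3.6 fall back to `run_until_complete`):

```python
import asyncio

# Entry point for the examples in this README (Python 3.7+)
asyncio.run(main())
```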
For INSERT queries you can pass values as `*args`. Values should be iterables:
```python
await client.execute(
    "INSERT INTO t VALUES",
    (1, (dt.date(2018, 9, 7), None)),
    (2, (dt.date(2018, 9, 8), 3.14)),
)
```
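Since the values are plain positional arguments, a batch that is already collected in a list can be unpacked into the call. A minimal sketch (the `rows` list here is made up for the example):

```python
rows = [
    (3, (dt.date(2018, 9, 9), 2.718)),
    (4, (dt.date(2018, 9, 10), None)),
]
# Unpack the prepared batch into execute's *args
await client.execute("INSERT INTO t VALUES", *rows)
```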
For fetching all rows at once use the `fetch` method:
```python
all_rows = await client.fetch("SELECT * FROM t")
```
For fetching the first row from the result use the `fetchrow` method:
```python
row = await client.fetchrow("SELECT * FROM t WHERE a=1")

assert row[0] == 1
assert row["b"] == (dt.date(2018, 9, 7), None)
```
You can also use the `fetchval` method, which returns the first value of the first row of the query result:
```python
val = await client.fetchval("SELECT b FROM t WHERE a=2")

assert val == (dt.date(2018, 9, 8), 3.14)
```
With async iteration over the query results stream you can fetch multiple rows without loading them all into memory at once:
```python
async for row in client.iterate(
    "SELECT number, number*2 FROM system.numbers LIMIT 10000"
):
    assert row[0] * 2 == row[1]
```
Use `fetch`/`fetchrow`/`fetchval`/`iterate` for SELECT queries, and `execute` (or any of the above) for INSERT and all other queries.
All fetch queries return rows as lightweight, memory-efficient objects. (Before v1.0.0, rows were only returned as tuples.) All rows have a full mapping interface, where you can get fields by name or by index:
```python
row = await client.fetchrow("SELECT a, b FROM t WHERE a=1")

assert row["a"] == 1
assert row[0] == 1
assert row[:] == (1, (dt.date(2018, 9, 7), None))
assert list(row.keys()) == ["a", "b"]
assert list(row.values()) == [1, (dt.date(2018, 9, 7), None)]
```
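Because rows expose `keys()` and `values()`, you can materialize one as a plain dict when a regular mapping is needed; a minimal sketch using only the methods shown above:

```python
# Build an ordinary dict from the row's mapping interface
row_dict = dict(zip(row.keys(), row.values()))
assert row_dict == {"a": 1, "b": (dt.date(2018, 9, 7), None)}
```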
To check out the API docs, visit the readthedocs site.
aiochclient automatically converts types from ClickHouse to Python types and vice versa.
| ClickHouse type | Python type |
| --- | --- |
| UInt8 | int |
| UInt16 | int |
| UInt32 | int |
| UInt64 | int |
| Int8 | int |
| Int16 | int |
| Int32 | int |
| Int64 | int |
| Float32 | float |
| Float64 | float |
| String | str |
| FixedString | str |
| Enum8 | str |
| Enum16 | str |
| Date | datetime.date |
| DateTime | datetime.datetime |
| Decimal | decimal.Decimal |
| Decimal32 | decimal.Decimal |
| Decimal64 | decimal.Decimal |
| Decimal128 | decimal.Decimal |
| IPv4 | ipaddress.IPv4Address |
| IPv6 | ipaddress.IPv6Address |
| UUID | uuid.UUID |
| Nothing | None |
| Tuple(T1, T2, ...) | Tuple[T1, T2, ...] |
| Array(T) | List[T] |
| Nullable(T) | None or T |
| LowCardinality(T) | T |
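As a sketch of this two-way conversion (the table name `conv_demo` is invented for the example), Python objects are encoded on INSERT and decoded back to the matching Python types on SELECT:

```python
import ipaddress
import uuid

await client.execute(
    "CREATE TABLE conv_demo (id UUID, addr IPv4) ENGINE = Memory"
)
# Python objects are encoded to the ClickHouse column types on INSERT...
await client.execute(
    "INSERT INTO conv_demo VALUES",
    (uuid.uuid4(), ipaddress.IPv4Address("10.0.0.1")),
)
# ...and decoded back to Python types on SELECT
row = await client.fetchrow("SELECT * FROM conv_demo")
assert isinstance(row["id"], uuid.UUID)
assert isinstance(row["addr"], ipaddress.IPv4Address)
```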
aiochclient uses `aiohttp.TCPConnector` to determine the connection pool size. By default, the pool limit is 100 open connections.
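To use a larger pool, pass your own connector to the session; a minimal sketch using aiohttp's standard `TCPConnector(limit=...)` parameter:

```python
from aiochclient import ChClient
from aiohttp import ClientSession, TCPConnector


async def main():
    # Allow up to 500 concurrent connections instead of the default 100
    async with ClientSession(connector=TCPConnector(limit=500)) as s:
        client = ChClient(s)
        assert await client.is_alive()
```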
It's highly recommended to use uvloop and to install with `aiochclient[speedups]` for the sake of speed (a sketch for enabling uvloop follows the benchmark figures below). Some recent benchmarks on our machines, without parallelization:
- 180k-220k rows/sec on SELECT
- 50k-80k rows/sec on INSERT
Note: these benchmarks are system-dependent.
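Enabling uvloop, as recommended above, takes one call before starting the loop; a minimal sketch (`uvloop.install()` is available in recent uvloop releases):

```python
import asyncio

import uvloop

# Replace the default asyncio event loop policy with uvloop's
uvloop.install()
asyncio.run(main())
```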