v2.0.0b0
Highlight
- 🚀 87x performance boost when querying Pandas DataFrames; see Benchmark
- ⬆️ Upgrade ClickHouse engine to 24.5
- 🐍 Query Pandas DataFrame, Arrow Table, dict, or any Python object directly!
import chdb
import pandas as pd

df = pd.DataFrame(
    {
        "a": [1, 2, 3, 4, 5, 6],
        "b": ["tom", "jerry", "auxten", "tom", "jerry", "auxten"],
    }
)

# Python(df) exposes the local variable `df` to the query as a table.
chdb.query("SELECT b, sum(a) FROM Python(df) GROUP BY b ORDER BY b").show()
- For more examples, see test_query_py.py.
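To make the expected result of the query above concrete, here is a minimal plain-Python sketch of the same aggregation (`GROUP BY b`, `sum(a)`, `ORDER BY b`) over the same sample data. It only illustrates what the query computes, not how chDB executes it; the helper name `group_sum` is hypothetical and not part of chDB.

```python
from collections import defaultdict

# Same sample data as the DataFrame in the example above.
rows = {
    "a": [1, 2, 3, 4, 5, 6],
    "b": ["tom", "jerry", "auxten", "tom", "jerry", "auxten"],
}


def group_sum(keys, values):
    """Sum `values` per key and return (key, total) pairs sorted by key.

    Hypothetical helper mirroring GROUP BY key / sum(value) / ORDER BY key.
    """
    totals = defaultdict(int)
    for key, value in zip(keys, values):
        totals[key] += value
    return sorted(totals.items())


result = group_sum(rows["b"], rows["a"])
for name, total in result:
    print(name, total)
```

Each name appears twice in column `b`, so the query should return three rows, one per name, in alphabetical order.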
Changes
- Use in-process instead of embedded by @auxten in #222
- Fix crash on MergeTree background task cleanup (issue 226) by @auxten in #227
- Replace icu with utf8proc by @auxten in #228
- Start agg step stream according to agg_col_cost and group_by_keys_cost by @auxten in #230
- Fix various issues, including 190 and 229, by @auxten in #231
- Before 2.0 release Patch #1 by @auxten in #232
- Fix python version check in CMakeLists.txt by @auxten in #233
- Fix mistake in config.h by @auxten in #234
- Add tests on query python objects by @auxten in #235
- Add docs for Python Table Engine by @auxten in #238
Full Changelog: v1.4.1...v2.0.0b0