A set of useful tools to work with polars, providing convenient extensions for DataFrame manipulation, column operations, label encoding, and more.
Install kra from PyPI using pip:
pip install kra

To build and install a local version for development or testing:
pip install build
python -m build
pip install dist/kra-*.whl

This will build the wheel and install it into your current environment.
- DataFrame and Series extensions: Add new methods to polars DataFrames and Series.
- Column utilities: Easily rename, check, and transform DataFrame columns.
- Label encoding: Encode string labels as categorical/integer values.
- Dict-of-dicts conversion: Convert between DataFrames and nested dictionaries.
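To make the dict-of-dicts shape concrete, here is a minimal plain-Python sketch that does not use kra or polars. The helper names `rows_to_dod` and `dod_to_rows` are hypothetical and exist only for illustration. Raising on duplicate keys is an assumption of this sketch, not a statement about how kra behaves.

```python
from typing import Any, Hashable

Row = dict[str, Any]


def rows_to_dod(rows: list[Row], key: str) -> dict[Hashable, Row]:
    """Index row dicts by the value in `key`, keeping the key column in each row."""
    dod: dict[Hashable, Row] = {}
    for row in rows:
        k = row[key]
        if k in dod:
            # Assumption of this sketch: duplicate keys are treated as an error.
            raise ValueError(f"duplicate key {k!r} in column {key!r}")
        dod[k] = dict(row)  # copy so the input rows are not aliased
    return dod


def dod_to_rows(dod: dict[Hashable, Row]) -> list[Row]:
    """Flatten a dict of dicts back into row dicts, in insertion order."""
    return [dict(row) for row in dod.values()]


rows = [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob"},
    {"id": 3, "name": "Charlie"},
]
dod = rows_to_dod(rows, "id")
round_tripped = dod_to_rows(dod)
```

Each outer key maps to a full row that still contains the key column, so the structure can be turned back into rows without extra bookkeeping.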
Convert a DataFrame to a dict of dicts using a column as the key:
import polars as pl
import kra
df = pl.DataFrame({
"id": [1, 2, 3],
"name": ["Alice", "Bob", "Charlie"]
})
dod = df.to_dod("id")
# {1: {'id': 1, 'name': 'Alice'}, 2: {'id': 2, 'name': 'Bob'}, ...}
# Convert back:
df2 = kra.from_dod(dod, "id")

Transform column names to different cases:
import polars as pl
import kra
df = pl.DataFrame({
"First Name": [1, 2],
"Last Name": [3, 4]
})
df_lower = df.cols.to_lowercase()
df_camel = df.cols.to_camelcase()
df_snake = df.cols.to_snakecase()

Encode string labels as integers:
import polars as pl
import kra
df = pl.DataFrame({
"label": ["cat", "dog", "cat", "bird"]
})
# Series API
encoded = df["label"].label.encode()
# Expression API (for use in with_columns, etc.)
df2 = df.with_columns(
pl.col("label").label.encode().alias("encoded_label")
)

Drop columns of type Null:
import polars as pl
import kra
df = pl.DataFrame({
"a": [1, 2, 3],
"b": [None, None, None]
})
df_clean = df.drop_null_cols()

Create a DataFrame from a numpy array:
import kra
import numpy as np
data = np.array([[1, 2], [3, 4]])
df = kra.from_arraylike(data, schema=["x", "y"], orient="col")

- kra.from_dod: Create a DataFrame from a dict of dicts.
- kra.to_dod: Convert a DataFrame to a dict of dicts.
- kra.Cols: DataFrame column utilities (access via df.cols).
- kra.LabelSeries: Series label encoding (access via series.label).
- kra.LabelExpr: Expression label encoding (access via pl.col(...).label).
- kra.drop_null_cols: Remove columns of type Null.
- kra.from_arraylike: Create a DataFrame from array-like objects.
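The column-name case transformations can be pictured as ordinary string functions applied to each column name. The sketch below is plain Python with hypothetical helper names (`split_words`, `to_snake`, `to_camel`). It is not kra's implementation, and the exact strings kra produces for a given name may differ.

```python
import re


def split_words(name: str) -> list[str]:
    """Split on runs of non-alphanumeric characters and drop empty pieces."""
    return [w for w in re.split(r"[^0-9A-Za-z]+", name) if w]


def to_snake(name: str) -> str:
    """Lowercase every word and join with underscores."""
    return "_".join(w.lower() for w in split_words(name))


def to_camel(name: str) -> str:
    """Lowercase the first word, capitalize the rest, and join without separators."""
    words = split_words(name)
    if not words:
        return ""
    head, *tail = words
    return head.lower() + "".join(w.capitalize() for w in tail)


columns = ["First Name", "Last Name"]
snake = [to_snake(c) for c in columns]   # ['first_name', 'last_name']
camel = [to_camel(c) for c in columns]   # ['firstName', 'lastName']
```

This sketch only splits on separators such as spaces or punctuation. It does not detect word boundaries inside an existing camelCase name.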
For more, see the intro.ipynb notebook.
kra includes a Rust extension for fast label encoding, accessible via the Python API.
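Conceptually, label encoding maps each distinct label to an integer code. The plain-Python sketch below illustrates the idea only. The function name `encode_labels` is hypothetical, and assigning codes in order of first appearance is an assumption of this sketch; the codes kra's Rust extension actually assigns may be ordered differently.

```python
def encode_labels(labels: list[str]) -> tuple[list[int], dict[str, int]]:
    """Return integer codes for `labels` plus the label-to-code mapping."""
    codes: dict[str, int] = {}
    encoded: list[int] = []
    for label in labels:
        if label not in codes:
            # Assumption: a new label gets the next unused integer.
            codes[label] = len(codes)
        encoded.append(codes[label])
    return encoded, codes


encoded, mapping = encode_labels(["cat", "dog", "cat", "bird"])
# encoded == [0, 1, 0, 2]
# mapping == {"cat": 0, "dog": 1, "bird": 2}
```

Repeated labels share a code, so the encoded column has as many distinct integers as the original has distinct strings.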
MIT License. See LICENSE for details.