diff --git a/README.md b/README.md
index dfbf48e..b19bc06 100644
--- a/README.md
+++ b/README.md
@@ -2,7 +2,7 @@
 ### Why Dataframe
 
-Dataframe removes the complexity of handling `SQL parsing`, `SQL rewriting`, `Binding`, `SQL Query Planner` etc. Once
+Dataframe removes the complexity of handling `SQL parsing`, `SQL rewriting`, `Binder/SQL Query Planner` etc. Once
 the dataframe is mature, we can easily integrate it with an SQL engine.
 
 ### Features
@@ -11,7 +11,7 @@ the dataframe is mature, we can easily integrate it with an SQL engine.
 - Support `Parquet` reading with schema inference
 - `Rule Based` Optimizer
 - `AggFunc`: Sum
-- `BooleanBinaryExpr`: Eq
+- `BooleanBinaryExpr`: Lt
 
 ### Example
diff --git a/cmd/simple/README.md b/cmd/simple/README.md
new file mode 100644
index 0000000..02d9633
--- /dev/null
+++ b/cmd/simple/README.md
@@ -0,0 +1,34 @@
+### Data Generation
+
+```python
+import pandas as pd
+import pyarrow as pa
+import pyarrow.parquet as pq
+
+# Create a DataFrame with the sample data
+df = pd.DataFrame({
+    'c1': [100, 100, 100, 200, 200, 300],
+    'c2': [101, 201, 301, 401, 501, 601],
+    'c3': [102, 202, 302, 402, 502, 602]
+}, dtype='int64')
+
+# Convert the DataFrame to a PyArrow Table
+table = pa.Table.from_pandas(df)
+
+# Save the table as a Parquet file
+pq.write_table(table, 'c1_c2_c3_int64.parquet')
+```
+### Data Reading
+
+```python
+import pyarrow.parquet as pq
+
+# Read back the Parquet file generated above
+table_read = pq.read_table('c1_c2_c3_int64.parquet')
+
+# Convert to a pandas DataFrame
+df_read = table_read.to_pandas()
+
+# Display the DataFrame
+print(df_read.head())
+```
\ No newline at end of file