**Commit `cc83008`: new methods**

xando committed Jun 12, 2024 (1 parent: `298504d`)

Showing 1 changed file with 62 additions and 3 deletions: `README.md`
A simple library to **write to** and **download from** BigQuery tables as PyArrow tables.


---


## Installation

```bash
pip install pyarrow-bigquery
```
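
The snippets below refer to the library as `bq`; a minimal import sketch, assuming the package is exposed under the `pyarrow.bigquery` namespace as the examples imply:

```python
import pyarrow as pa
import pyarrow.bigquery as bq
```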


---

## Quick Start

This guide will help you quickly get started with `pyarrow-bigquery`, a library that allows you to **read** from and **write** to Google BigQuery using PyArrow.
```python
with bq.writer("gcp_project.dataset.table", schema=schema) as w:
    w.write_table(table)
```
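
Taken together, a minimal round-trip sketch, assuming `read_table` takes the table location string as its first argument (mirroring `reader` in the API reference below):

```python
import pyarrow as pa
import pyarrow.bigquery as bq

# Build a small in-memory table and write it to BigQuery.
table = pa.Table.from_pylist([{"value": 1}, {"value": 2}])
bq.write_table(table, "gcp_project.dataset.table")

# Read it back as a single PyArrow table.
table = bq.read_table("gcp_project.dataset.table")
print(table.num_rows)
```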


---

## API Reference

### Writing

#### `pyarrow.bigquery.write_table`

Write a PyArrow Table to a BigQuery Table. No return value.
Expand Down Expand Up @@ -113,7 +124,7 @@ Write a PyArrow Table to a BigQuery Table. No return value.
bq.write_table(table, 'gcp_project.dataset.table')
```
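
For context, a slightly fuller sketch that builds the table first with a stock PyArrow constructor (the destination path is a placeholder):

```python
import pyarrow as pa
import pyarrow.bigquery as bq

# Three rows, two columns, built entirely in memory.
table = pa.table({
    "name": ["a", "b", "c"],
    "value": [1, 2, 3],
})

# Write the table; no return value.
bq.write_table(table, "gcp_project.dataset.table")
```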

#### `pyarrow.bigquery.writer` (Context Manager)

Context manager version of the write method. Useful when the PyArrow table is larger than available memory or the data arrives in chunks.

Expand Down Expand Up @@ -148,7 +159,7 @@ Context manager version of the write method. Useful when the PyArrow table is la

Depending on the use case, you might want to use one of the methods below to write your data to a BigQuery table, using either `pa.Table` or `pa.RecordBatch`.

#### `pyarrow.bigquery.writer.write_table` (Context Manager Method)

Context manager method to write a table.

```python
with bq.writer("gcp_project.dataset.table", schema=schema) as w:
    for a in range(10):
        w.write_table(pa.Table.from_pylist([{'value': [a] * 10}]))
```

#### `pyarrow.bigquery.writer.write_batch` (Context Manager Method)

Context manager method to write a record batch.

```python
with bq.writer("gcp_project.dataset.table", schema=schema) as w:
    w.write_batch(pa.RecordBatch.from_pylist([{'value': [1] * 10}]))
```

### Reading

#### `pyarrow.bigquery.read_table`

**Parameters:**
- `batch_size`: `int`, *default* `100`
  Batch size used for fetching. The table will be automatically split into batches of this size.
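
A minimal usage sketch, assuming `read_table` takes the table location string as its first argument (mirroring `reader` below) and accepts the documented `batch_size` as a keyword:

```python
import pyarrow.bigquery as bq

# Download the whole table as a single in-memory PyArrow table,
# fetching in batches of 500 rows.
table = bq.read_table("gcp_project.dataset.table", batch_size=500)
print(table.num_rows)
```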

#### `pyarrow.bigquery.read_query`

**Parameters:**
- `project`: `str`
  BigQuery project used for query execution (and billing).

- `query`: `str`
  The query to be executed.

- `worker_type`: `threading.Thread | multiprocessing.Process`, *default* `threading.Thread`
  Worker backend for fetching data.

- `worker_count`: `int`, *default* `os.cpu_count()`
  Number of threads or processes to use for fetching data from BigQuery.

- `batch_size`: `int`, *default* `100`
  Batch size used for fetching. The table will be automatically split into batches of this size.

```python
table = bq.read_query("gcp_project", "SELECT * FROM `gcp_project.dataset.table`")
```
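
Since fetching is parallelized across workers, the documented parameters can be tuned; a hedged sketch, assuming they are accepted as keyword arguments:

```python
import multiprocessing

import pyarrow.bigquery as bq

# Use process-based workers instead of the default threads,
# and fetch in larger batches.
table = bq.read_query(
    "gcp_project",
    "SELECT * FROM `gcp_project.dataset.table`",
    worker_type=multiprocessing.Process,
    worker_count=4,
    batch_size=500,
)
```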

#### `pyarrow.bigquery.reader`

**Parameters:**
```python
parts = []

for part in bq.reader("gcp_project.dataset.table"):
    parts.append(part)

table = pa.concat_tables(parts)
```


#### `pyarrow.bigquery.reader_query`

**Parameters:**
- `project`: `str`
  BigQuery project used for query execution (and billing).

- `query`: `str`
  The query to be executed.

- `worker_type`: `threading.Thread | multiprocessing.Process`, *default* `threading.Thread`
  Worker backend for fetching data.

- `worker_count`: `int`, *default* `os.cpu_count()`
  Number of threads or processes to use for fetching data from BigQuery.

- `batch_size`: `int`, *default* `100`
  Batch size used for fetching. The table will be automatically split into batches of this size.

```python
for batch in bq.reader_query("gcp_project", "SELECT * FROM `gcp_project.dataset.table`"):
print(batch.num_rows)
```
