Load data into BQ #1045
base: main
Conversation
faucomte97 left a comment
@faucomte97 reviewed 18 of 18 files at r1, all commit messages.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @SKairinos)
.gcloud/functions/load_data_into_bigquery/utils/logging.py line 39 at r1 (raw file):
log_obj.update(context) # If the log call passed extra={"foo": "bar"}, add that too
Is this comment up to date?
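For context on what that comment describes, here is a minimal sketch of how keys passed via extra={"foo": "bar"} end up as LogRecord attributes and can be merged into a structured log object. The JsonFormatter class name and field choices are assumptions for illustration, not the code under review:

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Sketch: merge any extra={...} kwargs into the emitted JSON log object."""

    # Attributes present on every LogRecord; anything else came from extra=.
    _STANDARD = set(vars(logging.makeLogRecord({})))

    def format(self, record: logging.LogRecord) -> str:
        log_obj = {"severity": record.levelname, "message": record.getMessage()}
        # Keys passed as extra={"foo": "bar"} become attributes on the record;
        # copy them into the log object, mirroring log_obj.update(context).
        context = {
            k: v for k, v in vars(record).items() if k not in self._STANDARD
        }
        log_obj.update(context)
        return json.dumps(log_obj)
```

With this, logger.info("hello", extra={"foo": "bar"}) emits a JSON object containing both the message and the "foo" key.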
.gcloud/functions/load_data_into_bigquery/utils/storage.py line 70 at r1 (raw file):
@processed_status.setter
def processed_status(self, value: _ProcessedStatus):
    """Moves the blob to the failed subdirectory for manual inspection."""
Update this docstring to reflect new approach
.gcloud/functions/load_data_into_bigquery/utils/chunk.py line 40 at r1 (raw file):
timestamp: datetime  # when the data export began
obj_i_start: int  # object index span start
obj_i_end: int  # object index span end
I would say these comments aren't needed here, as they've all been explained in the docstring already.
If you'd rather keep them, that's fine too; I'm not strongly for or against either.
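To illustrate the docstring-only style being suggested, here is a sketch of the fields documented in an Attributes section instead of inline comments. The Chunk class name, the frozen dataclass style, and the str stand-in for the write-mode enum are assumptions; only the field names come from the quoted snippet:

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Chunk:
    """One chunk of an exported table (illustrative sketch).

    Attributes:
        bq_table_name: Name of the BigQuery table.
        bq_table_write_mode: Write mode for the BigQuery table.
        timestamp: When the data export began.
        obj_i_start: Object index span start.
        obj_i_end: Object index span end.
    """

    bq_table_name: str
    bq_table_write_mode: str  # the reviewed code uses a _BqTableWriteMode enum
    timestamp: datetime
    obj_i_start: int
    obj_i_end: int
```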
Code quote:
bq_table_name: str # name of BigQuery table
bq_table_write_mode: _BqTableWriteMode # write mode for BigQuery table
timestamp: datetime # when the data export began
obj_i_start: int # object index span start
obj_i_end: int # object index span end

.gcloud/functions/load_data_into_bigquery/utils/chunk.py line 111 at r1 (raw file):
)
# "2025-01-01_00:00:00__1_1000"
file_name = file_name.removesuffix(file_name_suffix)
Do this before doing any of the splitting.
It's marginal but will save on some processing if the file is the wrong format (which we can check without needing to do any splitting).
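Concretely, the suggested reordering might look like the sketch below: validate the ".csv" suffix up front, so a malformed name is rejected before any splitting happens. The parse_chunk_file_name name, the timestamp format, and raising ValueError in place of the handle_error helper are assumptions, since the full function isn't quoted here:

```python
from datetime import datetime


def parse_chunk_file_name(file_name: str) -> tuple[datetime, int, int]:
    """Parse "<timestamp>__<start>_<end>.csv" into its components (sketch)."""
    file_name_suffix = ".csv"
    # Cheap format check first: no point splitting a name we will reject anyway.
    if not file_name.endswith(file_name_suffix):
        raise ValueError(f'File name should end with "{file_name_suffix}".')

    # "2025-01-01_00:00:00__1_1000"
    file_name = file_name.removesuffix(file_name_suffix)
    timestamp_str, span = file_name.split("__")
    obj_i_start, obj_i_end = (int(i) for i in span.split("_"))
    timestamp = datetime.strptime(timestamp_str, "%Y-%m-%d_%H:%M:%S")
    return timestamp, obj_i_start, obj_i_end
```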
Code quote:
file_name_suffix = ".csv"
if not file_name.endswith(file_name_suffix):
return handle_error(
f'File name should end with "{file_name_suffix}".'
)
# "2025-01-01_00:00:00__1_1000"
file_name = file_name.removesuffix(file_name_suffix)

.gcloud/functions/load_data_into_bigquery/utils/bigquery.py line 37 at r1 (raw file):
Returns: A flag designating whether the flag was successfully processed. False
whether the blob* was successfully processed
.gcloud/functions/load_data_into_bigquery/utils/bigquery.py line 39 at r1 (raw file):
A flag designating whether the flag was successfully processed. False will be returned if a known error occurred which makes it impossible to load the data (e.g. the BQ table does not exist) to avoid pointlessly
pointless*