
Conversation


@SKairinos SKairinos commented Nov 20, 2025


@SKairinos SKairinos linked an issue Nov 20, 2025 that may be closed by this pull request

@faucomte97 faucomte97 left a comment


@faucomte97 reviewed 18 of 18 files at r1, all commit messages.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @SKairinos)


.gcloud/functions/load_data_into_bigquery/utils/logging.py line 39 at r1 (raw file):

        log_obj.update(context)

        # If the log call passed extra={"foo": "bar"}, add that too

Is this comment up to date?


.gcloud/functions/load_data_into_bigquery/utils/storage.py line 70 at r1 (raw file):

    @processed_status.setter
    def processed_status(self, value: _ProcessedStatus):
        """Moves the blob to the failed subdirectory for manual inspection."""

Update this docstring to reflect the new approach.


.gcloud/functions/load_data_into_bigquery/utils/chunk.py line 40 at r1 (raw file):

    timestamp: datetime  # when the data export began
    obj_i_start: int  # object index span start
    obj_i_end: int  # object index span end

I would say these comments aren't needed here, as they've all been explained already in the docstring.
If you'd rather keep them here, that's fine too; I'm not strongly for or against either.

Code quote:

    bq_table_name: str  # name of BigQuery table
    bq_table_write_mode: _BqTableWriteMode  # write mode for BigQuery table
    timestamp: datetime  # when the data export began
    obj_i_start: int  # object index span start
    obj_i_end: int  # object index span end

.gcloud/functions/load_data_into_bigquery/utils/chunk.py line 111 at r1 (raw file):

            )
        # "2025-01-01_00:00:00__1_1000"
        file_name = file_name.removesuffix(file_name_suffix)

Do this before doing any of the splitting.
It's marginal, but it will save some processing if the file is in the wrong format (which we can check without needing to do any splitting).

Code quote:

        file_name_suffix = ".csv"
        if not file_name.endswith(file_name_suffix):
            return handle_error(
                f'File name should end with "{file_name_suffix}".'
            )
        # "2025-01-01_00:00:00__1_1000"
        file_name = file_name.removesuffix(file_name_suffix)
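Reordered as suggested, the parsing could look like this sketch: the cheap suffix check runs first, and the splitting only happens for well-formed names. The function name `parse_file_name`, the return shape, and raising `ValueError` in place of the original `handle_error` helper are assumptions; the suffix check and the example file name come from the code quote:

```python
def parse_file_name(file_name: str) -> tuple[str, int, int]:
    """Parse a name like "2025-01-01_00:00:00__1_1000.csv".

    Returns (timestamp_str, obj_i_start, obj_i_end).
    """
    file_name_suffix = ".csv"
    # Cheap format check first: reject wrongly named files before
    # doing any string splitting at all.
    if not file_name.endswith(file_name_suffix):
        raise ValueError(f'File name should end with "{file_name_suffix}".')

    # "2025-01-01_00:00:00__1_1000"
    file_name = file_name.removesuffix(file_name_suffix)
    timestamp_str, _, index_span = file_name.partition("__")
    start_str, _, end_str = index_span.partition("_")
    return timestamp_str, int(start_str), int(end_str)
```

Moving the `endswith` guard to the top costs nothing for valid names and skips all further work for invalid ones, which is the marginal saving the comment describes.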

.gcloud/functions/load_data_into_bigquery/utils/bigquery.py line 37 at r1 (raw file):

    Returns:
        A flag designating whether the flag was successfully processed. False

whether the blob* was successfully processed


.gcloud/functions/load_data_into_bigquery/utils/bigquery.py line 39 at r1 (raw file):

        A flag designating whether the flag was successfully processed. False
        will be returned if a known error occurred which makes it impossible to
        load the data (e.g. the BQ table does not exist) to avoid pointlessly

pointless*



Development

Successfully merging this pull request may close these issues.

Import CSVs from GCS

3 participants