#53: Create a deployment script #54

benedeki · 2023-09-18T15:02:51Z

deployment script
README.md for the deployment script

Closes #53

* deployment script

github-actions · 2023-09-18T15:04:54Z

JaCoCo code coverage report - scala 2.11.12

There is no coverage information present for the Files changed

Total Project Coverage	48.81%	🍏

github-actions · 2023-09-18T15:05:00Z

JaCoCo code coverage report - scala 2.12.17

There is no coverage information present for the Files changed

Total Project Coverage	48.96%	🍏

scripts/deployment/deploy.py

scripts/deployment/README.md

Co-authored-by: Ladislav Sulak <[email protected]>

miroslavpojer

pull
code review
run

scripts/deployment/deploy.py

scripts/deployment/README.md

lsulak · 2023-09-20T10:00:41Z

scripts/deployment/deploy.py

+        elif input_file.endswith(".sql"):
+            functions.append(read_file(schema_dir + input_file))
+        elif input_file.endswith(".ddl"):
+            tables.append(read_file(schema_dir + input_file))


A potential optimization would be to collect only the file names in this function and then load their content the same way as on the line with "\n".join(map(read_file, init_sqls)). That would be consistent and a bit more memory efficient.

More consistent, maybe.
More memory efficient why?

More memory efficient why?

Because now the content has to be held in memory from this point until it gets garbage collected, which probably happens some time after the SQL statements have run.

If the files are read only in that join call, the memory requirements are the same, but the content will probably be held in memory for a shorter time. Not a big deal in this scenario, I know.
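
A minimal sketch of the suggestion in this thread, assuming a flat schema directory: the directory scan returns only file names, and the content is read at the point where the SQL text is assembled. `collect_file_names` and `joined_sql` are hypothetical names, the body of `read_file` is an assumption about the script's helper, and the first branch of the original `if` chain (not visible in the diff above) is left out.

```python
import os
from typing import Dict, List


def read_file(file_name: str) -> str:
    """Return the whole content of a text file (assumed shape of the script's helper)."""
    with open(file_name, "r") as file:
        return file.read()


def collect_file_names(schema_dir: str) -> Dict[str, List[str]]:
    """Sort the files of one schema directory into categories without reading them."""
    names: Dict[str, List[str]] = {"functions": [], "tables": []}
    for input_file in sorted(os.listdir(schema_dir)):
        path = os.path.join(schema_dir, input_file)
        if input_file.endswith(".sql"):
            names["functions"].append(path)
        elif input_file.endswith(".ddl"):
            names["tables"].append(path)
    return names


def joined_sql(file_names: List[str]) -> str:
    """Read the files only now, right before the SQL would be executed."""
    return "\n".join(map(read_file, file_names))
```

With this shape, only the list of paths lives between the directory scan and the execution step; the file contents exist just for the duration of the `joined_sql` call and whatever consumes its result.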

scripts/deployment/deploy.py

Co-authored-by: Ladislav Sulak <[email protected]>

Co-authored-by: Ladislav Sulak <[email protected]>
Co-authored-by: miroslavpojer <[email protected]>

* renamed `deploy.py` to `deploy_pg.py`
* added example of the expected structure to the `README.md`
* included `requirements.txt`

scripts/deployment/deploy_pg.py

lsulak · 2023-09-27T13:21:49Z

scripts/deployment/deploy_pg.py

+        return file.read()
+
+
+def process_dir(directory: str, conn_config: PostgresDBConn, create_db: bool) -> None:


I still think that this function is a bit too long / doing too much. I would definitely split the processing of those 3 file categories into 3 separate functions, but I don't insist since this is a temporary script, and still a very useful one.

Another obvious improvement would be to check whether the DB name from the CLI matches the DB name in the SQL files, or to turn the SQL files into Jinja2 templates and insert all necessary values dynamically. But that would probably be too much for this script right now.

You are exactly right in everything you write.
And you also have good reasoning for why not to do it. Hopefully we will have a good deployment tool, after...
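
For reference, the DB name check mentioned above could be sketched roughly as follows. This assumes the SQL files name the database in a `CREATE DATABASE` statement, which the thread does not confirm; `database_names_in_sql` and `check_db_name` are hypothetical helpers, and the comparison is a plain string match that ignores PostgreSQL's case folding of unquoted identifiers.

```python
import re
from typing import Set

# Matches `CREATE DATABASE name` with an optionally double-quoted simple identifier.
_CREATE_DB = re.compile(
    r'\bCREATE\s+DATABASE\s+"?([A-Za-z_][A-Za-z0-9_$]*)"?',
    re.IGNORECASE,
)


def database_names_in_sql(sql: str) -> Set[str]:
    """Return every database name found in CREATE DATABASE statements."""
    return {match.group(1) for match in _CREATE_DB.finditer(sql)}


def check_db_name(cli_db_name: str, sql: str) -> None:
    """Raise ValueError when the SQL creates a database other than the CLI one."""
    mismatched = database_names_in_sql(sql) - {cli_db_name}
    if mismatched:
        raise ValueError(
            f"SQL files reference database(s) {sorted(mismatched)}, "
            f"but the command line asks for '{cli_db_name}'"
        )
```

A regex is enough for a temporary script, but it would not see names hidden behind comments, `\connect` commands, or templating.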

lsulak

code reviewed
pulled
built
ran on my machine against my local PG db with Ursa Unify DB objects

found a few issues and there is one thing I would love to see implemented (repetitive run support), otherwise I'm happy to approve

Co-authored-by: Ladislav Sulak <[email protected]>

miroslavpojer

tested

lsulak · 2023-11-12T08:22:57Z

I've tried this script on the AUL DB files, and it worked on the first run. However, repeated runs don't work: generally speaking, the files contain CREATE TABLE X and not CREATE TABLE IF NOT EXISTS X. Detecting changes on tables would be too complicated for this script and its intended (temporary) use, so we won't be able to replace / update tables with it, but that's okay. Besides, functions will be replaced, which is good.

But I feel that this script shouldn't fail and that we should be able to replace the implementation of DB functions with it. Right now the script fails saying that a table already exists. Maybe we could catch the psycopg2.errors.DuplicateTable exception? Or even better, perhaps we could skip the "Initializing the database..." step and have it driven by a CLI parameter?

So I propose to have this:

CREATE DB - no by default, but can be enabled

INIT DB - no by default, but can be enabled - this would need one more CLI parameter

Populate schemas - yes by default, cannot be disabled

    parser.add_argument(
        "--init-db",
        action="store_true",
        help="initializes the target database (runs the scripts ending with '.ddl' that don't start with '00_')"
    )
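
To make the proposal concrete, here is a rough sketch of how the three steps could be driven from the command line. Only `--init-db` comes from the snippet above; the `--create-db` flag name, the `parse_args` and `planned_steps` helpers, and the step labels are assumptions for illustration, not the script's actual interface.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the two opt-in flags; both default to False."""
    parser = argparse.ArgumentParser(description="Deploys DB objects to PostgreSQL")
    parser.add_argument(
        "--create-db",  # hypothetical flag name
        action="store_true",
        help="creates the target database",
    )
    parser.add_argument(
        "--init-db",
        action="store_true",
        help="initializes the target database",
    )
    return parser.parse_args(argv)


def planned_steps(args: argparse.Namespace) -> List[str]:
    """Translate the flags into an ordered list of deployment steps."""
    steps: List[str] = []
    if args.create_db:
        steps.append("create_db")
    if args.init_db:
        steps.append("init_db")
    steps.append("populate_schemas")  # always runs, cannot be disabled
    return steps
```

Running with no flags would then only repopulate the schemas, which is the repeated-run case described above.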

From my side, I think this is the only thing that could be implemented, otherwise LGTM, I'm even gonna use it soon in Atum Service.
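
The other option mentioned above, catching the duplicate-table error, might look roughly like this. It is a sketch, not the script's code: `run_statements_tolerantly` is a hypothetical helper, it relies only on the DB-API style `cursor()`, `commit()` and `rollback()` methods of the connection, and the exception classes to tolerate are passed in so the block does not need psycopg2 to be importable. The rollback matters because PostgreSQL aborts the current transaction after a failed statement.

```python
from typing import Any, Iterable, List, Tuple, Type


def run_statements_tolerantly(
    conn: Any,
    statements: Iterable[str],
    ignorable: Tuple[Type[BaseException], ...],
) -> List[str]:
    """Execute statements one by one and return those skipped due to ignorable errors."""
    skipped: List[str] = []
    for statement in statements:
        try:
            with conn.cursor() as cursor:
                cursor.execute(statement)
            conn.commit()
        except ignorable:
            # Reset the aborted transaction so the next statement can run.
            conn.rollback()
            skipped.append(statement)
    return skipped
```

With psycopg2 the call would presumably be `run_statements_tolerantly(conn, statements, (psycopg2.errors.DuplicateTable,))`. Note that this requires one table per statement; a multi-table file executed as a single statement would be skipped as a whole.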

lsulak · 2025-10-22T15:58:42Z

Not relevant anymore.

#53: Create a deployment script

048819b

* deployment script

benedeki added the work in progress label Sep 18, 2023

* added README.md

18fa25d

benedeki marked this pull request as ready for review September 18, 2023 15:51

benedeki removed the work in progress label Sep 18, 2023

benedeki self-assigned this Sep 18, 2023

benedeki requested review from dk1844, jakipatryk, lsulak and miroslavpojer September 18, 2023 15:52