Conversation

@khustup2 (Contributor)

🚀 🚀 Pull Request

Impact

  • Bug fix (non-breaking change which fixes expected existing functionality)
  • Enhancement/New feature (adds functionality without impacting existing logic)
  • Breaking change (fix or feature that would cause existing functionality to change)

Description

Things to be aware of

Things to worry about

Additional Context

@claude

claude bot commented Jan 22, 2026

Claude finished @khustup2's task in 2m 7s


PR Review Complete

Reviewed: Stateless PG extension changes

Tasks

  • Fetch PR details and diff
  • Analyze C++ code changes for bugs/security/correctness
  • Review test coverage
  • Post inline comments for issues found
  • Finalize review

Issues Found

I've identified 8 significant issues in this PR that need attention:

Critical (Potential Crashes):

  • Null pointer dereferences in catalog operations (3 locations)
  • Missing error handling for auto-created tables leading to catalog inconsistency

Correctness:

  • Catalog state inconsistency when table drop fails
  • Memory leak in options list processing
  • Double I/O operations in catalog version management

Performance:

  • Inefficient catalog access patterns (fetching version immediately after bumping)

All issues have been posted as inline comments with specific recommendations. Test coverage looks comprehensive, with multi-instance scenarios well exercised.

meta_schema.add("catalog_version", deeplake_core::type::generic(nd::type::scalar(nd::dtype::int64)))
    .add("updated_at", deeplake_core::type::generic(nd::type::scalar(nd::dtype::int64)))
    .set_primary_key("catalog_version");
auto meta_table = ensure_table(meta_path, std::move(meta_schema));

Potential null pointer dereference: ensure_table can return an empty shared_ptr (line 144), but there is no null check before calling ->read(). This will crash if catalog table creation fails.
Add a null check: if (!meta_table) { elog(ERROR, "Failed to create meta catalog table"); }
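
A minimal sketch of that guard, assuming ensure_table returns a std::shared_ptr as the comment above describes:

auto meta_table = ensure_table(meta_path, std::move(meta_schema));
if (!meta_table) {
    // elog(ERROR, ...) does not return, so everything below is safe.
    elog(ERROR, "Failed to create meta catalog table");
}
meta_table->read(); // reached only when meta_table is non-null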

void bump_catalog_version(const std::string& root_path, icm::string_map<> creds)
{
    auto table = open_catalog_table(root_path, k_meta_name, std::move(creds));
    int64_t version = get_catalog_version(root_path, creds);

Double catalog access in bump_catalog_version: the function opens the catalog table, then calls get_catalog_version, which opens and reads the catalog again. This causes unnecessary I/O and a potential race condition in multi-instance scenarios.
Refactor to reuse the opened table: pass the version as a parameter or return the table handle from get_catalog_version.
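
A sketch of that refactor. read_version_from and write_version_to are hypothetical helpers standing in for whatever accessors the PR actually provides; as a side effect, the single-open version also avoids the snippet's read of creds after std::move(creds):

void bump_catalog_version(const std::string& root_path, icm::string_map<> creds)
{
    // Open the catalog table exactly once and reuse the handle.
    auto table = open_catalog_table(root_path, k_meta_name, std::move(creds));
    if (!table) {
        elog(ERROR, "Failed to open catalog table");
    }
    int64_t version = read_version_from(table); // hypothetical helper
    write_version_to(table, version + 1);       // hypothetical helper
}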


void upsert_table(const std::string& root_path, icm::string_map<> creds, const table_meta& meta)
{
    auto table = open_catalog_table(root_path, k_tables_name, std::move(creds));

No null check before dereferencing: open_catalog_table can throw or return null, but the result is used without validation. If the catalog is corrupted or inaccessible, this will crash.
Add: if (!table) { elog(ERROR, "Failed to open catalog table"); }
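
The suggested guard, sketched directly against the snippet above:

auto table = open_catalog_table(root_path, k_tables_name, std::move(creds));
if (!table) {
    elog(ERROR, "Failed to open catalog table");
}
// Safe to dereference table from here on.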

meta.dataset_path = ds_path;
meta.state = "ready";
pg::dl_catalog::upsert_table(root_dir, std::move(creds), meta);
pg::dl_catalog::bump_catalog_version(root_dir, session_credentials::get_credentials());

Inefficient catalog access pattern: calls bump_catalog_version then immediately calls get_catalog_version. The bump already knows the new version (old + 1), so fetching it again wastes I/O.
Cache the new version: catalog_version_ = version + 1; instead of re-fetching.
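
One way to sketch the caching, assuming bump_catalog_version is changed to return the version it just wrote (a hypothetical signature change; catalog_version_ stands in for whatever member currently caches the version):

// Hypothetical: bump returns the new version instead of void.
int64_t new_version = pg::dl_catalog::bump_catalog_version(
    root_dir, session_credentials::get_credentials());
catalog_version_ = new_version; // no second round-trip to the catalog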

StringInfoData buf;
initStringInfo(&buf);
appendStringInfo(&buf, "CREATE SCHEMA IF NOT EXISTS %s", qschema);
SPI_execute(buf.data, false, 0);

Missing error handling: the return value of SPI_execute for CREATE SCHEMA is ignored. If schema creation fails silently, the subsequent table creation will fail with a confusing error.
Check the result: if (SPI_execute(...) < 0) { elog(ERROR, "Failed to create schema %s", qschema); }
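
A minimal sketch of that check; SPI_execute returns SPI_OK_UTILITY for a successful utility statement and a negative SPI error code otherwise:

int rc = SPI_execute(buf.data, false, 0);
if (rc != SPI_OK_UTILITY) {
    elog(ERROR, "Failed to create schema %s (SPI code %d)", qschema, rc);
}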

appendStringInfo(&buf,
                 "SELECT create_deeplake_table(%s, %s)",
                 quote_literal_cstr(qualified_name.c_str()),
                 qpath);

Silent failure on table auto-creation: if create_deeplake_table fails (line 299), only a WARNING is logged (line 637) and execution continues. This leaves the catalog in an inconsistent state: the table exists in the catalog but not in PG.
Either propagate the error or mark the table with a "failed" state in the catalog for cleanup/retry.
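
A sketch of the "mark as failed" option, assuming the call site still has the table_meta in scope (meta, root_dir, and upsert_table as used elsewhere in this PR; the exact surrounding code is not shown in the diff). A SELECT executed through SPI returns SPI_OK_SELECT on success:

int rc = SPI_execute(buf.data, false, 0);
if (rc != SPI_OK_SELECT) {
    // Record the failure instead of continuing silently, so a cleanup
    // or retry job can find this entry later.
    meta.state = "failed";
    pg::dl_catalog::upsert_table(root_dir, session_credentials::get_credentials(), meta);
    elog(WARNING, "create_deeplake_table failed; catalog entry marked \"failed\"");
    return;
}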

ListCell* lc = nullptr;
foreach (lc, stmt->options) {
    DefElem* def = (DefElem*)lfirst(lc);
    if (def->arg != nullptr && std::strcmp(def->defname, pg::dataset_path_option_name) == 0) {

Memory leak in options processing: when the dataset_path option is found and processed, it is skipped via continue (line 623) without being freed. The loop rebuilds new_options excluding this element, but the DefElem node is never deallocated.
PostgreSQL list nodes need explicit cleanup. Consider calling pfree(def) before continue, or keep the node if it should be preserved.
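
A sketch of the pfree variant, assuming the value is extracted before the node is released (dataset_path is a hypothetical local; defGetString is the stock PostgreSQL accessor and returns a pointer into def->arg, which pfree(def) does not touch):

if (def->arg != nullptr && std::strcmp(def->defname, pg::dataset_path_option_name) == 0) {
    dataset_path = defGetString(def); // read the value first
    pfree(def);                       // frees only the DefElem node itself;
                                      // def->arg stays in the memory context
    continue;
}
new_options = lappend(new_options, def);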

meta.schema_name = schema_name;
meta.table_name = simple_table_name;
meta.dataset_path = table_data.get_dataset_path().url();
meta.state = "dropping";

Catalog inconsistency on drop failure: if dataset deletion fails after the catalog entry is marked "dropping", the table becomes invisible to all instances (it is filtered out in load_tables) but is never actually deleted. No recovery mechanism exists.
Add error handling: revert the state to "ready" on failure, or implement a cleanup job for stuck "dropping" entries.
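
A sketch of the revert-on-failure option; delete_dataset and remove_table are hypothetical stand-ins for whatever the PR actually calls, and credentials are re-fetched because upsert_table takes them by value:

meta.state = "dropping";
pg::dl_catalog::upsert_table(root_dir, session_credentials::get_credentials(), meta);
try {
    delete_dataset(table_data.get_dataset_path()); // hypothetical
    pg::dl_catalog::remove_table(root_dir, meta);  // hypothetical
} catch (...) {
    // Make the table visible to load_tables again instead of
    // leaving it stuck in "dropping".
    meta.state = "ready";
    pg::dl_catalog::upsert_table(root_dir, session_credentials::get_credentials(), meta);
    throw;
}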

@sonarqubecloud

Quality Gate failed

Failed conditions
2 Security Hotspots

See analysis details on SonarQube Cloud
