Stateless PG extension #3122
Conversation
Claude finished @khustup2's task in 2m 7s. PR review complete. Reviewed: Stateless PG extension changes.

Issues found: I've identified 8 significant issues in this PR that need attention.

- Critical (potential crashes)
- Correctness
- Performance

All issues have been posted as inline comments with specific recommendations. The test coverage looks comprehensive, with multi-instance scenarios well tested.
```cpp
meta_schema.add("catalog_version", deeplake_core::type::generic(nd::type::scalar(nd::dtype::int64)))
    .add("updated_at", deeplake_core::type::generic(nd::type::scalar(nd::dtype::int64)))
    .set_primary_key("catalog_version");
auto meta_table = ensure_table(meta_path, std::move(meta_schema));
```
Potential null pointer dereference: `ensure_table` can return an empty `shared_ptr` (line 144), but there is no null check before calling `->read()`. This will crash if catalog table creation fails.

Add a null check: `if (!meta_table) { elog(ERROR, "Failed to create meta catalog table"); }`
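The pattern being suggested can be sketched in isolation. This is illustrative only: `catalog_table`, `ensure_table`, and `read_meta` below are stand-ins for the extension's types, and PostgreSQL's `elog(ERROR, ...)` (which raises a non-local error) is modeled as a thrown exception.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the extension's catalog table type.
struct catalog_table {
    std::string read() const { return "meta"; }
};

// Mirrors the reviewed behavior: returns an empty shared_ptr on failure.
std::shared_ptr<catalog_table> ensure_table(bool creation_succeeds)
{
    return creation_succeeds ? std::make_shared<catalog_table>() : nullptr;
}

std::string read_meta(bool creation_succeeds)
{
    auto meta_table = ensure_table(creation_succeeds);
    if (!meta_table) {
        // In the extension this would be:
        //   elog(ERROR, "Failed to create meta catalog table");
        throw std::runtime_error("Failed to create meta catalog table");
    }
    return meta_table->read(); // safe: checked above
}
```

The key point is that the check happens between obtaining the handle and the first dereference, so a failed creation surfaces as a clean error rather than a crash.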
```cpp
void bump_catalog_version(const std::string& root_path, icm::string_map<> creds)
{
    auto table = open_catalog_table(root_path, k_meta_name, std::move(creds));
    int64_t version = get_catalog_version(root_path, creds);
```
Double catalog access in `bump_catalog_version`: it calls `get_catalog_version` (which reads the catalog), then calls `open_catalog_table` again. This causes unnecessary I/O and a potential race condition in multi-instance scenarios.

Refactor to reuse the opened table: pass the version as a parameter, or return the table handle from `get_catalog_version`.
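One way the suggested refactor could look, sketched with illustrative names (`get_catalog_version_and_table` is not part of the actual API; the catalog open is counted so the single-open behavior is observable):

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

struct catalog_table {
    int64_t version = 0;
};

static int open_count = 0; // tracks simulated catalog I/O

std::shared_ptr<catalog_table> open_catalog_table()
{
    ++open_count;
    return std::make_shared<catalog_table>();
}

// Return both the handle and the version so the caller never re-opens.
std::pair<std::shared_ptr<catalog_table>, int64_t> get_catalog_version_and_table()
{
    auto table = open_catalog_table();
    return {table, table->version};
}

int64_t bump_catalog_version()
{
    auto [table, version] = get_catalog_version_and_table();
    table->version = version + 1; // same handle reused for the write
    return table->version;
}
```

Besides saving one round of catalog I/O, the read and the write now happen on the same handle, which narrows the window in which another instance can interleave.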
```cpp
void upsert_table(const std::string& root_path, icm::string_map<> creds, const table_meta& meta)
{
    auto table = open_catalog_table(root_path, k_tables_name, std::move(creds));
```
No null check before dereferencing: `open_catalog_table` can throw or return null, but the result is used without validation. If the catalog is corrupted or inaccessible, this will crash.

Add: `if (!table) { elog(ERROR, "Failed to open catalog table"); }`
```cpp
meta.dataset_path = ds_path;
meta.state = "ready";
pg::dl_catalog::upsert_table(root_dir, std::move(creds), meta);
pg::dl_catalog::bump_catalog_version(root_dir, session_credentials::get_credentials());
```
Inefficient catalog access pattern: this calls `bump_catalog_version` and then immediately calls `get_catalog_version`. The bump already knows the new version (old + 1), so fetching it again wastes I/O.

Cache the new version: `catalog_version_ = version + 1;` instead of re-fetching.
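A minimal sketch of the caching suggestion, assuming the caller holds a `catalog_version_` member as in the review comment. `read_catalog_version_from_storage` and `session` are hypothetical stand-ins; the read counter exists only to show that the second fetch is avoided.

```cpp
#include <cassert>
#include <cstdint>

static int catalog_reads = 0;

// Simulated storage read of the current catalog version.
int64_t read_catalog_version_from_storage()
{
    ++catalog_reads;
    return 41;
}

struct session {
    int64_t catalog_version_ = 0;

    void create_table_cached()
    {
        int64_t version = read_catalog_version_from_storage();
        // ... upsert_table / bump_catalog_version would persist version + 1 ...
        catalog_version_ = version + 1; // cache locally instead of re-fetching
    }
};
```

The bump and the cached value are derived from the same read, so the in-memory view stays consistent with what was just written without a second round trip.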
```cpp
StringInfoData buf;
initStringInfo(&buf);
appendStringInfo(&buf, "CREATE SCHEMA IF NOT EXISTS %s", qschema);
SPI_execute(buf.data, false, 0);
```
Missing error handling: the return value of `SPI_execute` for `CREATE SCHEMA` is ignored. If schema creation fails silently, the subsequent table creation will fail with a confusing error.

Check the result: `if (SPI_execute(...) < 0) { elog(ERROR, "Failed to create schema %s", qschema); }`
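The check can be sketched outside of PostgreSQL. `spi_execute_stub` below is a stand-in for `SPI_execute`, whose error codes (`SPI_ERROR_*`) are negative while success codes (e.g. `SPI_OK_UTILITY`) are positive; `elog(ERROR, ...)` is again modeled as a thrown exception.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Stand-in for SPI_execute: negative on error, positive on success.
int spi_execute_stub(bool fail)
{
    return fail ? -1 /* e.g. SPI_ERROR_ARGUMENT */ : 11 /* e.g. SPI_OK_UTILITY */;
}

void create_schema(const std::string& qschema, bool fail)
{
    if (spi_execute_stub(fail) < 0) {
        // In the extension:
        //   elog(ERROR, "Failed to create schema %s", qschema);
        throw std::runtime_error("Failed to create schema " + qschema);
    }
    // ... proceed to table creation only when the schema exists ...
}
```

Failing here, at the `CREATE SCHEMA` step, gives a direct error message instead of a confusing one from the later table creation.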
`cpp/deeplake_pg/table_storage.cpp` (outdated)
```cpp
appendStringInfo(&buf,
                 "SELECT create_deeplake_table(%s, %s)",
                 quote_literal_cstr(qualified_name.c_str()),
                 qpath);
```
Silent failure on table auto-creation: if `create_deeplake_table` fails (line 299), only a WARNING is logged (line 637) and execution continues. This leaves the catalog in an inconsistent state: the table exists in the catalog but not in PG.

Either propagate the error, or mark the table with a "failed" state in the catalog for cleanup/retry.
```cpp
ListCell* lc = nullptr;
foreach (lc, stmt->options) {
    DefElem* def = (DefElem*)lfirst(lc);
    if (def->arg != nullptr && std::strcmp(def->defname, pg::dataset_path_option_name) == 0) {
```
Memory leak in options processing: when the `dataset_path` option is found and processed, it is skipped via `continue` (line 623) without being freed. The loop rebuilds `new_options` excluding this element, but the `DefElem` node is never deallocated.

PostgreSQL list nodes need explicit cleanup. Consider calling `pfree(def)` before `continue`, or keeping the node if it should be preserved.
```cpp
meta.schema_name = schema_name;
meta.table_name = simple_table_name;
meta.dataset_path = table_data.get_dataset_path().url();
meta.state = "dropping";
```
Catalog inconsistency on drop failure: if dataset deletion fails after the catalog entry is marked "dropping", the table becomes invisible to all instances (it is filtered out in `load_tables`) but isn't actually deleted. No recovery mechanism exists.

Add error handling: revert to "ready" on failure, or implement a cleanup job for stuck "dropping" entries.
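The revert-on-failure option can be sketched as a small state machine. `table_meta`, `delete_dataset`, and `drop_table` below are illustrative stand-ins, not the extension's actual functions; the point is the try/catch that restores visibility.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

struct table_meta {
    std::string state = "ready";
};

// Simulated dataset deletion that may fail.
void delete_dataset(bool succeed)
{
    if (!succeed) throw std::runtime_error("dataset deletion failed");
}

void drop_table(table_meta& meta, bool deletion_succeeds)
{
    meta.state = "dropping"; // hides the table from load_tables on all instances
    try {
        delete_dataset(deletion_succeeds);
        meta.state = "dropped";
    } catch (...) {
        meta.state = "ready"; // revert: table stays visible and the drop can be retried
        throw;
    }
}
```

If the process can die between the two state writes, the revert alone is not enough, which is why the comment also suggests a cleanup job that sweeps entries stuck in "dropping".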