Out of Sync Explorer Tables + Crashes #581
Comments
Interesting that the error is coming from a "LegacyCursorResult" object. I wonder if SQLAlchemy has changed. |
That's what it sounds like to me. Or ODC has changed... @Ariana-B or @SpacemanPaul: I know that there's been work on updating to SQLAlchemy 2 in ODC. Is that only in ODC 1.9+, or was it in 1.8 as well? |
I think this might even be a SQLAlchemy 1.4 thing (which I believe explorer does use): https://stackoverflow.com/a/58660606
I agree that force-refresh is the correct move here — it should replace everything Explorer knows about the product. Will look into the crash, but to answer the other Qs:
The issue is that ODC's tables let us efficiently see recently added/updated/archived rows in the ODC schema, but not deleted rows (as they disappear completely). To handle deletions efficiently (i.e., incrementally, without scanning the whole table), I believe we'd need to extend ODC's tables to add a "deleted row log" table that's populated via a trigger. Explorer could do that if we really want it. The "ideal" way to do it at the moment without code changes is to archive the datasets first, run |
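To make the "deleted row log" idea concrete, here is a minimal sketch of a trigger-populated log table. It uses SQLite through Python's standard library purely so it is self-contained; ODC itself runs on PostgreSQL, where the trigger syntax differs. The table, column, and function names (`dataset`, `dataset_deleted_log`, `deleted_after`) are hypothetical and do not reflect the real ODC or Explorer schema.

```python
import sqlite3

# Hypothetical schema: "dataset" and "dataset_deleted_log" are illustrative
# names only, not the real ODC tables.
SCHEMA = """
CREATE TABLE dataset (
    id TEXT PRIMARY KEY,
    product TEXT NOT NULL
);
CREATE TABLE dataset_deleted_log (
    seq INTEGER PRIMARY KEY AUTOINCREMENT,
    dataset_id TEXT NOT NULL,
    product TEXT NOT NULL
);
CREATE TRIGGER log_dataset_delete
AFTER DELETE ON dataset
BEGIN
    -- Record every deleted row so consumers can find it later.
    INSERT INTO dataset_deleted_log (dataset_id, product)
    VALUES (OLD.id, OLD.product);
END;
"""


def make_demo_db():
    """Create an in-memory database with two example datasets."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO dataset (id, product) VALUES (?, ?)",
        [("a", "ls8"), ("b", "s2")],
    )
    conn.commit()
    return conn


def deleted_after(conn, last_seen_seq):
    """Return log rows newer than last_seen_seq, without scanning 'dataset'."""
    cur = conn.execute(
        "SELECT seq, dataset_id, product FROM dataset_deleted_log "
        "WHERE seq > ? ORDER BY seq",
        (last_seen_seq,),
    )
    return cur.fetchall()
```

A consumer would remember the highest `seq` it has processed and ask only for newer rows on each run, which is what makes the approach incremental.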
It does, though it's a broad brush to use. The raw product count is one part of ~three. A more thorough query could be put on the audit pages of Explorer. Checking, for instance, the spatial table counts (which should have deletions when a dataset is removed/archived), the combined year counts, and how recent the summary is compared to most-recently-changed datasets in underlying ODC. |
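As a rough illustration of the raw product count check only (not the spatial, yearly, or recency checks), the following sketch compares a stored per-product count against the live dataset count. It again uses SQLite from the standard library so it runs on its own. The tables `product_summary` and `dataset`, and the choice to exclude archived datasets from the live count, are assumptions made for this example rather than Explorer's actual schema or rules.

```python
import sqlite3

# Hypothetical tables standing in for Explorer's summary and ODC's datasets.
SCHEMA = """
CREATE TABLE product_summary (
    product TEXT PRIMARY KEY,
    dataset_count INTEGER NOT NULL
);
CREATE TABLE dataset (
    id TEXT PRIMARY KEY,
    product TEXT NOT NULL,
    archived INTEGER NOT NULL DEFAULT 0
);
"""

MISMATCH_QUERY = """
SELECT s.product, s.dataset_count AS summarised, COUNT(d.id) AS actual
FROM product_summary AS s
LEFT JOIN dataset AS d
    ON d.product = s.product AND d.archived = 0
GROUP BY s.product, s.dataset_count
HAVING s.dataset_count != COUNT(d.id)
ORDER BY s.product
"""


def make_demo_db():
    """Build an in-memory database where one product is out of sync."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO product_summary (product, dataset_count) VALUES (?, ?)",
        [("ls8", 2), ("s2", 1)],
    )
    conn.executemany(
        "INSERT INTO dataset (id, product, archived) VALUES (?, ?, ?)",
        [("a", "ls8", 0), ("b", "ls8", 1), ("c", "s2", 0)],
    )
    conn.commit()
    return conn


def mismatched_products(conn):
    """Return (product, summarised, actual) for products whose counts differ."""
    return conn.execute(MISMATCH_QUERY).fetchall()
```

In the demo data, `ls8` is summarised as having two datasets but only one is active, so it is the only product reported.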
SQLAlchemy 2 is only in datacube-1.9. 1.8 should still be on 1.4. But the later 1.4.x versions have a lot of changes to help users prepare for the migration to 2.0, so it might be one of those. |
I don't know if it is related to this, but after fixing the missing lxml_html_clean dependency (#584), many tests on current HEAD of the develop branch also crash with a TypeError on LegacyCursorResult:
Had a second look, #580 (comment) isn't obvious to me. |
Good catch @pjonsson! That auto-modernise does break it. Here's a simple reproduction that throws the same exception:

    from sqlalchemy import create_engine, select, func
    from collections import Counter

    engine = create_engine('sqlite:///:memory:')
    result2 = Counter(
        dict(
            engine.execute(
                select(
                    [
                        select(1).label("item"),
                        select(2).label("count")
                    ]
                )
            )
        )
    )

But that's not the only bad part of the auto-modernise: it's also changing the They have different semantics as the
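One plausible reading of why the `dict(...)` form fails: Python's `dict()` treats any argument that has a `keys()` method as a mapping and then subscripts it, whereas a dict comprehension simply iterates over rows. The sketch below shows that difference with a stand-in class, so it needs no database. `FakeResult` is hypothetical and only mimics the relevant shape (iterable, has `keys()`, not subscriptable). That the pre-modernisation code used a comprehension is an assumption, not something stated in this thread.

```python
from collections import Counter


class FakeResult:
    """Hypothetical stand-in for a cursor result object.

    It is iterable over (item, count) rows and exposes keys() for the
    column names, but it does not support result[key] lookups.
    """

    def __init__(self, columns, rows):
        self._columns = list(columns)
        self._rows = list(rows)

    def keys(self):
        return list(self._columns)

    def __iter__(self):
        return iter(self._rows)


def counts_via_dict(result):
    # dict() sees keys() and tries result[key], which raises TypeError here.
    return Counter(dict(result))


def counts_via_comprehension(result):
    # The comprehension only iterates, unpacking each row into a pair.
    return Counter({item: count for item, count in result})
```

With a result holding the rows `("a", 1)` and `("b", 2)`, the comprehension form builds the expected counter, while the `dict(...)` form raises `TypeError` because the object is not subscriptable.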
This might be a good excuse to move from shed over to ruff, as we have with other repositories. Formatting is fine and we don't need the "modernisation" aspects. |
This reverts commit e0aa69f due to issues detected in opendatacube#581
Based on issues in opendatacube#581
Reverting those changes in #587 |
I also just switched to Ruff from black/isort/flake8 and have good experiences so far. I don't know anything about shed, but I know Ruff lint has a modernisation category that can upgrade various things because it was eager to switch from |
While managing the DEA ODC Databases, we've had to do some manual deletions and updates to the agdc database tables, and I suspect this has made Explorer unhappy.
We've used a 'QA' SQL query in the past, which compares the number of datasets Explorer knows about vs the number of datasets ODC knows about.
Which shows a bunch of mismatched Products:
In an attempt to update this, I thought I could run cubedash-gen --force-refresh on each of the out-of-sync products; however, when I try this, I get a crash:
This did seem to succeed in getting the dataset counts in sync, but I'm not sure what state it has left the database in.
Questions
Versions
Docker Image:
docker.io/opendatacube/explorer@sha256:026ac9f06ad41abdf62dd5c8dc3206f6595bc25564f029bbcec574e64806d317
Explorer version: 2.10.2.dev20+ge0aa69f3
Core version: 1.8.17
Python: 3.10.12
SQLAlchemy Version: 1.4.52