Out of Sync Explorer Tables + Crashes #581

omad · 2024-04-05T04:50:26Z

While managing the DEA ODC Databases, we've had to do some manual deletions and updates to the agdc database tables, and I suspect this has made Explorer unhappy.

We've used a 'QA' SQL query in the past, which compares the number of datasets explorer knows about vs the number of datasets ODC knows about.

SELECT *, (OUTPUT_QUERY.agdc_ds_count - OUTPUT_QUERY.explorer_ds_count) as diff
FROM (
        select count(ds.id) as agdc_ds_count, p.dataset_count as explorer_ds_count, dt.id, p.name 
        from cubedash.product p, agdc.dataset_type dt, agdc.dataset ds 
        where 
            dt.name = p.name 
        and dt.id = ds.dataset_type_ref 
        and ds.archived is NULL 
        GROUP BY p.name, dt.id, p.dataset_count) AS OUTPUT_QUERY
WHERE NOT OUTPUT_QUERY.agdc_ds_count = OUTPUT_QUERY.explorer_ds_count;

Which shows a bunch of mismatched Products:

 agdc_ds_count | explorer_ds_count |  id  |            name            |  diff   
---------------+-------------------+------+----------------------------+---------
        419777 |            514922 |  479 | ga_ls_fc_3                 |  -95145
        415373 |            417815 |  478 | ga_ls_wo_3                 |   -2442
         39227 |             88696 | 1271 | ga_ls_wo_fq_cyear_3        |  -49469
        503454 |            503453 | 1798 | ga_s2am_ard_3              |       1
         65063 |            184167 | 1006 | ga_s2_ba_provisional_3     | -119104
          8639 |             10050 | 2953 | ga_s2ls_intertidal_cyear_3 |   -1411
         51381 |            204223 | 1600 | ga_s2_wo_provisional_3     | -152842

In an attempt to update this, I thought i could run cubedash-gen --force-refresh on each of the out of sync products, however when I try this, i get a crash:

cubedash-gen --force-refresh ga_s2ls_intertidal_cyear_3
Updating 1 products for LocalConfig<loaded_from=defaults, environment='default', config={'db_hostname': 'dea-dev-db', 'db_database': 'odc', 'db_username': 'explorer_writer', 'db_password': '***', 'index_driver': 'default', 'db_connection_timeout': '60'}>
Generating product summaries...
ga_s2ls_intertidal_cyear_3 refresh
2024-04-05T04:27:02.288888Z [warning  ] spatial_deletion_full_scan     after_date=None product_name=ga_s2ls_intertidal_cyear_3
2024-04-05T04:27:02.326885Z [warning  ] spatial_update.recreating_everything after_date=None product_name=ga_s2ls_intertidal_cyear_3
2024-04-05T04:33:35.828477Z [warning  ] forcing_refresh                product_name=ga_s2ls_intertidal_cyear_3
	 ga_s2ls_intertidal_cyear_3 1900
2024-04-05T04:33:35.996976Z [error    ] product.error                  product=ga_s2ls_intertidal_cyear_3
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/cubedash/generate.py", line 127, in generate_report
    result, updated_summary = store.refresh(
  File "/usr/local/lib/python3.10/dist-packages/cubedash/summary/_stores.py", line 1528, in refresh
    self._recalculate_period(
  File "/usr/local/lib/python3.10/dist-packages/cubedash/summary/_stores.py", line 1374, in _recalculate_period
    summary = self._summariser.calculate_summary(
  File "/usr/local/lib/python3.10/dist-packages/cubedash/summary/_summarise.py", line 170, in calculate_summary
    dict(
TypeError: 'LegacyCursorResult' object is not subscriptable
ga_s2ls_intertidal_cyear_3 error
finished. 1 error

This did seem to succeed in getting the dataset counts in sync, but I'm not sure what state it has left the database in.

Questions

Is this crash happening after the Product is entirely updated, or is it only partially done?
Does this SQL query make any sense?
Should a 'QA' feature like this be included in Explorer?
What can we do to make Explorer resilient to Database modifications? Ideally we need to fix ODC to support mutating the database, but there's going to be a few assumptions about datasets not being allowed to be deleted that will have flow on impacts.

Versions

Docker Image: docker.io/opendatacube/explorer@sha256:026ac9f06ad41abdf62dd5c8dc3206f6595bc25564f029bbcec574e64806d317
Explorer version: 2.10.2.dev20+ge0aa69f3
Core version: 1.8.17
Python: 3.10.12
SQLAlchemy Version: 1.4.52

The text was updated successfully, but these errors were encountered:

jeremyh · 2024-04-05T05:07:39Z

Interesting that the error is coming from a "LegacyCursorResult" object, I wonder if sqlalchemy has changed

omad · 2024-04-05T05:14:01Z

That's what it sounds like to me. Or ODC has changed...

@Ariana-B or @SpacemanPaul: I know that there's been work on updating to SQLAlchemy 2 in ODC, Is that only in ODC 1.9+ or was it in 1.8 as well?

Ariana-B · 2024-04-05T05:25:52Z

I think this might even be a SQLAlchemy 1.4 thing (which I believe explorer does use): https://stackoverflow.com/a/58660606
And from the docs;

Within the scope of the 1.x series of SQLAlchemy, Core SQL results in version 1.4 return an instance of LegacyCursorResult which takes the place of the CursorResult class used for the 1.3 series and previously. This object returns rows as LegacyRow objects, which maintains Python mapping (i.e. dictionary) like behaviors upon the object itself. Going forward, the Row._mapping attribute should be used for dictionary behaviors.

jeremyh · 2024-04-05T05:29:43Z

I agree that force-refresh is the correct move here — it should replace everything Explorer knows about the product. Will look into the crash, but to answer the other Qs:

What can we do to make Explorer resilient to Database modifications? Ideally we need to fix ODC to support mutating the database, but there's going to be a few assumptions about datasets not being allowed to be deleted that will have flow on impacts.

The issue is that ODC's tables let us efficiently see recently added/updated/archived rows in the ODC schema, but not deleted rows (as they disappear completely).

To handle deletions efficiently (ie, incrementally, without scanning the whole table) I believe we'd need to extend ODC's tables to add a "deleted row log" table that's populated via a trigger. Explorer could do that if we really want it.

The "ideal" way to do it at the moment without code changes is to archive the datasets first, run cubedash-gen so it will see that those datasets as recently archived, and then we're free to delete them from ODC. This was never seen as a large priority as we used to say prod datasets would only ever be archived (so the provenance record remains), but that's idealistic :)

jeremyh · 2024-04-05T05:37:02Z

Does this SQL query make any sense?
Should a 'QA' feature like this be included in Explorer?

It does, though it's a broad brush to use. The raw product count is one part of ~three. A more thorough query could be put on the audit pages of Explorer. Checking, for instance, the spatial table counts (which should have deletions when a dataset is removed/archived), the combined year counts, and how recent the summary is compared to most-recently-changed datasets in underlying ODC.

SpacemanPaul · 2024-04-05T05:57:22Z

SQLAlchemy 2 is only in datacube-1.9.

1.8 should still be on 1.4. But the later versions 1.4.x versions have a lot of changes to help users prepare for the migration to 2.0 - might be one of those.

pjonsson · 2024-04-06T11:09:20Z

I don't know if it is related to this, but after fixing the missing lxml_html_clean dependency (#584), many tests on current HEAD of the develop branch also crash with a TypeError on LegacyCursorResult:

...
            region_counts = Counter(
>               dict(
                    self._engine.execute(
                        select(
                            [
                                DATASET_SPATIAL.c.region_code.label("region_code"),
                                func.count(),
                            ]
                        )
                        .where(where_clause)
                        .group_by("region_code")
                    )
                )
            )
E           TypeError: 'LegacyCursorResult' object is not subscriptable
cubedash/summary/_summarise.py:170: TypeError

pjonsson · 2024-04-06T11:20:05Z

The LegacyCursorResult type errors were first seen on develop after @omad merged #580 on 18 March, and it looks like commit e0aa69f introduced the error (commit a6c1079 just before was also red, but the integration tests passed for that).

pjonsson · 2024-04-06T22:08:20Z

Had a second look, #580 (comment) isn't obvious to me.

jeremyh · 2024-04-07T23:17:28Z

The LegacyCursorResult type errors were first seen on develop after @omad merged #580 on 18 March, and it looks like commit e0aa69f introduced the error (commit a6c1079 just before was also red, but the integration tests passed for that).

Good catch @pjonsson!

That auto-modernise does break it. Here's a simple reproduction that throws the same exception:

    from sqlalchemy import create_engine, select, func
    from collections import Counter

    engine = create_engine('sqlite:///:memory:')
    result2 = Counter(
        dict(
            engine.execute(
                select(
                    [
                        select(1).label("item"),
                        select(2).label("count")
                    ]
                )
            )
        )
    )

But that's not the only bad part of the auto-modernise — it's also changing the == None lines in sqlalchemy queries to is None.

They have different semantics as the is operator could never be overridden to be part of sqlalchemy's API (but __eq__ could). This could drastically change some logic.

(On newer sqlalchemy there's also a more-explict .is_(None)), which didn't exist when Explorer was first written.)

jeremyh · 2024-04-07T23:21:39Z

This might be a good excuse to move from shed over to ruff, as we have with other repositories. Formatting is fine and we don't need the "modernisation" aspects.

This reverts commit e0aa69f to do issues detected in opendatacube#581

Based on issues in opendatacube#581

jeremyh · 2024-04-08T00:15:25Z

Reverting those changes in #587

This reverts commit e0aa69f to do issues detected in #581

Based on issues in #581

pjonsson · 2024-04-08T07:44:55Z

I also just switched to Ruff from black/isort/flake8 and have good experiences so far.

I don't know anything about shed, but I know Ruff lint has a modernisation category that can upgrade various things because it was eager to switch from Optional[t] to t | None on Python 3.10.

omad assigned jeremyh Apr 5, 2024

omad added bug help wanted question labels Apr 5, 2024

omad mentioned this issue Apr 5, 2024

Unexpected Database Load running cubedash-gen #582

Open

jeremyh added a commit to jeremyh/datacube-explorer that referenced this issue Apr 8, 2024

Revert "[pre-commit.ci] auto fixes from pre-commit.com hooks"

fae3121

This reverts commit e0aa69f to do issues detected in opendatacube#581

jeremyh added a commit to jeremyh/datacube-explorer that referenced this issue Apr 8, 2024

Replace shed with ruff

ed70628

Based on issues in opendatacube#581

jeremyh mentioned this issue Apr 8, 2024

Revert previous auto-modernise commits #587

Merged

omad pushed a commit that referenced this issue Apr 8, 2024

Revert "[pre-commit.ci] auto fixes from pre-commit.com hooks"

5de3182

This reverts commit e0aa69f to do issues detected in #581

omad pushed a commit that referenced this issue Apr 8, 2024

Replace shed with ruff

b3a4798

Based on issues in #581

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Out of Sync Explorer Tables + Crashes #581

Out of Sync Explorer Tables + Crashes #581

omad commented Apr 5, 2024

jeremyh commented Apr 5, 2024

omad commented Apr 5, 2024

Ariana-B commented Apr 5, 2024 •

edited

Loading

jeremyh commented Apr 5, 2024 •

edited

Loading

jeremyh commented Apr 5, 2024 •

edited

Loading

SpacemanPaul commented Apr 5, 2024

pjonsson commented Apr 6, 2024

pjonsson commented Apr 6, 2024

pjonsson commented Apr 6, 2024

jeremyh commented Apr 7, 2024 •

edited

Loading

jeremyh commented Apr 7, 2024 •

edited

Loading

jeremyh commented Apr 8, 2024

pjonsson commented Apr 8, 2024

Out of Sync Explorer Tables + Crashes #581

Out of Sync Explorer Tables + Crashes #581

Comments

omad commented Apr 5, 2024

Questions

Versions

jeremyh commented Apr 5, 2024

omad commented Apr 5, 2024

Ariana-B commented Apr 5, 2024 • edited Loading

jeremyh commented Apr 5, 2024 • edited Loading

jeremyh commented Apr 5, 2024 • edited Loading

SpacemanPaul commented Apr 5, 2024

pjonsson commented Apr 6, 2024

pjonsson commented Apr 6, 2024

pjonsson commented Apr 6, 2024

jeremyh commented Apr 7, 2024 • edited Loading

jeremyh commented Apr 7, 2024 • edited Loading

jeremyh commented Apr 8, 2024

pjonsson commented Apr 8, 2024

Ariana-B commented Apr 5, 2024 •

edited

Loading

jeremyh commented Apr 5, 2024 •

edited

Loading

jeremyh commented Apr 5, 2024 •

edited

Loading

jeremyh commented Apr 7, 2024 •

edited

Loading

jeremyh commented Apr 7, 2024 •

edited

Loading