setup test case for dataset common fields #432

Merged
merged 6 commits into from
Sep 19, 2022

Conversation

pindge
Collaborator

@pindge pindge commented Aug 31, 2022

Fixes: #433 and #431

Scope

  • rework the dataset sampling logic, since tablesample can return zero rows and is much more expensive to run than order by random
  • add test cases for different sampling percentages
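
To illustrate the first point, here is a minimal sketch in plain Python rather than SQL. It assumes a toy model of PostgreSQL's SYSTEM sampling method (whole pages are picked with the given probability and the WHERE filter is applied afterwards) and compares it with filtering first and then taking a random-order limit. The table layout and the names `build_pages`, `system_sample`, `random_order_limit` and `empty_rate` are hypothetical and are not taken from cubedash.

```python
import random


def build_pages(n_pages, rows_per_page, small_product_pages):
    """Build a toy table as a list of pages; each row is (id, dataset_type_ref).

    Rows of the small product (dataset_type_ref = 1) are clustered on the
    first `small_product_pages` pages; every other row belongs to product 2.
    """
    pages = []
    next_id = 0
    for page_no in range(n_pages):
        type_ref = 1 if page_no < small_product_pages else 2
        page = []
        for _ in range(rows_per_page):
            page.append((next_id, type_ref))
            next_id += 1
        pages.append(page)
    return pages


def system_sample(pages, fraction, type_ref, rng):
    """Pick whole pages with probability `fraction`, then apply the filter."""
    sampled = [row for page in pages if rng.random() < fraction for row in page]
    return [row for row in sampled if row[1] == type_ref]


def random_order_limit(pages, type_ref, limit, rng):
    """Apply the filter first, then shuffle and keep at most `limit` rows."""
    matching = [row for page in pages for row in page if row[1] == type_ref]
    rng.shuffle(matching)
    return matching[:limit]


def empty_rate(pages, fraction, type_ref, trials, rng):
    """Share of block-level samples that contain no row of `type_ref`."""
    empty = sum(
        1 for _ in range(trials)
        if not system_sample(pages, fraction, type_ref, rng)
    )
    return empty / trials


if __name__ == "__main__":
    rng = random.Random(0)
    pages = build_pages(n_pages=500, rows_per_page=50, small_product_pages=2)
    # 0.002 here corresponds to "tablesample system (0.2)", i.e. 0.2 percent.
    print("share of empty block samples:", empty_rate(pages, 0.002, 1, 200, rng))
    print("rows from random-order sample:", len(random_order_limit(pages, 1, 46, rng)))
```

In this model a product that occupies only a few pages is usually missed entirely by the block-level sample, while the filter-then-shuffle approach returns rows whenever the product has any.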

UI Changes

  • add a heading for common fields, with hover-over information
    [two screenshots]

Performance Comparison

odc=> select count(*) from agdc.dataset tablesample system (0.2) where dataset_type_ref = 1;
 count 
-------
    46
(1 row)

odc=> EXPLAIN select count(*) from agdc.dataset tablesample system (0.2) where dataset_type_ref = 1;
                             QUERY PLAN                              
---------------------------------------------------------------------
 Aggregate  (cost=29393.86..29393.87 rows=1 width=8)
   ->  Sample Scan on dataset  (cost=0.00..29393.81 rows=19 width=0)
         Sampling: system ('0.2'::real)
         Filter: (dataset_type_ref = 1)
(4 rows)

odc=> EXPLAIN select id from agdc.dataset where dataset_type_ref = 1 order by random() limit 46;
                                                     QUERY PLAN                                                     
--------------------------------------------------------------------------------------------------------------------
 Limit  (cost=24471.98..24472.09 rows=46 width=24)
   ->  Sort  (cost=24471.98..24495.13 rows=9260 width=24)
         Sort Key: (random())
         ->  Index Scan using ix_agdc_dataset_dataset_type_ref on dataset  (cost=0.43..24169.94 rows=9260 width=24)
               Index Cond: (dataset_type_ref = 1)
(5 rows)
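
For a quick read of the two plans, the sketch below pulls the top-level planner cost out of each EXPLAIN output and computes their ratio. The helper name `top_level_cost` is hypothetical. The numbers are planner estimates in arbitrary cost units, not measured run times; EXPLAIN ANALYZE would be needed to compare actual timings.

```python
import re

# Matches the "cost=<startup>..<total>" part of a plan node.
COST_RE = re.compile(r"cost=(\d+(?:\.\d+)?)\.\.(\d+(?:\.\d+)?)")


def top_level_cost(plan_text):
    """Return (startup_cost, total_cost) of the first node in an EXPLAIN plan."""
    match = COST_RE.search(plan_text)
    if match is None:
        raise ValueError("no cost estimate found in plan text")
    return float(match.group(1)), float(match.group(2))


# Top lines of the two plans quoted above.
TABLESAMPLE_PLAN = "Aggregate  (cost=29393.86..29393.87 rows=1 width=8)"
ORDER_BY_RANDOM_PLAN = "Limit  (cost=24471.98..24472.09 rows=46 width=24)"

if __name__ == "__main__":
    _, sample_total = top_level_cost(TABLESAMPLE_PLAN)
    _, random_total = top_level_cost(ORDER_BY_RANDOM_PLAN)
    ratio = random_total / sample_total
    print(f"order by random / tablesample estimated cost: {ratio:.3f}")
```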

@codecov
codecov bot commented Aug 31, 2022

Codecov Report

Base: 86.95% // Head: 86.95% // Increases project coverage by +0.00% 🎉

Coverage data is based on head (ebc1223) compared to base (f6b233c).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@           Coverage Diff            @@
##           develop     #432   +/-   ##
========================================
  Coverage    86.95%   86.95%           
========================================
  Files           25       25           
  Lines         3165     3158    -7     
========================================
- Hits          2752     2746    -6     
+ Misses         413      412    -1     
Impacted Files Coverage Δ
cubedash/summary/_stores.py 92.15% <100.00%> (+0.08%) ⬆️


@pindge pindge requested a review from jeremyh August 31, 2022 03:16
@pindge pindge added UI/UX User interface issues enhancement labels Aug 31, 2022
@pindge pindge marked this pull request as ready for review August 31, 2022 03:20
@pindge pindge requested review from omad and SpacemanPaul September 18, 2022 23:37
Collaborator

@jeremyh jeremyh left a comment

Looks good, thanks Pin!

We could also squash those two queries back into one query to keep latency smaller, but this is a good improvement as-is.

@pindge pindge merged commit 7079998 into develop Sep 19, 2022
@delete-merged-branch delete-merged-branch bot deleted the maturity branch September 19, 2022 02:35
Development

Successfully merging this pull request may close these issues.

Incorrect logic for sample datasets used for fixed_metadata
2 participants