We have a PostgreSQL table of about 28 million facts with a financial_year column. Users can use the babbage API to essentially query the distinct financial_year values, of which there are only about 10.
PostgreSQL is quite naive when executing SELECT DISTINCT financial_year FROM table: it runs a full table scan even though financial_year has an index, which takes 60+ seconds. This seems to be a known limitation of PostgreSQL.
How have others solved this problem? Do we split out the financial_year data (and all the other dimensions of a fact) into a separate table?
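For reference, the usual workaround for this on PostgreSQL (outside of restructuring the schema) is to emulate a "loose index scan" with a recursive CTE, so each step jumps to the next distinct value via the index instead of reading every row. A minimal sketch, assuming the table is called `facts` (not the query babbage actually generates):

```sql
-- Walk the index one distinct financial_year at a time instead of scanning
-- all ~28M rows. Each recursive step fetches the smallest value strictly
-- greater than the previous one.
WITH RECURSIVE distinct_years AS (
    (SELECT financial_year FROM facts ORDER BY financial_year LIMIT 1)
    UNION ALL
    SELECT (SELECT f.financial_year
            FROM facts f
            WHERE f.financial_year > d.financial_year
            ORDER BY f.financial_year
            LIMIT 1)
    FROM distinct_years d
    WHERE d.financial_year IS NOT NULL
)
SELECT financial_year
FROM distinct_years
WHERE financial_year IS NOT NULL;
```

With only ~10 distinct values this does roughly 10 index lookups rather than a 28-million-row scan, so it typically returns in milliseconds.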
@longhotsummer yes, I hit that brick wall myself a couple of months ago.
I partially solved it by changing the DISTINCT queries to GROUP BY ones - it turns out they're much quicker (though still inexcusably slow).
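For illustration, the rewrite has roughly this shape (a sketch against the same hypothetical `facts` table as above, not the exact SQL babbage emits):

```sql
-- Aggregating on the column instead of using DISTINCT lets the planner
-- pick a HashAggregate, which in practice tends to beat the plan chosen
-- for DISTINCT on large tables.
SELECT financial_year
FROM facts
GROUP BY financial_year;
```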
The changes are in this pull request: #22
I haven't merged it yet because tests were failing for some reason and I didn't have time to understand why (but I'm still using that branch for the time being).
If you want to review it, and maybe try to work out why those two tests are failing, that would be great. Either way, you can take that branch and see if it works for you as well.