Fix #328 - Dont do limit pushdown during parallel execution #330


Open · wants to merge 2 commits into main
Conversation

@noahisaksen (Contributor) commented May 29, 2025

#328

LIMIT pushdown was added to this extension in #313.

It pushes LIMIT down to Postgres and removes it from the DuckDB query plan.
The problem is that the extension can execute in parallel, so one query gets turned into n Postgres queries. Pushing LIMIT down into each of them then falsifies the result.

A test is added that reproduces this by forcing parallelism with set pg_pages_per_task=1.

I added a check on max_threads so that LIMIT is NOT pushed down if it is anything other than 1.
To the best of my knowledge, this should prevent LIMIT from being pushed down in any case where we would have parallel queries. (There could be a better way to check for parallelism that I am not aware of.)

Previously

set pg_pages_per_task=1;
FROM s.large_tbl LIMIT 1;

COPY (SELECT "i" FROM "public"."large_tbl" WHERE ctid BETWEEN '(0,0)'::tid AND '(1,0)'::tid LIMIT 1) TO STDOUT (FORMAT "binary");

COPY (SELECT "i" FROM "public"."large_tbl" WHERE ctid BETWEEN '(1,0)'::tid AND '(2,0)'::tid LIMIT 1) TO STDOUT (FORMAT "binary");

COPY (SELECT "i" FROM "public"."large_tbl" WHERE ctid BETWEEN '(2,0)'::tid AND '(3,0)'::tid LIMIT 1) TO STDOUT (FORMAT "binary");

COPY (SELECT "i" FROM "public"."large_tbl" WHERE ctid BETWEEN '(3,0)'::tid AND '(4,0)'::tid LIMIT 1) TO STDOUT (FORMAT "binary");

COPY (SELECT "i" FROM "public"."large_tbl" WHERE ctid BETWEEN '(4,0)'::tid AND '(5,0)'::tid LIMIT 1) TO STDOUT (FORMAT "binary");

┌────────────┐
│     i      │
│   int64    │
├────────────┤
│      70286 │
│      72772 │
│      74128 │
│      75032 │
│      76388 │
│      79100 │

AFTER (NO LIMIT DURING PARALLEL)

set pg_pages_per_task=1;
FROM s.large_tbl LIMIT 1;

COPY (SELECT "i" FROM "public"."large_tbl" WHERE ctid BETWEEN '(0,0)'::tid AND '(1,0)'::tid) TO STDOUT (FORMAT "binary");

COPY (SELECT "i" FROM "public"."large_tbl" WHERE ctid BETWEEN '(1,0)'::tid AND '(2,0)'::tid) TO STDOUT (FORMAT "binary");

COPY (SELECT "i" FROM "public"."large_tbl" WHERE ctid BETWEEN '(2,0)'::tid AND '(3,0)'::tid) TO STDOUT (FORMAT "binary");

COPY (SELECT "i" FROM "public"."large_tbl" WHERE ctid BETWEEN '(3,0)'::tid AND '(4,0)'::tid) TO STDOUT (FORMAT "binary");

COPY (SELECT "i" FROM "public"."large_tbl" WHERE ctid BETWEEN '(4,0)'::tid AND '(5,0)'::tid) TO STDOUT (FORMAT "binary");

┌───────┐
│   i   │
│ int64 │
├───────┤
│   0   │
└───────┘
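The falsification above can be reproduced in miniature. This is a plain-Python sketch (not the extension's C++; all names here are hypothetical): each of the n per-chunk queries honors its own pushed-down LIMIT, so their union can contain up to n × LIMIT rows, while the global LIMIT was already removed from the DuckDB plan.

```python
# Stand-in for public.large_tbl, split into per-task chunks the way
# pg_pages_per_task splits the table into ctid ranges.
rows = list(range(10))
chunks = [rows[i:i + 2] for i in range(0, len(rows), 2)]  # 5 parallel "queries"

def scan_with_pushdown(chunk, limit):
    # Buggy behavior: LIMIT is pushed into every per-chunk Postgres query.
    return chunk[:limit]

def scan_without_pushdown(chunk):
    # Fixed behavior: the chunk is scanned in full during parallel execution.
    return chunk

limit = 1

# Buggy: per-chunk LIMIT, and the global LIMIT was removed from the plan,
# so the union of the chunk results is returned as-is.
buggy = [r for c in chunks for r in scan_with_pushdown(c, limit)]

# Fixed: no pushdown while parallel; DuckDB applies the LIMIT afterwards.
fixed = [r for c in chunks for r in scan_without_pushdown(c)][:limit]

print(len(buggy))  # 5 rows instead of 1
print(len(fixed))  # 1
```

This mirrors the traces above: with pushdown, each of the five ctid-range COPY queries contributes a row; without it, DuckDB's own LIMIT yields exactly one.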
