Thanks. We are in the process of swapping out the HTTP backend; at the moment you should be able to experiment with it via:
Then run your query like this:
CALL enable_logging('HTTP');
SELECT * FROM read_parquet('https://overturemaps-us-west-2.s3.us-west-2.amazonaws.com/release/2025-09-24.0/theme=places/type=place/part-00000-e9dcc321-c94a-41a9-a475-8123482f8fab-c000.zstd.parquet') LIMIT 10;
FROM duckdb_logs_parsed('HTTP') SELECT request.type, count(*) GROUP BY request.type;
This should return:
(both native and Wasm), while in the second case I do run into a weird error; I will have a look. Note that after some more rounds of testing,
The second example is a bit more tricky: it's sort of solved by #2108, but it gets a CORS error on the listing request. Do you by any chance have control over the bucket?
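For context on why a listing request can fail in the browser while the same query works natively, here is a rough Python sketch of the check a browser applies to a cross-origin response. It is a simplification that only covers non-credentialed requests, and the origin and header values are made up for illustration; they are not taken from the bucket in question.

```python
def cors_allows(origin, response_headers):
    """Simplified browser-side CORS check for a non-credentialed request."""
    # HTTP header names are case-insensitive, so normalise them first.
    headers = {name.lower(): value for name, value in response_headers.items()}
    allowed = headers.get("access-control-allow-origin")
    # The response is readable only if the header is "*" or echoes the origin.
    return allowed == "*" or allowed == origin


# Hypothetical values: a page origin and two made-up responses.
page_origin = "https://example.org"
file_response = {"Access-Control-Allow-Origin": "*"}
listing_response = {"Content-Type": "application/xml"}  # no CORS header at all

file_ok = cors_allows(page_origin, file_response)
listing_ok = cors_allows(page_origin, listing_response)
```

A native client never performs this check, which would be consistent with the query working outside the browser. If the bucket owner can change its CORS configuration, allowing the page's origin on listing requests is the usual remedy.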
The main branch of DuckDB supports partial reads of remote Parquet files, so it only needs to download part of the data instead of the entire file. DuckDB WASM, on the other hand, seems to need to download the entire file for any query.
Example query:
Running this example query downloads the entire 1 GB file on DuckDB WASM, but far less on, for example, the Python DuckDB instance.
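For readers unfamiliar with why partial reads are possible at all: a Parquet file ends with its metadata footer, followed by a 4-byte little-endian footer length and the magic bytes `PAR1`, so a reader can locate the metadata with two small ranged reads. The Python sketch below illustrates that idea on a synthetic in-memory blob. The `read_range` callback is a hypothetical stand-in for an HTTP request with a `Range` header, and this is not how DuckDB itself is implemented.

```python
import struct

MAGIC = b"PAR1"


def read_footer(read_range, file_size):
    """Fetch only the Parquet footer using two ranged reads.

    read_range(offset, length) is a hypothetical callback standing in for
    an HTTP request with a `Range: bytes=...` header.
    """
    # Last 8 bytes: 4-byte little-endian footer length, then the magic.
    tail = read_range(file_size - 8, 8)
    if tail[4:] != MAGIC:
        raise ValueError("not a Parquet file")
    (footer_len,) = struct.unpack("<I", tail[:4])
    # Second read: the footer itself, which sits right before the tail.
    return read_range(file_size - 8 - footer_len, footer_len)


def make_counting_reader(blob):
    """Wrap an in-memory blob and record how many bytes were requested."""
    stats = {"requests": 0, "bytes": 0}

    def read_range(offset, length):
        stats["requests"] += 1
        stats["bytes"] += length
        return blob[offset:offset + length]

    return read_range, stats


# Synthetic stand-in for a remote file (not real Parquet data):
# magic, filler "data pages", fake footer, footer length, magic.
footer = b"fake-footer-metadata"
blob = MAGIC + b"\x00" * 10_000 + footer + struct.pack("<I", len(footer)) + MAGIC

read_range, stats = make_counting_reader(blob)
result = read_footer(read_range, len(blob))
```

Here the reader touches 28 bytes of a roughly 10 KB blob. A real reader would then use the footer to request only the column chunks a query needs, which is the behaviour that appears to be missing in the WASM case described above.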
Side note
As a side note, DuckDB WASM also does not support reading split Parquet files from S3:
This query fails with this error on DuckDB WASM, while it works with normal DuckDB.