Use metadata cache when fetching metadata in geoparquet reader #250

@paleolimbot

In #237 we updated to DataFusion 50, which includes some new features for caching metadata. Our GeoParquet reader does one or two extra metadata fetches that would benefit greatly from this cache! However, some of the code we had to copy over doesn't pipe the global cache through, so those fetches bypass it.

Upstream:

DFParquetMetadata::new(store.as_ref(), object)
    .with_metadata_size_hint(self.metadata_size_hint())
    .with_decryption_properties(file_decryption_properties.as_ref())
    .with_file_metadata_cache(Some(Arc::clone(&file_metadata_cache)))
    .with_coerce_int96(coerce_int96)
    .fetch_schema_with_location()
    .await?;

Our code:

DFParquetMetadata::new(store.as_ref(), object)
    .with_metadata_size_hint(self.inner().metadata_size_hint())
    .fetch_metadata()
    .await
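
To illustrate why the missing cache argument matters, here is a minimal, std-only sketch of the same builder shape. Everything in it (`CountingStore`, `MetadataCache`, `MetadataFetcher`, `with_cache`) is a hypothetical stand-in and not DataFusion's API; it is synchronous and only models the behaviour. The assumption is that a fetcher built without a cache goes back to the store on every call, while one that is handed the shared cache reads the footer once and reuses it.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for an object store that counts footer reads.
#[derive(Default)]
pub struct CountingStore {
    reads: AtomicUsize,
}

impl CountingStore {
    /// Pretend to read and parse a Parquet footer; each call is one "remote" read.
    fn read_footer(&self, path: &str) -> String {
        self.reads.fetch_add(1, Ordering::SeqCst);
        format!("metadata for {path}")
    }

    /// Number of footer reads performed so far.
    pub fn reads(&self) -> usize {
        self.reads.load(Ordering::SeqCst)
    }
}

/// Hypothetical stand-in for a shared metadata cache keyed by file path.
#[derive(Default)]
pub struct MetadataCache {
    entries: Mutex<HashMap<String, Arc<String>>>,
}

/// Hypothetical builder that mirrors the shape of the calls quoted above.
pub struct MetadataFetcher<'a> {
    store: &'a CountingStore,
    path: &'a str,
    cache: Option<Arc<MetadataCache>>,
}

impl<'a> MetadataFetcher<'a> {
    pub fn new(store: &'a CountingStore, path: &'a str) -> Self {
        Self { store, path, cache: None }
    }

    /// Opt in to the shared cache; without this call every fetch hits the store.
    pub fn with_cache(mut self, cache: Option<Arc<MetadataCache>>) -> Self {
        self.cache = cache;
        self
    }

    pub fn fetch_metadata(&self) -> Arc<String> {
        // No cache wired through: always go back to the store.
        let Some(cache) = &self.cache else {
            return Arc::new(self.store.read_footer(self.path));
        };
        let mut entries = cache.entries.lock().unwrap();
        if let Some(hit) = entries.get(self.path) {
            return Arc::clone(hit);
        }
        // Cache miss: read once, then remember the result for later fetches.
        let fetched = Arc::new(self.store.read_footer(self.path));
        entries.insert(self.path.to_string(), Arc::clone(&fetched));
        fetched
    }
}
```

In this toy model, two fetches of the same path without `with_cache` cost two store reads, while two fetches that share one `MetadataCache` cost a single read. The change proposed here would be the analogous one-line addition of `.with_file_metadata_cache(...)` to our call, assuming the shared cache handle is available at that point in the reader.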
