Use the Arrow C and PyCapsule data interfaces to share data with Python #98
See:
This basically allows us to do the following:

- Build the data in C++ as a `std::shared_ptr<arrow::Table>`.
- Use the C data interface (`<arrow/c/abi.h>`) and the C bridge (`<arrow/c/bridge.h>`) to export data generated with the C++ API (which is not ABI stable) through the C ABI (which is stable).
- Wrap the exported structs in `PyCapsule`s. We don't even need to depend on nanoarrow to do this, as all we need to do is return Python objects that expose the `PyCapsule`s through the `__arrow_c_schema__` or `__arrow_c_stream__` attributes.
- Construct a `pyarrow.Table` given the `ArrowSchema` and `ArrowArrayStream` objects returned by the C data interface.

As a bonus, we no longer have to depend on pyarrow (we only need it when fetching pixels or bins as DataFrames).
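For illustration, here is a rough sketch of what the Python-facing side of this could look like. It is not the code from this PR: `ArrowStreamHolder` and `to_pyarrow_table` are hypothetical names, and `producer` stands in for whatever object the C++ extension returns. The sketch also assumes the table is built with `pa.table()`, which may differ from what the PR actually calls.

```python
class ArrowStreamHolder:
    """Hypothetical wrapper around a producer of Arrow C stream capsules."""

    def __init__(self, producer):
        # `producer` stands in for the object returned by the C++ extension.
        # It must expose __arrow_c_stream__ and hand back a PyCapsule.
        if not hasattr(producer, "__arrow_c_stream__"):
            raise TypeError("producer does not expose __arrow_c_stream__")
        self._producer = producer

    def __arrow_c_stream__(self, requested_schema=None):
        # Forward the request: the capsule itself is created on the C++ side.
        return self._producer.__arrow_c_stream__(requested_schema)


def to_pyarrow_table(obj):
    """Build a pyarrow.Table from any object exposing __arrow_c_stream__."""
    # Imported lazily so pyarrow is only needed when a Table is requested.
    import pyarrow as pa

    return pa.table(obj)
```

Because the import lives inside `to_pyarrow_table`, code paths that never ask for a table or DataFrame would not need pyarrow to be installed at all.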
The only downside of this approach is that pyarrow versions <16 are not supported (because `pyarrow.Table.from_arrays()` does not recognize objects exposing the `__arrow_c_stream__` attribute).

Closes #91.
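One way to surface the pyarrow >=16 requirement to users is a small guard that runs before any conversion to a table. The following is only a sketch under assumptions: `check_pyarrow`, `parse_major`, and `MIN_PYARROW_MAJOR` are hypothetical names, not part of this PR, and the error messages are placeholders.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed minimum, taken from the limitation described above.
MIN_PYARROW_MAJOR = 16


def parse_major(version_string):
    """Return the major component of a version string such as '16.1.0'."""
    return int(version_string.split(".", 1)[0])


def check_pyarrow(installed=None):
    """Raise ImportError if pyarrow is missing or older than the minimum."""
    if installed is None:
        try:
            installed = version("pyarrow")
        except PackageNotFoundError:
            raise ImportError("pyarrow is required for this operation") from None
    if parse_major(installed) < MIN_PYARROW_MAJOR:
        raise ImportError(
            f"pyarrow>={MIN_PYARROW_MAJOR} is required, found {installed}"
        )
    return installed
```

Passing a version string explicitly (for example `check_pyarrow("15.0.2")`) makes the guard easy to exercise without changing the installed environment.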