Allowing `extract_array` calls to use pre-indexed grid information? #91
Oh, this doesn't have anything to do with the HDF5 representations; the discussion in tatami-inc/beachmat#20 only involves in-memory matrices. You can see some timings there comparing subsetting via Matrix's `[` against the precomputed index.

(The precomputed index basically involves running through the set of non-zero values and their row indices, figuring out the start/stop positions for each row chunk in each column, and then using that information to extract the row chunk without having to do a search of some kind, which is what I assume Matrix's CHOLMOD code is doing.)

My implementation of this precomputed index gives me a 10-fold speedup over simply iterating across row-wise blocks of the in-memory `dgCMatrix`.
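To make the parenthetical concrete, here is a rough R sketch of such an index. `precompute_row_block_index` is a made-up name, and a real implementation would walk `x@i` in a single pass in C/C++ rather than calling `findInterval()` per column:

```r
library(Matrix)

# Sketch: for each column of a dgCMatrix, record the 0-based offset into
# x@i / x@x at which each row block starts. Row indices within a column are
# sorted, so this information is cheap to collect.
precompute_row_block_index <- function(x, block_starts) {
    p <- x@p        # 0-based offsets delimiting each column's run in x@i / x@x
    i <- x@i + 1L   # 1-based row indices of the non-zero values
    index <- matrix(0L, nrow = length(block_starts), ncol = ncol(x))
    for (j in seq_len(ncol(x))) {
        rows_j <- i[p[j] + seq_len(p[j + 1L] - p[j])]  # sorted rows in column j
        # count the entries of column j that fall before each block's first row:
        index[, j] <- p[j] + findInterval(block_starts - 1L, rows_j)
    }
    index  # index[b, j] = offset of the first entry of row block b in column j
}

x <- rsparsematrix(10000, 2000, density = 0.05)
idx <- precompute_row_block_index(x, block_starts = seq(1L, nrow(x), by = 1000L))
```

Row block `b` of column `j` then occupies `x@x[(idx[b, j] + 1):idx[b + 1, j]]` (with `x@p[j + 1]` as the endpoint for the last block), so each extraction is a direct slice with no search at all.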
I see, nice! What other sparse in-memory DelayedArray backends could benefit from something like that?

This kind of precomputed index seems pretty heavy (almost half the size of `x` itself). Furthermore, it seems that the size of

More seriously, if we had a matrix-like container that supports chunking where the chunks are small `dgCMatrix` objects, you could put

Note that one way to construct something analogous to

BTW I'm still not clear why we need to do block processing at all on in-memory objects. What's the use case?
Responding in reverse order:
The strongest use case is when the operation discards sparsity. For example, I might be computing ranks, in which case I want to process by blocks so that the peak memory usage is never too high (case in point: aertslab/AUCell#16); see the sketch below. Another use case is that I want to split the job across multiple workers. This requires me to split the matrix up somehow, and the block processing machinery is a natural and easy way to do that; just set `BPPARAM=`.
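To make the ranking example concrete, here is a minimal sketch using DelayedArray's standard `blockApply()` and `colAutoGrid()`; the column-wise ranks, the grid choice and `as.sparse=FALSE` are my assumptions, not something prescribed by the thread:

```r
library(DelayedArray)
library(Matrix)

y <- DelayedArray(rsparsematrix(20000, 1000, density = 0.01))

# Ranking discards sparsity, so materializing all ranks at once would cost a
# full dense matrix. Processing one column block at a time caps peak memory at
# a single dense block; add BPPARAM= to fan the blocks out to workers instead.
blocks <- blockApply(y, function(block) apply(block, 2L, rank),
                     grid = colAutoGrid(y), as.sparse = FALSE)
ranks <- do.call(cbind, blocks)
```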
Indeed, this is actually what I do in
This makes sense, but it is less appealing if it requires the user to do the transformation explicitly. In the context of the rest of the analysis, many other functions are optimized for `dgCMatrix` inputs.
Yes, unfortunately. I suspect this example is slightly pathological in that the 1% density of non-zero values is unusually low (because I couldn't be bothered waiting for `rsparsematrix()` to generate anything denser). That said, I suspect it is possible to halve the size of the index with some extra work: only the start positions need to be stored for each viewport, given that the end positions of one viewport can be derived from the start positions of the next.
In practice, it may be more convenient to store both the start and end positions for each viewport, rather than having to jump to the next viewport to retrieve the end positions. This is still okay because the end positions of one viewport are literally the same vector as the start positions of the next viewport, allowing us to exploit R's copy-on-write (COW) semantics to avoid an actual copy.
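As a toy illustration of the aliasing (not beachmat's actual data structure), `viewports[[k]]$end` and `viewports[[k + 1]]$start` below are the same R vector:

```r
# 'cuts' holds made-up per-column offsets at each boundary between consecutive
# viewports. Every viewport stores both a start and an end vector for
# convenience, but assigning the same vector to two slots does not copy it,
# thanks to R's copy-on-write semantics.
cuts <- list(c(0L, 3L, 7L), c(2L, 5L, 9L), c(4L, 8L, 12L))
viewports <- lapply(seq_len(length(cuts) - 1L), function(k) {
    list(start = cuts[[k]], end = cuts[[k + 1L]])  # shares, doesn't duplicate
})
```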
Potentially, everything that I work on that uses an in-memory sparse matrix as the seed. The most obvious ones are
Was fiddling with something else and happened across some timings that might be helpful to distill the problem:

```r
library(Matrix)
x <- rsparsematrix(50000, 20000, density=0.1)

system.time(colSums(x))
##    user  system elapsed
##   0.262   0.001   0.264

y <- DelayedArray(x)
system.time(beachmat::rowBlockApply(y, colSums, grid=TRUE)) # final reduce is negligible
##    user  system elapsed
##   2.468   0.371   2.849

system.time(colSums(y))
##    user  system elapsed
##  22.413   2.089  25.034

system.time(blockApply(y, colSums, grid=rowAutoGrid(x))) # for a strict like-for-like
##    user  system elapsed
##  35.076   3.796  39.089
```

The differences from
The recent conversation in theislab/zellkonverter#34 reminded me of some work I did in tatami-inc/beachmat#20. Briefly, the idea was to speed up row-based block processing of `dgCMatrix` objects by performing a single pass over the non-zero elements beforehand to identify the start and end of each row block in each column. This avoids the need for costly per-column binary searches when each row block is extracted in the usual way, and gives a ~10-fold speed-up in row-based processing of `dgCMatrix`es.

Now I'm wondering whether this approach can be generalized somehow so that other DelayedArray backends can benefit. Perhaps functions like `rowAutoGrid()` can decorate the grid object with extra information that allows `extract_array` to efficiently obtain the necessary bits and pieces, if a suitable object like a `dgCMatrix` is passed?

Happy to give this - or other ideas - a crack with a PR if there is some interest.
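For what it's worth, here is a hypothetical sketch of the shape this could take, reusing the `precompute_row_block_index()` sketch from an earlier comment. The `indexed_grid` bundle and `extract_block_values()` helper are made up for illustration; how `extract_array()` itself would get access to the grid's decoration is exactly the open design question:

```r
library(DelayedArray)
library(Matrix)

x <- rsparsematrix(10000, 2000, density = 0.05)
spacing <- 1000L

# Bundle a row-wise grid with its precomputed index; in a real implementation
# the decoration would live on the grid object itself.
indexed_grid <- list(
    grid = RegularArrayGrid(dim(x), spacings = c(spacing, ncol(x))),
    precomputed = precompute_row_block_index(
        x, block_starts = seq(1L, nrow(x), by = spacing))
)

# Pre-indexed extraction of one row block of one column: a direct slice of
# x@x, with no binary search. 'b' is the row-block number, 'j' the column.
extract_block_values <- function(x, ig, b, j) {
    idx <- ig$precomputed
    from <- idx[b, j] + 1L
    to <- if (b < nrow(idx)) idx[b + 1L, j] else x@p[j + 1L]
    if (from > to) numeric(0) else x@x[from:to]
}

extract_block_values(x, indexed_grid, b = 2L, j = 7L)  # rows 1001-2000, column 7
```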