Description
The recent conversation in theislab/zellkonverter#34 reminded me of some work I did in tatami-inc/beachmat#20. Briefly, the idea was to speed up row-based block processing of dgCMatrix objects by performing a single pass over the non-zero elements beforehand to identify the start and end of each row block in each column. This avoids the need for costly per-column binary searches when each row block is extracted in the usual way, and gives a ~10-fold speed-up in row-based processing of dgCMatrixes.
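To make the idea concrete, here is a rough sketch of the single-pass precomputation. It is written in Python for illustration only and is not the beachmat code. It assumes the usual compressed sparse column layout (column pointers, row indices sorted within each column, values), and the function names `precompute_block_offsets` and `extract_block` are hypothetical.

```python
def precompute_block_offsets(indptr, indices, block_starts, nrow):
    """Single pass over the non-zeros of a CSC matrix.

    Assumes block_starts is sorted, begins at 0, and that row indices are
    sorted within each column (as in a dgCMatrix). Returns offsets such that
    indices[offsets[j][b]:offsets[j][b + 1]] holds the non-zero rows of
    column j that fall in row block b.
    """
    bounds = list(block_starts) + [nrow]  # row boundaries of the blocks
    nblocks = len(block_starts)
    offsets = []
    for j in range(len(indptr) - 1):
        pos, end = indptr[j], indptr[j + 1]
        col = [pos]
        for b in range(1, nblocks + 1):
            # Advance linearly to the first non-zero at or beyond the next boundary.
            while pos < end and indices[pos] < bounds[b]:
                pos += 1
            col.append(pos)
        offsets.append(col)
    return offsets


def extract_block(indices, data, offsets, b):
    """Return (row, column, value) triplets for row block b, with no searching."""
    out = []
    for j, col in enumerate(offsets):
        for k in range(col[b], col[b + 1]):
            out.append((indices[k], j, data[k]))
    return out


# Hypothetical 6 x 3 matrix, split into row blocks [0, 3) and [3, 6).
indptr = [0, 2, 5, 6]
indices = [0, 4, 2, 3, 5, 1]
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
offsets = precompute_block_offsets(indptr, indices, [0, 3], 6)
first_block = extract_block(indices, data, offsets, 0)
second_block = extract_block(indices, data, offsets, 1)
```

The precomputation touches each non-zero once, after which each block extraction only reads the slices it needs. The trade-off is storing one extra offset per column per block.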
Now I'm wondering whether this approach can be generalized somehow so that other DelayedArray backends can benefit. Perhaps functions like rowAutoGrid() can decorate the grid object with extra information that allows extract_array to efficiently obtain the necessary bits and pieces, if a suitable object like a dgCMatrix is passed?
Happy to give this - or other ideas - a crack with a PR if there is some interest.