Implement native interleave for ListView#9558
Conversation
a1131f2 to
16f2287
Compare
|
Do you mind comparing this to the fallthrough performance of #9562 ? |
Oh for sure, thanks for reminding me! |
|
Updated the description with results now. It's not looking like a win..! |
|
I would say let's merge the fallthrough and iterate on this version. I'm sure there are several possibilities for optimizations. |
|
FWIW I pushed up the branch I've had marinating locally for a month or two in case it's helpful: main...polarsignals:arrow-rs:asubiotto/lvinterleave. I believe the benchmarks showed a slight regression for interleaves of small lists, but overall the perf was an improvement. I'm not able to take a closer look right now, but sharing in case it's helpful. |
Thank you! |
6e8412e to
b18c3a6
Compare
|
Updated implementation and results now! |
|
Sorry for dropping the ball on this! I think this is going in the right direction but when I pulled this in to try it out I realized that it doesn't work very well when interleaving listviews with a high number of shraed elements (i.e. offset/size windows are overlapping). I think we can get the best of both worlds by computing a heuristic: i.e. how many values are referenced vs how many values are in the backing array to figure out if we want to do per-row copies as this pr does or just a full concat of the backing slice which preserves overlapping encodings and can be much cheaper in the end. Here is a commit that implements that on top of this PR with a benchmark: polarsignals@7cb6880 There is a slight perf hit vs your branch to compute the heuristic (summing referenced sizes), but I think it's worth it in the grand scheme of things: |
Could you perhaps make a PR that adds this case as a benchmark? |
Ref #9558 (comment) --------- Co-authored-by: Alfonso Subiotto Marques <alfonso.subiotto@polarsignals.com>
c9f789d to
216213b
Compare
|
@vegarsti what do you think of integrating the hybrid strategy onto this PR? polarsignals@7cb6880 |
Thanks for reminding me! I think it's a good idea - I will do it! Will ping you in a bit |
The previous implementation copies child elements per-row via MutableArrayData::extend(), destroying overlapping offset/size sharing. This matters for merge sorts over data like stacktrace profiles where many rows reference the same backing elements. Use a hybrid strategy: compare concat cost (sum of source backing array lengths) vs per-row cost (sum of selected row sizes). When sharing exists (per-row cost > concat cost), concatenate backing arrays and adjust offsets to preserve sharing. Otherwise use the per-row copy. Adds overlapping ListView test. Signed-off-by: Alfonso Subiotto Marques <alfonso.subiotto@polarsignals.com>
216213b to
a5efffd
Compare
|
I think this is a great heuristic, I've included the change. Maybe @alamb you could trigger the benchmarks on this PR? 🙏🏻 |
|
run benchmark interleave_kernels |
|
🤖 Arrow criterion benchmark running (GKE) | trigger CPU Details (lscpu)Comparing list-view-interleave-native (a5efffd) to b114241 (merge-base) diff File an issue against this benchmark runner |
|
One thing I’m concerned about is using a blanket mutable array extend for any listview element type including complex nested types (e.g. we have list views of structs with dicts and more list views). The implication is that we’re using the equivalent of a “fallback” interleave path for all of these element types. Maybe it’s fine to merge this as is and revisit later, but worth keeping in mind. |
|
🤖 Arrow criterion benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usagebase (merge-base)
branch
File an issue against this benchmark runner |
This PR adds a native implementation of interleave for the ListView type which uses a good heuristic thanks to @asubiotto, either
The latter is best when there is sharing of elements.
Closes #9342.