Implement native interleave for ListView #9558

Open
vegarsti wants to merge 2 commits into apache:main from vegarsti:list-view-interleave-native

@vegarsti
Contributor

@vegarsti vegarsti commented Mar 15, 2026

This PR adds a native implementation of interleave for the ListView type. Thanks to @asubiotto, it uses a heuristic to choose between two strategies:

  1. copy each row's elements and put them all into a new flat array, or
  2. concatenate all source value arrays (and adjust offsets).

The latter is best when rows share elements, since it preserves the sharing instead of duplicating it.
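As a rough illustration of the two strategies, here is a minimal sketch on a simplified model. The `ListView` struct and both function names below are hypothetical stand-ins, not the arrow-rs types or the code in this PR; nulls and nested element types are ignored, and `indices` is assumed to be a list of `(source, row)` pairs.

```rust
/// Hypothetical, simplified stand-in for a ListView array: row `i` is the
/// window `values[offsets[i]..offsets[i] + sizes[i]]`. Windows may overlap.
#[derive(Debug, Clone, PartialEq)]
struct ListView {
    values: Vec<i64>,
    offsets: Vec<usize>,
    sizes: Vec<usize>,
}

impl ListView {
    /// Elements of row `i`.
    fn row(&self, i: usize) -> &[i64] {
        &self.values[self.offsets[i]..self.offsets[i] + self.sizes[i]]
    }
}

/// Strategy 1: copy each selected row's elements into a new flat array.
/// Shared elements are duplicated once per row that references them.
fn interleave_per_row(sources: &[&ListView], indices: &[(usize, usize)]) -> ListView {
    let mut out = ListView { values: vec![], offsets: vec![], sizes: vec![] };
    for &(src, row) in indices {
        let elems = sources[src].row(row);
        out.offsets.push(out.values.len());
        out.sizes.push(elems.len());
        out.values.extend_from_slice(elems);
    }
    out
}

/// Strategy 2: concatenate every source's values once, then shift each
/// selected row's offset by the position where its source starts.
/// Overlapping windows stay overlapping in the output.
fn interleave_concat(sources: &[&ListView], indices: &[(usize, usize)]) -> ListView {
    let mut values = Vec::new();
    let mut base = Vec::with_capacity(sources.len());
    for s in sources {
        base.push(values.len());
        values.extend_from_slice(&s.values);
    }
    let offsets: Vec<usize> = indices
        .iter()
        .map(|&(src, row)| base[src] + sources[src].offsets[row])
        .collect();
    let sizes: Vec<usize> = indices
        .iter()
        .map(|&(src, row)| sources[src].sizes[row])
        .collect();
    ListView { values, offsets, sizes }
}
```

Both functions produce the same logical rows; they differ only in how many values end up in the output backing array.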

Closes #9342.

@github-actions (bot) added the arrow label (Changes to the arrow crate) on Mar 15, 2026
@vegarsti force-pushed the list-view-interleave-native branch from a1131f2 to 16f2287 on March 15, 2026 15:42
@brancz
Contributor

brancz commented Mar 16, 2026

Do you mind comparing this to the fallthrough performance of #9562?

@vegarsti
Contributor Author

> Do you mind comparing this to the fallthrough performance of #9562?

Oh for sure, thanks for reminding me!

@vegarsti
Contributor Author

vegarsti commented Mar 16, 2026

Updated the description with results now. It's not looking like a win..!

@brancz
Contributor

brancz commented Mar 16, 2026

I would say let's merge the fallthrough and iterate on this version. I'm sure there are several possibilities for optimizations.

@asubiotto
Contributor

FWIW I pushed up the branch I've had marinating locally for a month or two in case it's helpful: main...polarsignals:arrow-rs:asubiotto/lvinterleave. I believe the benchmarks showed a slight regression for interleaves of small lists, but overall the perf was an improvement. I'm not able to take a closer look right now, but sharing in case it's helpful.

@vegarsti
Contributor Author

> FWIW I pushed up the branch I've had marinating locally for a month or two in case it's helpful: main...polarsignals:arrow-rs:asubiotto/lvinterleave. I believe the benchmarks showed a slight regression for interleaves of small lists, but overall the perf was an improvement. I'm not able to take a closer look right now, but sharing in case it's helpful.

Thank you!

@vegarsti force-pushed the list-view-interleave-native branch from 6e8412e to b18c3a6 on March 19, 2026 12:48
@vegarsti
Contributor Author

Updated implementation and results now!

@asubiotto
Contributor

Sorry for dropping the ball on this! I think this is going in the right direction, but when I pulled it in to try it out I realized that it doesn't work very well when interleaving list views with a high number of shared elements (i.e. overlapping offset/size windows). I think we can get the best of both worlds by computing a heuristic: compare how many values are referenced with how many values are in the backing array, and use that to decide between per-row copies, as this PR does, and a full concat of the backing slice, which preserves overlapping encodings and can be much cheaper in the end. Here is a commit that implements that on top of this PR with a benchmark: polarsignals@7cb6880

There is a slight perf hit vs your branch to compute the heuristic (summing referenced sizes), but I think it's worth it in the grand scheme of things:

interleave list_view<i64>(0.1,0.1,20) 100 [0..100, 100..230, 450..1000]
                        time:   [2.8553 µs 2.8661 µs 2.8777 µs]
                        change: [−39.782% −39.429% −39.058%] (p = 0.00 < 0.05)
interleave list_view<i64>(0.1,0.1,20) 400 [0..100, 100..230, 450..1000]
                        time:   [8.2066 µs 8.2440 µs 8.2838 µs]
                        change: [−41.803% −41.460% −41.123%] (p = 0.00 < 0.05)
interleave list_view<i64>(0.1,0.1,20) 1024 [0..100, 100..230, 450..1000]
                        time:   [22.291 µs 22.424 µs 22.580 µs]
                        change: [−39.377% −38.883% −38.328%] (p = 0.00 < 0.05)
interleave list_view<i64>(0.1,0.1,20) 1024 [0..100, 100..230, 450..1000, 0..1000]
                        time:   [21.744 µs 21.868 µs 22.003 µs]
                        change: [−40.397% −39.966% −39.515%] (p = 0.00 < 0.05)
interleave list_view<i64>(0.0,0.0,20) 100 [0..100, 100..230, 450..1000]
                        time:   [1.7642 µs 1.7770 µs 1.7937 µs]
                        change: [−36.120% −35.680% −35.219%] (p = 0.00 < 0.05)
interleave list_view<i64>(0.0,0.0,20) 400 [0..100, 100..230, 450..1000]
                        time:   [5.1748 µs 5.2052 µs 5.2392 µs]
                        change: [−29.000% −28.500% −28.000%] (p = 0.00 < 0.05)
interleave list_view<i64>(0.0,0.0,20) 1024 [0..100, 100..230, 450..1000]
                        time:   [12.528 µs 12.631 µs 12.741 µs]
                        change: [−29.293% −28.511% −27.801%] (p = 0.00 < 0.05)
interleave list_view<i64>(0.0,0.0,20) 1024 [0..100, 100..230, 450..1000, 0..1000]
                        time:   [13.009 µs 13.098 µs 13.192 µs]
                        change: [−26.841% −26.193% −25.550%] (p = 0.00 < 0.05)
interleave list_view_overlapping<i64>(80x,20) 100 [0..100, 100..230, 450..1000]
                        time:   [1.8046 µs 1.8271 µs 1.8547 µs]
                        change: [−44.472% −43.715% −42.935%] (p = 0.00 < 0.05)
interleave list_view_overlapping<i64>(80x,20) 400 [0..100, 100..230, 450..1000]
                        time:   [3.3896 µs 3.4283 µs 3.4689 µs]
                        change: [−66.387% −66.073% −65.773%] (p = 0.00 < 0.05)
interleave list_view_overlapping<i64>(80x,20) 1024 [0..100, 100..230, 450..1000]
                        time:   [5.7748 µs 5.8133 µs 5.8482 µs]
                        change: [−72.104% −71.879% −71.641%] (p = 0.00 < 0.05)
interleave list_view_overlapping<i64>(80x,20) 1024 [0..100, 100..230, 450..1000, 0..1000]
                        time:   [6.2896 µs 6.3539 µs 6.4243 µs]
                        change: [−69.684% −69.377% −69.083%] (p = 0.00 < 0.05)

@alamb
Contributor

alamb commented Apr 16, 2026

> Sorry for dropping the ball on this! I think this is going in the right direction but when I pulled this in to try it out I realized that it doesn't work very well when interleaving listviews with a high number of shared elements (i.e. offset/size windows are overlapping).

Could you perhaps make a PR that adds this case as a benchmark?

Comment thread arrow/benches/interleave_kernels.rs
alamb pushed a commit that referenced this pull request Apr 16, 2026
Ref #9558 (comment)

---------

Co-authored-by: Alfonso Subiotto Marques <alfonso.subiotto@polarsignals.com>
@vegarsti force-pushed the list-view-interleave-native branch 2 times, most recently from c9f789d to 216213b, on April 17, 2026 06:06
@asubiotto
Contributor

@vegarsti what do you think of integrating the hybrid strategy onto this PR? polarsignals@7cb6880

@vegarsti
Contributor Author

> @vegarsti what do you think of integrating the hybrid strategy onto this PR? polarsignals@7cb6880

Thanks for reminding me! I think it's a good idea - I will do it! Will ping you in a bit

The previous implementation copies child elements per-row via MutableArrayData::extend(), destroying overlapping offset/size sharing. This matters for merge sorts over data like stacktrace profiles where many rows reference the same backing elements.

Use a hybrid strategy: compare concat cost (sum of source backing array lengths) vs per-row cost (sum of selected row sizes). When sharing exists (per-row cost > concat cost), concatenate backing arrays and adjust offsets to preserve sharing. Otherwise use the per-row copy.
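The cost comparison described above can be sketched as follows. This is only an illustration under stated assumptions: `Strategy` and `choose_strategy` are hypothetical names, and the inputs are plain lengths rather than arrow-rs arrays.

```rust
/// Hypothetical enum naming the two paths described in the commit message.
#[derive(Debug, PartialEq)]
enum Strategy {
    /// Copy each selected row's elements individually.
    PerRowCopy,
    /// Concatenate the backing arrays and adjust offsets.
    ConcatBacking,
}

/// Picks a strategy from two sums:
/// - `backing_lens`: length of each source's backing values array
/// - `selected_sizes`: size of each row selected by the interleave
fn choose_strategy(backing_lens: &[usize], selected_sizes: &[usize]) -> Strategy {
    // Cost of concatenating every backing array once.
    let concat_cost: usize = backing_lens.iter().sum();
    // Cost of copying every selected row separately.
    let per_row_cost: usize = selected_sizes.iter().sum();
    // Copying more elements than exist in the backing arrays means rows
    // overlap, so concatenation is cheaper and keeps the sharing.
    if per_row_cost > concat_cost {
        Strategy::ConcatBacking
    } else {
        Strategy::PerRowCopy
    }
}
```

For example, 80 selected rows of 20 elements each over two backing arrays of 100 values give a per-row cost of 1600 against a concat cost of 200, so the concat path is chosen.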

Adds overlapping ListView test.

Signed-off-by: Alfonso Subiotto Marques <alfonso.subiotto@polarsignals.com>
@vegarsti force-pushed the list-view-interleave-native branch from 216213b to a5efffd on April 29, 2026 10:32
@vegarsti
Contributor Author

vegarsti commented Apr 29, 2026

I think this is a great heuristic, I've included the change. Maybe @alamb you could trigger the benchmarks on this PR? 🙏🏻

@alamb
Contributor

alamb commented Apr 29, 2026

run benchmark interleave_kernels

@adriangbot

🤖 Arrow criterion benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4347425035-1923-5l8z2 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing list-view-interleave-native (a5efffd) to b114241 (merge-base) diff
BENCH_NAME=interleave_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench interleave_kernels
BENCH_FILTER=
Results will be posted here when complete



@asubiotto
Contributor

One thing I’m concerned about is using a blanket mutable array extend for any listview element type including complex nested types (e.g. we have list views of structs with dicts and more list views). The implication is that we’re using the equivalent of a “fallback” interleave path for all of these element types. Maybe it’s fine to merge this as is and revisit later, but worth keeping in mind.

@adriangbot

🤖 Arrow criterion benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)


group                                                                                        list-view-interleave-native            main
-----                                                                                        ---------------------------            ----
interleave dict(20, 0.0) 100 [0..100, 100..230, 450..1000]                                   1.01    642.8±3.04ns        ? ?/sec    1.00    638.6±3.38ns        ? ?/sec
interleave dict(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                         1.01  1856.3±12.06ns        ? ?/sec    1.00  1841.9±10.22ns        ? ?/sec
interleave dict(20, 0.0) 1024 [0..100, 100..230, 450..1000]                                  1.00   1814.2±7.21ns        ? ?/sec    1.00   1821.8±9.61ns        ? ?/sec
interleave dict(20, 0.0) 400 [0..100, 100..230, 450..1000]                                   1.01   1026.6±3.00ns        ? ?/sec    1.00   1019.4±6.93ns        ? ?/sec
interleave dict_distinct 100                                                                 1.01      2.1±0.04µs        ? ?/sec    1.00      2.1±0.01µs        ? ?/sec
interleave dict_distinct 1024                                                                1.02      2.1±0.03µs        ? ?/sec    1.00      2.1±0.01µs        ? ?/sec
interleave dict_distinct 2048                                                                1.02      2.2±0.03µs        ? ?/sec    1.00      2.1±0.01µs        ? ?/sec
interleave dict_sparse(20, 0.0) 100 [0..100, 100..230, 450..1000]                            1.00   1528.6±3.88ns        ? ?/sec    1.01   1541.7±5.38ns        ? ?/sec
interleave dict_sparse(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                  1.00      3.0±0.01µs        ? ?/sec    1.00      3.0±0.01µs        ? ?/sec
interleave dict_sparse(20, 0.0) 1024 [0..100, 100..230, 450..1000]                           1.00      2.7±0.01µs        ? ?/sec    1.00      2.7±0.01µs        ? ?/sec
interleave dict_sparse(20, 0.0) 400 [0..100, 100..230, 450..1000]                            1.01   1940.0±2.88ns        ? ?/sec    1.00   1929.4±5.38ns        ? ?/sec
interleave i32(0.0) 100 [0..100, 100..230, 450..1000]                                        1.00    212.2±1.90ns        ? ?/sec    1.00    211.9±2.20ns        ? ?/sec
interleave i32(0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                              1.00    940.9±2.49ns        ? ?/sec    1.01    953.8±3.53ns        ? ?/sec
interleave i32(0.0) 1024 [0..100, 100..230, 450..1000]                                       1.04    987.8±2.22ns        ? ?/sec    1.00    951.4±2.95ns        ? ?/sec
interleave i32(0.0) 400 [0..100, 100..230, 450..1000]                                        1.02    529.7±3.34ns        ? ?/sec    1.00    518.4±2.30ns        ? ?/sec
interleave i32(0.5) 100 [0..100, 100..230, 450..1000]                                        1.00    440.9±4.02ns        ? ?/sec    1.01    443.3±4.76ns        ? ?/sec
interleave i32(0.5) 1024 [0..100, 100..230, 450..1000, 0..1000]                              1.00      2.9±0.02µs        ? ?/sec    1.01      3.0±0.02µs        ? ?/sec
interleave i32(0.5) 1024 [0..100, 100..230, 450..1000]                                       1.00      2.9±0.02µs        ? ?/sec    1.00      3.0±0.02µs        ? ?/sec
interleave i32(0.5) 400 [0..100, 100..230, 450..1000]                                        1.00   1274.7±8.21ns        ? ?/sec    1.04   1328.7±8.31ns        ? ?/sec
interleave list<i64>(0.0,0.0,20) 100 [0..100, 100..230, 450..1000]                           1.00   1633.0±4.01ns        ? ?/sec    1.02   1666.3±3.14ns        ? ?/sec
interleave list<i64>(0.0,0.0,20) 1024 [0..100, 100..230, 450..1000, 0..1000]                 1.00     14.9±0.03µs        ? ?/sec    1.01     15.0±0.10µs        ? ?/sec
interleave list<i64>(0.0,0.0,20) 1024 [0..100, 100..230, 450..1000]                          1.00     14.8±0.02µs        ? ?/sec    1.01     15.0±0.03µs        ? ?/sec
interleave list<i64>(0.0,0.0,20) 400 [0..100, 100..230, 450..1000]                           1.00      6.0±0.01µs        ? ?/sec    1.01      6.1±0.01µs        ? ?/sec
interleave list<i64>(0.1,0.1,20) 100 [0..100, 100..230, 450..1000]                           1.00      3.9±0.02µs        ? ?/sec    1.01      3.9±0.01µs        ? ?/sec
interleave list<i64>(0.1,0.1,20) 1024 [0..100, 100..230, 450..1000, 0..1000]                 1.00     32.1±0.12µs        ? ?/sec    1.02     32.7±0.19µs        ? ?/sec
interleave list<i64>(0.1,0.1,20) 1024 [0..100, 100..230, 450..1000]                          1.00     32.4±0.17µs        ? ?/sec    1.00     32.5±0.19µs        ? ?/sec
interleave list<i64>(0.1,0.1,20) 400 [0..100, 100..230, 450..1000]                           1.00     12.9±0.10µs        ? ?/sec    1.00     13.0±0.05µs        ? ?/sec
interleave list_view<i64>(0.0,0.0,20) 100 [0..100, 100..230, 450..1000]                      1.00      2.2±0.01µs        ? ?/sec    1.66      3.6±0.01µs        ? ?/sec
interleave list_view<i64>(0.0,0.0,20) 1024 [0..100, 100..230, 450..1000, 0..1000]            1.00     13.6±0.03µs        ? ?/sec    1.28     17.4±0.04µs        ? ?/sec
interleave list_view<i64>(0.0,0.0,20) 1024 [0..100, 100..230, 450..1000]                     1.00     13.7±0.15µs        ? ?/sec    1.24     17.1±0.04µs        ? ?/sec
interleave list_view<i64>(0.0,0.0,20) 400 [0..100, 100..230, 450..1000]                      1.00      5.9±0.01µs        ? ?/sec    1.37      8.1±0.02µs        ? ?/sec
interleave list_view<i64>(0.1,0.1,20) 100 [0..100, 100..230, 450..1000]                      1.00      3.6±0.03µs        ? ?/sec    1.71      6.2±0.01µs        ? ?/sec
interleave list_view<i64>(0.1,0.1,20) 1024 [0..100, 100..230, 450..1000, 0..1000]            1.00     24.4±0.18µs        ? ?/sec    1.43     34.9±0.04µs        ? ?/sec
interleave list_view<i64>(0.1,0.1,20) 1024 [0..100, 100..230, 450..1000]                     1.00     24.6±0.20µs        ? ?/sec    1.41     34.7±0.04µs        ? ?/sec
interleave list_view<i64>(0.1,0.1,20) 400 [0..100, 100..230, 450..1000]                      1.00     10.5±0.10µs        ? ?/sec    1.50     15.7±0.04µs        ? ?/sec
interleave list_view_overlapping<i64>(80x,20) 100 [0..100, 100..230, 450..1000]              1.00      2.3±0.01µs        ? ?/sec    1.72      4.0±0.01µs        ? ?/sec
interleave list_view_overlapping<i64>(80x,20) 1024 [0..100, 100..230, 450..1000, 0..1000]    1.00      6.2±0.03µs        ? ?/sec    3.58     22.1±0.06µs        ? ?/sec
interleave list_view_overlapping<i64>(80x,20) 1024 [0..100, 100..230, 450..1000]             1.00      6.0±0.03µs        ? ?/sec    3.76     22.5±0.04µs        ? ?/sec
interleave list_view_overlapping<i64>(80x,20) 400 [0..100, 100..230, 450..1000]              1.00      3.0±0.01µs        ? ?/sec    3.35     10.1±0.02µs        ? ?/sec
interleave str(20, 0.0) 100 [0..100, 100..230, 450..1000]                                    1.01    611.6±0.93ns        ? ?/sec    1.00    607.7±2.74ns        ? ?/sec
interleave str(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                          1.00      4.6±0.01µs        ? ?/sec    1.01      4.7±0.02µs        ? ?/sec
interleave str(20, 0.0) 1024 [0..100, 100..230, 450..1000]                                   1.00      4.6±0.01µs        ? ?/sec    1.01      4.6±0.02µs        ? ?/sec
interleave str(20, 0.0) 400 [0..100, 100..230, 450..1000]                                    1.00   1920.2±5.36ns        ? ?/sec    1.01   1934.5±7.33ns        ? ?/sec
interleave str(20, 0.5) 100 [0..100, 100..230, 450..1000]                                    1.00    746.3±1.13ns        ? ?/sec    1.00    745.3±2.37ns        ? ?/sec
interleave str(20, 0.5) 1024 [0..100, 100..230, 450..1000, 0..1000]                          1.06      6.3±0.03µs        ? ?/sec    1.00      5.9±0.02µs        ? ?/sec
interleave str(20, 0.5) 1024 [0..100, 100..230, 450..1000]                                   1.00      5.9±0.02µs        ? ?/sec    1.00      5.9±0.02µs        ? ?/sec
interleave str(20, 0.5) 400 [0..100, 100..230, 450..1000]                                    1.00      2.5±0.01µs        ? ?/sec    1.00      2.5±0.00µs        ? ?/sec
interleave str_view(0.0) 100 [0..100, 100..230, 450..1000]                                   1.01    618.4±4.55ns        ? ?/sec    1.00    612.6±7.01ns        ? ?/sec
interleave str_view(0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                         1.01      2.6±0.01µs        ? ?/sec    1.00      2.6±0.01µs        ? ?/sec
interleave str_view(0.0) 1024 [0..100, 100..230, 450..1000]                                  1.00      2.6±0.01µs        ? ?/sec    1.00      2.6±0.01µs        ? ?/sec
interleave str_view(0.0) 400 [0..100, 100..230, 450..1000]                                   1.00   1231.4±4.18ns        ? ?/sec    1.01   1240.4±7.98ns        ? ?/sec
interleave struct(i32(0.0), i32(0.0) 100 [0..100, 100..230, 450..1000]                       1.00   652.0±11.66ns        ? ?/sec    1.00    651.6±8.41ns        ? ?/sec
interleave struct(i32(0.0), i32(0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]             1.00      2.1±0.01µs        ? ?/sec    1.02      2.2±0.01µs        ? ?/sec
interleave struct(i32(0.0), i32(0.0) 1024 [0..100, 100..230, 450..1000]                      1.00      2.1±0.01µs        ? ?/sec    1.02      2.2±0.00µs        ? ?/sec
interleave struct(i32(0.0), i32(0.0) 400 [0..100, 100..230, 450..1000]                       1.01   1164.3±7.83ns        ? ?/sec    1.00   1154.5±3.23ns        ? ?/sec
interleave struct(i32(0.0), str(20, 0.0) 100 [0..100, 100..230, 450..1000]                   1.01   1037.3±5.30ns        ? ?/sec    1.00   1026.1±4.97ns        ? ?/sec
interleave struct(i32(0.0), str(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]         1.00      5.8±0.02µs        ? ?/sec    1.01      5.9±0.01µs        ? ?/sec
interleave struct(i32(0.0), str(20, 0.0) 1024 [0..100, 100..230, 450..1000]                  1.00      5.8±0.01µs        ? ?/sec    1.01      5.8±0.02µs        ? ?/sec
interleave struct(i32(0.0), str(20, 0.0) 400 [0..100, 100..230, 450..1000]                   1.00      2.6±0.01µs        ? ?/sec    1.00      2.6±0.01µs        ? ?/sec
interleave struct(str(20, 0.0), str(20, 0.0)) 100 [0..100, 100..230, 450..1000]              1.01   1410.4±3.74ns        ? ?/sec    1.00   1395.1±4.85ns        ? ?/sec
interleave struct(str(20, 0.0), str(20, 0.0)) 1024 [0..100, 100..230, 450..1000, 0..1000]    1.00      9.6±0.02µs        ? ?/sec    1.00      9.6±0.02µs        ? ?/sec
interleave struct(str(20, 0.0), str(20, 0.0)) 1024 [0..100, 100..230, 450..1000]             1.00      9.5±0.02µs        ? ?/sec    1.00      9.5±0.01µs        ? ?/sec
interleave struct(str(20, 0.0), str(20, 0.0)) 400 [0..100, 100..230, 450..1000]              1.00      4.1±0.01µs        ? ?/sec    1.01      4.1±0.01µs        ? ?/sec

Resource Usage

base (merge-base)

Metric Value
Wall time 610.1s
Peak memory 3.1 GiB
Avg memory 3.0 GiB
CPU user 604.4s
CPU sys 0.8s
Peak spill 0 B

branch

Metric Value
Wall time 595.1s
Peak memory 3.0 GiB
Avg memory 3.0 GiB
CPU user 593.7s
CPU sys 0.2s
Peak spill 0 B



Labels: arrow (Changes to the arrow crate), performance

Projects: None yet

Development: Successfully merging this pull request may close this issue: Add native ListView support for interleave kernel

5 participants