feat: Optimize from_bitwise_binary_op with 64-bit alignment by kunalsinghdadhwal · Pull Request #9441 · apache/arrow-rs

kunalsinghdadhwal · 2026-02-19T17:26:09Z

Which issue does this PR close?

Closes Optimize from_bitwise_binary_op #9378

Rationale for this change

This PR implements the optimizations listed in the issue description:

Align to 8 bytes
Don't try to return a buffer with bit_offset 0 but round it to a multiple of 64
Use chunk_exact for the fallback path
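As a rough illustration of the three points above, here is a minimal sketch in plain Rust. It is not the arrow-rs implementation: `split_offset` and `binary_op_words` are hypothetical helpers, and the sketch assumes both inputs are byte slices of equal length whose length is a multiple of 8.

```rust
/// Split a bit offset into (index of the containing 64-bit word,
/// remaining bit offset inside that word).
/// Hypothetical helper, not part of arrow-rs.
fn split_offset(bit_offset: usize) -> (usize, usize) {
    (bit_offset / 64, bit_offset % 64)
}

/// Apply `op` to two byte slices one 64-bit word at a time.
/// Hypothetical helper: assumes both slices have the same length
/// and that the length is a multiple of 8 bytes.
fn binary_op_words<F>(left: &[u8], right: &[u8], op: F) -> Vec<u64>
where
    F: Fn(u64, u64) -> u64,
{
    assert_eq!(left.len(), right.len());
    assert_eq!(left.len() % 8, 0);

    // `chunks_exact(8)` yields only full 8-byte chunks, so the loop body
    // never has to handle a short trailing chunk.
    left.chunks_exact(8)
        .zip(right.chunks_exact(8))
        .map(|(l, r)| {
            let mut lb = [0u8; 8];
            let mut rb = [0u8; 8];
            lb.copy_from_slice(l);
            rb.copy_from_slice(r);
            op(u64::from_le_bytes(lb), u64::from_le_bytes(rb))
        })
        .collect()
}
```

For example, `split_offset(70)` gives word 1 with a residual offset of 6, so a result starting at bit 70 can begin at the second word and keep an offset of 6 instead of being shifted down to offset 0.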

What changes are included in this PR?

When both inputs share the same sub-64-bit alignment (left_offset % 64 == right_offset % 64), the optimized path is used. This covers the common cases (both offset 0, both sliced equally, etc.). The BitChunks fallback is retained only when the two offsets have different sub-64-bit alignment.
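The path selection described above can be sketched as follows. The `Path` enum and `choose_path` function are hypothetical names used only for illustration; the actual code in the PR may structure this check differently.

```rust
/// Which code path a binary bitwise operation would take.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq, Eq)]
enum Path {
    /// Both inputs start at the same position inside a 64-bit word,
    /// so words can be combined directly without shifting.
    Aligned { shared_offset: usize },
    /// The inputs start at different positions inside a 64-bit word,
    /// so a shifting fallback (BitChunks in the PR) is needed.
    Fallback,
}

/// Pick a path from the two bit offsets, using the condition stated
/// in the PR description: left_offset % 64 == right_offset % 64.
fn choose_path(left_offset: usize, right_offset: usize) -> Path {
    if left_offset % 64 == right_offset % 64 {
        Path::Aligned {
            shared_offset: left_offset % 64,
        }
    } else {
        Path::Fallback
    }
}
```

Under this check, offsets of (0, 0), (24, 24), and (1, 65) all take the aligned path, while (1, 24) falls back.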

Are these changes tested?

Yes. The existing tests were updated and are included in this PR.

Are there any user-facing changes?

Yes, this is a minor breaking change to from_bitwise_binary_op:

The returned BooleanBuffer may now have a non-zero offset (previously always 0)
The returned BooleanBuffer may have padding bits set outside the logical range in values()
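To show what these two changes mean for callers, here is a minimal sketch using a stand-in type. `PackedBits` is hypothetical and is not the arrow-rs `BooleanBuffer`; it only assumes the same general layout (LSB-first packed bits plus a bit offset and a logical length). The point is that code reading the raw bytes directly can see padding bits, while code that goes through the offset and length does not.

```rust
/// Stand-in for a bit-packed boolean buffer with an offset.
/// Hypothetical type for illustration; not the arrow-rs BooleanBuffer.
struct PackedBits {
    /// Raw bytes, least-significant bit first within each byte.
    bytes: Vec<u8>,
    /// Bit position where the logical range starts.
    offset: usize,
    /// Number of logical bits.
    len: usize,
}

impl PackedBits {
    /// Read logical bit `i`, honoring the offset.
    fn value(&self, i: usize) -> bool {
        assert!(i < self.len);
        let bit = self.offset + i;
        (self.bytes[bit / 8] >> (bit % 8)) & 1 == 1
    }

    /// Count set bits inside the logical range only.
    fn count_set_bits(&self) -> usize {
        (0..self.len).filter(|&i| self.value(i)).count()
    }

    /// Count set bits in the raw bytes, including any padding bits.
    /// This is the kind of access that the changed invariants make unsafe
    /// to rely on.
    fn raw_popcount(&self) -> usize {
        self.bytes.iter().map(|b| b.count_ones() as usize).sum()
    }
}
```

With `bytes = [0b1010_1111]`, `offset = 4`, and `len = 3`, the logical bits are `false, true, false`, so `count_set_bits()` is 1 even though the raw byte has 6 bits set.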

Signed-off-by: Kunal Singh Dadhwal <kunalsinghdadhwal@gmail.com>

kunalsinghdadhwal · 2026-02-19T17:31:35Z

@Dandandan kindly review

Dandandan · 2026-02-19T18:12:42Z

run benchmark boolean_kernels

kunalsinghdadhwal · 2026-02-19T18:26:34Z

and                     time:   [129.08 ns 129.76 ns 130.46 ns]
Found 8 outliers among 100 measurements (8.00%)
  1 (1.00%) low severe
  3 (3.00%) low mild
  2 (2.00%) high mild
  2 (2.00%) high severe

or                      time:   [134.48 ns 135.29 ns 136.17 ns]
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) low mild
  2 (2.00%) high mild
  1 (1.00%) high severe

not                     time:   [91.808 ns 92.431 ns 93.130 ns]
Found 6 outliers among 100 measurements (6.00%)
  4 (4.00%) high mild
  2 (2.00%) high severe

and_sliced_1            time:   [596.55 ns 600.04 ns 604.23 ns]
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

or_sliced_1             time:   [599.21 ns 601.99 ns 604.87 ns]
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild

not_sliced_1            time:   [90.421 ns 90.955 ns 91.544 ns]
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe

and_sliced_24           time:   [116.06 ns 116.83 ns 117.75 ns]
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) low mild
  2 (2.00%) high mild
  2 (2.00%) high severe

or_sliced_24            time:   [116.09 ns 116.94 ns 117.91 ns]
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild

not_slice_24            time:   [90.518 ns 91.550 ns 92.754 ns]
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

Here is the comparison:

Benchmark	main	optimized	speedup
and	128.33 ns	130.22 ns	0.98x
or	132.71 ns	134.03 ns	0.99x
not	91.78 ns	91.78 ns	1.00x
and_sliced_1	656.07 ns	650.42 ns	1.01x
or_sliced_1	669.51 ns	662.51 ns	1.01x
not_sliced_1	114.27 ns	112.00 ns	1.02x
and_sliced_24	141.51 ns	139.42 ns	1.01x
or_sliced_24	138.28 ns	114.78 ns	1.20x
not_slice_24	90.24 ns	113.18 ns	0.80x
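The speedup column appears to be the main time divided by the optimized time, so values above 1.00x mean the branch is faster and values below 1.00x mean it is slower. A small sketch of that arithmetic, assuming this reading of the table (`speedup` is a hypothetical helper):

```rust
/// Speedup of the optimized branch relative to main.
/// Values above 1.0 mean the optimized branch is faster.
/// Hypothetical helper; assumes both times use the same unit.
fn speedup(main_ns: f64, optimized_ns: f64) -> f64 {
    main_ns / optimized_ns
}
```

For the `or_sliced_24` row this gives 138.28 / 114.78, about 1.20, and for the `not_slice_24` row 90.24 / 113.18, about 0.80, matching the table.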

kunalsinghdadhwal · 2026-02-20T07:29:32Z

@Dandandan @alamb

kunalsinghdadhwal · 2026-02-23T07:26:28Z

kindly review @Dandandan

alamb-ghbot · 2026-02-23T15:31:13Z

🤖 ./gh_compare_arrow.sh gh_compare_arrow.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing kunal/optimize-bitwise-binary-op-9378 (ecf51b4) to ab9c062 diff
BENCH_NAME=boolean_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench boolean_kernels
BENCH_FILTER=
BENCH_BRANCH_NAME=kunal_optimize-bitwise-binary-op-9378
Results will be posted here when complete

alamb-ghbot · 2026-02-23T15:36:12Z

🤖: Benchmark completed

Details

group            kunal_optimize-bitwise-binary-op-9378    main
-----            -------------------------------------    ----
and              1.02    214.1±2.60ns        ? ?/sec      1.00    210.8±1.34ns        ? ?/sec
and_sliced_1     1.00  1090.9±11.59ns        ? ?/sec      1.01   1096.5±5.50ns        ? ?/sec
and_sliced_24    1.00    225.4±0.42ns        ? ?/sec      1.09    246.4±0.83ns        ? ?/sec
not              1.00    144.1±1.59ns        ? ?/sec      1.00    144.4±0.17ns        ? ?/sec
not_slice_24     1.20    174.3±0.28ns        ? ?/sec      1.00    145.5±6.71ns        ? ?/sec
not_sliced_1     1.21    174.5±1.33ns        ? ?/sec      1.00    144.6±1.09ns        ? ?/sec
or               1.02    202.4±1.24ns        ? ?/sec      1.00    198.8±0.40ns        ? ?/sec
or_sliced_1      1.00  1094.9±10.78ns        ? ?/sec      1.01   1110.5±8.90ns        ? ?/sec
or_sliced_24     1.00    227.3±0.34ns        ? ?/sec      1.09    247.0±4.87ns        ? ?/sec

kunalsinghdadhwal · 2026-02-24T15:54:51Z

kindly review and merge @Dandandan

alamb · 2026-03-18T14:16:22Z

run benchmark boolean_kernels

adriangbot · 2026-03-18T14:18:18Z

🤖 Arrow criterion benchmark running (GKE) | trigger
Linux bench-c4082907033-418-r4mmn 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux
Comparing kunal/optimize-bitwise-binary-op-9378 (4c4f205) to 66313ae (merge-base) diff
BENCH_NAME=boolean_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench boolean_kernels
BENCH_FILTER=
Results will be posted here when complete

adriangbot · 2026-03-18T14:22:15Z

🤖 Arrow criterion benchmark completed (GKE) | trigger

Details

group            kunal_optimize-bitwise-binary-op-9378    main
-----            -------------------------------------    ----
and              1.04    152.2±0.71ns        ? ?/sec      1.00    146.4±0.67ns        ? ?/sec
and_sliced_1     1.00    559.5±1.33ns        ? ?/sec      1.13    631.5±0.74ns        ? ?/sec
and_sliced_24    1.00    174.8±3.22ns        ? ?/sec      1.57    273.6±0.91ns        ? ?/sec
not              1.01    107.3±0.95ns        ? ?/sec      1.00    106.1±0.46ns        ? ?/sec
not_slice_24     1.16    123.1±0.66ns        ? ?/sec      1.00    106.1±0.45ns        ? ?/sec
not_sliced_1     1.17    123.7±0.33ns        ? ?/sec      1.00    105.9±0.43ns        ? ?/sec
or               1.04    151.3±0.87ns        ? ?/sec      1.00    146.0±0.75ns        ? ?/sec
or_sliced_1      1.00    596.0±1.23ns        ? ?/sec      1.01    601.7±0.73ns        ? ?/sec
or_sliced_24     1.00    174.5±3.29ns        ? ?/sec      1.58    275.6±0.85ns        ? ?/sec

Resource Usage

base (merge-base)

Metric	Value
Wall time	88.8s
Peak memory	1.7 GiB
Avg memory	1.7 GiB
CPU user	87.8s
CPU sys	0.7s
Disk read	0 B
Disk write	596.1 MiB

branch

Metric	Value
Wall time	90.8s
Peak memory	1.7 GiB
Avg memory	1.7 GiB
CPU user	90.7s
CPU sys	0.1s
Disk read	0 B
Disk write	1004.0 KiB

alamb · 2026-03-18T14:28:52Z

Looks like a solid performance improvement. I will review this shortly

alamb

Thank you very much @kunalsinghdadhwal

I went through this code carefully and it makes sense. I also spent quite a while ensuring the coverage is good and the comments make sense

I believe the change to the offset invariants should be treated as an API change and thus we should wait for the next major release

alamb · 2026-03-18T15:25:55Z

    /// * `op` may be called with input bits outside the requested range.
-    /// * The returned `BooleanBuffer` always has zero offset.
+    /// * Returned `BooleanBuffer` may have non zero offset
+    /// * Returned `BooleanBuffer` may have bits set outside the requested range


this may be treated as an API change 🤔

alamb · 2026-03-18T15:48:58Z

I took the liberty of pushing commits to this PR

alamb · 2026-03-18T15:52:48Z

FYI @jhorstmann and @Dandandan you may be interested in this PR

kunalsinghdadhwal · 2026-03-19T05:55:49Z

Thanks @alamb for reviewing this. Waiting for the next release.

Dandandan · 2026-03-19T07:02:47Z

I would have expected the and/or sliced_24 cases to improve more; perhaps it's still generating suboptimal code for those...

alamb · 2026-03-20T14:47:24Z

Well, since it went in to main it will be part of 58.1.0. I'll test in DataFusion to make sure

Release arrow-rs / parquet Minor version 58.1.0 (March 2026) #9108

feat: Optimize from_bitwise_binary_op with 64-bit alignment

ecf51b4

Signed-off-by: Kunal Singh Dadhwal <kunalsinghdadhwal@gmail.com>

github-actions Bot added the arrow Changes to the arrow crate label Feb 19, 2026

Merge branch 'main' into kunal/optimize-bitwise-binary-op-9378

4c4f205

alamb added 3 commits March 18, 2026 11:22

Add test coverage

343530d

More tests and comments

58fabf1

Merge remote-tracking branch 'apache/main' into kunal/optimize-bitwis…

6d7fe18

…e-binary-op-9378

alamb approved these changes Mar 18, 2026

fmt

7d96705

alamb added the next-major-release the PR has API changes and it waiting on the next major version label Mar 18, 2026

Dandandan approved these changes Mar 19, 2026

Dandandan merged commit d53df60 into apache:main Mar 19, 2026
26 checks passed

alamb mentioned this pull request Mar 20, 2026

Optimize from_bitwise_binary_op #9378

Closed

alamb removed the next-major-release the PR has API changes and it waiting on the next major version label Mar 20, 2026

kszucs mentioned this pull request Apr 1, 2026

fix(parquet): fix CDC panic on nested ListArrays with null entries kszucs/arrow-rs#2

Closed

5 participants
