Add append_value_n to GenericByteBuilder#9426
Conversation
I noticed that this method is available on PrimitiveTypeBuilder but missing on GenericByteBuilder, which makes sense since the gain is smaller. After benchmarking, however, it shows a solid 10%:

```
┌───────────────────┬────────────────┬───────────────────┬─────────┐
│ Benchmark         │ append_value_n │ append_value loop │ Speedup │
├───────────────────┼────────────────┼───────────────────┼─────────┤
│ n=100/len=5       │ 371 ns         │ 408 ns            │ 10%     │
│ n=100/len=30      │ 456 ns         │ 507 ns            │ 10%     │
│ n=100/len=1024    │ 1.81 µs        │ 1.95 µs           │ 8%      │
│ n=1000/len=5      │ 2.39 µs        │ 2.87 µs           │ 17%     │
│ n=1000/len=30     │ 3.41 µs        │ 3.89 µs           │ 12%     │
│ n=1000/len=1024   │ 12.3 µs        │ 14.4 µs           │ 15%     │
│ n=10000/len=5     │ 23.8 µs        │ 29.3 µs           │ 19%     │
│ n=10000/len=30    │ 33.7 µs        │ 39.0 µs           │ 14%     │
│ n=10000/len=1024  │ 115.9 µs       │ 135.0 µs          │ 14%     │
│ n=100000/len=5    │ 227.5 µs       │ 278.6 µs          │ 18%     │
│ n=100000/len=30   │ 328.1 µs       │ 377.9 µs          │ 13%     │
│ n=100000/len=1024 │ 1.16 ms        │ 1.34 ms           │ 14%     │
└───────────────────┴────────────────┴───────────────────┴─────────┘
```

I think this is still worthwhile to add. Let me know what the community thinks!
```rust
pub fn append_value_n(&mut self, value: impl AsRef<T::Native>, n: usize) {
    let bytes = value.as_ref().as_ref();
    for _ in 0..n {
        self.value_builder.extend_from_slice(bytes);
```
Are there methods we can use to reserve the capacity for value_builder and offsets_builder ahead of time?
(reserve is potentially dangerous performance-wise if it doesn't do amortized allocations; otherwise we would need to do the amortization here.)
Is that directly related to this PR? Or a more general observation?
AFAIK, Rust container allocations are amortized; e.g., the Vec docs state:

> Vec does not guarantee any particular growth strategy when reallocating when full, nor when reserve is called. The current strategy is basic and it may prove desirable to use a non-constant growth factor. Whatever strategy is used will of course guarantee O(1) amortized push.
So hopefully that's enough to avoid quadratic silliness even if we don't reserve, leaving only constant factors on the table?
I didn't look closely at Rust, but in C++ there could be a potential footgun (I copied the reference below). So calling append_n with n=1 in a loop was my concern if we ended up using reserve. It sounds like Rust might make stronger guarantees (but there is wiggle room):

> Correctly using reserve() can prevent unnecessary reallocations, but inappropriate uses of reserve() (for instance, calling it before every push_back() call) may actually increase the number of reallocations (by causing the capacity to grow linearly rather than exponentially) and result in increased computational complexity and decreased performance. For example, a function that receives an arbitrary vector by reference and appends elements to it should usually not call reserve() on the vector, since it does not know of the vector's usage characteristics.
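The Rust side of this can be checked empirically. The std-only sketch below counts reallocations when `reserve(1)` is called before every single push, the exact pattern the C++ quote warns about. With the standard library's current growth strategy (an implementation detail, not a documented guarantee), capacity still grows geometrically, so the reallocation count stays roughly logarithmic rather than linear:

```rust
// Count capacity changes when `reserve(1)` precedes every push.
fn reallocations_with_reserve_per_push(pushes: usize) -> usize {
    let mut v: Vec<u64> = Vec::new();
    let mut reallocs = 0;
    let mut last_cap = v.capacity();
    for i in 0..pushes {
        v.reserve(1); // worst-case usage: reserve before every single push
        v.push(i as u64);
        if v.capacity() != last_cap {
            reallocs += 1;
            last_cap = v.capacity();
        }
    }
    reallocs
}

fn main() {
    let r = reallocations_with_reserve_per_push(1_000_000);
    // Geometric growth keeps this near log2(1_000_000) ≈ 20; a linear
    // strategy (the C++ footgun) would give ~1_000_000 reallocations.
    assert!(r < 64);
    println!("reallocations: {r}");
}
```

Under a C++-style footgun this assertion would fail by four orders of magnitude, which is the distinction being discussed here.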
Or put another way, callers that care about it can probably call reserve themselves externally
You can reserve capacity by calling with_capacity:
arrow-rs/arrow-array/src/builder/generic_bytes_builder.rs
Lines 43 to 57 in 2f40f78
We already do this in our code.
OK, so let's avoid adding it here. It seems better placed in caller code.
As this is append_value_n, I believe it makes sense to add a reserve.
While the builder provides with_capacity, the primary use case for a builder is when you don't know the capacity upfront (in most other cases you can build the array from an iterator or slice directly, skipping the builder overhead).
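As a sketch of what reserving upfront inside append_value_n could look like, here is a std-only stand-in: ToyBytesBuilder and its fields are hypothetical names mirroring the real value_builder/offsets_builder pair, not the actual arrow-rs implementation.

```rust
// Toy stand-in for GenericByteBuilder's internals: a flat byte buffer
// plus i32 offsets (Arrow's variable-length binary layout).
struct ToyBytesBuilder {
    values: Vec<u8>,   // stands in for value_builder
    offsets: Vec<i32>, // stands in for offsets_builder
}

impl ToyBytesBuilder {
    fn new() -> Self {
        Self { values: Vec::new(), offsets: vec![0] }
    }

    fn append_value_n(&mut self, value: &[u8], n: usize) {
        // Reserve everything once: n copies of the bytes and n new offsets,
        // instead of relying on amortized growth inside the loop.
        self.values.reserve(value.len() * n);
        self.offsets.reserve(n);
        for _ in 0..n {
            self.values.extend_from_slice(value);
            self.offsets.push(self.values.len() as i32);
        }
    }
}

fn main() {
    let mut b = ToyBytesBuilder::new();
    b.append_value_n(b"abc", 3);
    assert_eq!(b.values, b"abcabcabc");
    assert_eq!(b.offsets, vec![0, 3, 6, 9]);
}
```

Since append_value_n knows both n and the value length up front, both reserve sizes are exact, which sidesteps the reserve-in-a-loop concern discussed above.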
My friend Claude created a benchmark:
```rust
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements.  See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership.  The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License.  You may obtain a copy of the License at
//
//   http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied.  See the License for the
// specific language governing permissions and limitations
// under the License.

use arrow_array::builder::StringBuilder;
use criterion::*;
use std::hint;

fn bench_append_value(c: &mut Criterion) {
    let mut group = c.benchmark_group("append_value");
    for &str_len in &[5, 30, 1024] {
        let value = "x".repeat(str_len);
        for &n in &[100, 1000, 10000, 100000] {
            group.throughput(Throughput::Elements(n as u64));
            group.bench_with_input(
                BenchmarkId::new(format!("n={n}"), format!("len={str_len}")),
                &(&value, n),
                |b, &(value, n)| {
                    b.iter(|| {
                        let mut builder = StringBuilder::new();
                        for _ in 0..n {
                            builder.append_value(value);
                        }
                        hint::black_box(builder.finish());
                    })
                },
            );
        }
    }
    group.finish();
}

fn bench_append_value_n(c: &mut Criterion) {
    let mut group = c.benchmark_group("append_value_n");
    for &str_len in &[5, 30, 1024] {
        let value = "x".repeat(str_len);
        for &n in &[100, 1000, 10000, 100000] {
            group.throughput(Throughput::Elements(n as u64));
            group.bench_with_input(
                BenchmarkId::new(format!("n={n}"), format!("len={str_len}")),
                &(&value, n),
                |b, &(value, n)| {
                    b.iter(|| {
                        let mut builder = StringBuilder::new();
                        builder.append_value_n(value, n);
                        hint::black_box(builder.finish());
                    })
                },
            );
        }
    }
    group.finish();
}

criterion_group!(benches, bench_append_value, bench_append_value_n);
criterion_main!(benches);
```

Let me know if you find this valuable to add to the repository.
This resulted in:
append_value_n improvements with reserve():
┌────────────────────┬─────────┬─────────┬─────────────┐
│ Config │ Before │ After │ Improvement │
├────────────────────┼─────────┼─────────┼─────────────┤
│ n=1000, len=5 │ 2.96 µs │ 2.68 µs │ ~9% │
├────────────────────┼─────────┼─────────┼─────────────┤
│ n=10000, len=5 │ 29.3 µs │ 26.4 µs │ ~10% │
├────────────────────┼─────────┼─────────┼─────────────┤
│ n=100000, len=5 │ 268 µs │ 249 µs │ ~7% │
├────────────────────┼─────────┼─────────┼─────────────┤
│ n=100, len=30 │ 628 ns │ 548 ns │ ~13% │
├────────────────────┼─────────┼─────────┼─────────────┤
│ n=1000, len=30 │ 4.22 µs │ 3.68 µs │ ~13% │
├────────────────────┼─────────┼─────────┼─────────────┤
│ n=10000, len=30 │ 43.9 µs │ 36.3 µs │ ~17% │
├────────────────────┼─────────┼─────────┼─────────────┤
│ n=100000, len=30 │ 404 µs │ 350 µs │ ~13% │
├────────────────────┼─────────┼─────────┼─────────────┤
│ n=100, len=1024 │ 3.16 µs │ 1.93 µs │ ~39% │
├────────────────────┼─────────┼─────────┼─────────────┤
│ n=1000, len=1024 │ 13.7 µs │ 11.1 µs │ ~19% │
├────────────────────┼─────────┼─────────┼─────────────┤
│ n=10000, len=1024 │ 729 µs │ 103 µs │ ~86% │
├────────────────────┼─────────┼─────────┼─────────────┤
│ n=100000, len=1024 │ 8.68 ms │ 1.02 ms │ ~88% │
└────────────────────┴─────────┴─────────┴─────────────┘
Keep in mind that the benchmark uses ::new rather than ::with_capacity. We can see that .reserve offers some benefits if you don't pre-allocate.
Out of curiosity, are there any benchmarks in the repo that show these improvements?
I've asked my friend Claude to generate a microbenchmark; the result can be found in the PR description. I can commit the benchmark as well, but I figured that would add a lot of very specific benchmarks which won't be used that much, I guess.
```rust
        self.value_builder.extend_from_slice(bytes);
        self.offsets_builder.push(self.next_offset());
    }
    self.null_buffer_builder.append_n_non_nulls(n);
```
Pulling this out of the loop gives the 10-20% speedup.
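To illustrate why one bulk append of validity bits beats a per-iteration push, here is a std-only stand-in for the null buffer. ToyNullBuffer is hypothetical; the real NullBufferBuilder is more elaborate, but the idea is the same: the bulk path writes whole 0xFF bytes instead of doing per-bit shift/mask work n times.

```rust
// Toy packed validity bitmap (LSB-first, like Arrow's).
struct ToyNullBuffer {
    bits: Vec<u8>,
    len: usize,
}

impl ToyNullBuffer {
    fn new() -> Self {
        Self { bits: Vec::new(), len: 0 }
    }

    // Per-value path: one bit at a time, branch + shift per call.
    fn append_non_null(&mut self) {
        if self.len % 8 == 0 {
            self.bits.push(0);
        }
        *self.bits.last_mut().unwrap() |= 1 << (self.len % 8);
        self.len += 1;
    }

    // Bulk path: set the leading partial byte bit-by-bit, then fill
    // whole bytes with 0xFF, then finish the trailing partial byte.
    fn append_n_non_nulls(&mut self, n: usize) {
        let new_len = self.len + n;
        self.bits.resize((new_len + 7) / 8, 0);
        let mut i = self.len;
        while i < new_len && i % 8 != 0 {
            self.bits[i / 8] |= 1 << (i % 8);
            i += 1;
        }
        while i + 8 <= new_len {
            self.bits[i / 8] = 0xFF; // one store covers eight values
            i += 8;
        }
        while i < new_len {
            self.bits[i / 8] |= 1 << (i % 8);
            i += 1;
        }
        self.len = new_len;
    }
}

fn main() {
    let mut a = ToyNullBuffer::new();
    for _ in 0..13 {
        a.append_non_null();
    }
    let mut b = ToyNullBuffer::new();
    b.append_n_non_nulls(13);
    // Both paths produce the same bitmap: 8 set bits, then 5.
    assert_eq!(a.bits, b.bits);
    assert_eq!(b.bits, vec![0xFF, 0b0001_1111]);
}
```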
```rust
    let bytes = value.as_ref().as_ref();
    for _ in 0..n {
        self.value_builder.extend_from_slice(bytes);
        self.offsets_builder.push(self.next_offset());
```
If you want to make it faster:
self.offsets_builder.append_trusted_len_iter( could be used outside of the loop with an offset iterator.
IMO a reserve should also be added here, as this is a builder API: it rarely makes sense to use a builder when you know the capacity upfront (you can build the array directly, which will be faster anyway).
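The offsets for n repeats of one value form a fixed-stride sequence, so they can be produced by an iterator in one shot rather than pushed inside the byte-copy loop. A std-only sketch (append_value_n_offsets is a hypothetical free function; `Vec::extend` stands in for the builder's append_trusted_len_iter):

```rust
// Append `value` n times, generating all n new offsets from an iterator.
fn append_value_n_offsets(
    values: &mut Vec<u8>,
    offsets: &mut Vec<i32>,
    value: &[u8],
    n: usize,
) {
    // Exact upfront reservation, as suggested in the review.
    values.reserve(value.len() * n);
    offsets.reserve(n);
    for _ in 0..n {
        values.extend_from_slice(value);
    }
    // Offsets advance by value.len() per element: start+len, start+2*len, ...
    let start = *offsets.last().unwrap() as usize;
    let len = value.len();
    offsets.extend((1..=n).map(|i| (start + i * len) as i32));
}

fn main() {
    let mut values = Vec::new();
    let mut offsets = vec![0i32];
    append_value_n_offsets(&mut values, &mut offsets, b"ab", 4);
    assert_eq!(values, b"abababab");
    assert_eq!(offsets, vec![0, 2, 4, 6, 8]);
}
```

Because the iterator's length is known exactly, a trusted-length append can skip per-push capacity checks, which is where the extra speedup would come from.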
@Dandandan -- any reason not to merge this now that the
Which issue does this PR close?
Closes Add append_value_n to GenericByteBuilder #9425.
Rationale for this change
I noticed that this method is available on PrimitiveTypeBuilder but missing on GenericByteBuilder, which makes sense since the gain is smaller. After benchmarking, however, it shows a solid 10%, mostly because of the more efficient allocation of the null mask.
I think this is still worthwhile to add. Let me know what the community thinks!
What changes are included in this PR?
A new public API.
Are these changes tested?
Yes!
Are there any user-facing changes?
A new public API.