parquet: rle skip decode loop when batch contains all max levels (aka no nulls)#9258
parquet: rle skip decode loop when batch contains all max levels (aka no nulls)#9258alamb merged 2 commits intoapache:mainfrom
Conversation
24968e2 to
58bfbcc
Compare
58bfbcc to
f5a1966
Compare
|
run benchmark arrow_reader |
|
🤖 |
|
🤖: Benchmark completed Details
|
|
run benchmark arrow_reader |
|
🤖 |
alamb
left a comment
There was a problem hiding this comment.
These look like very promising results -- thank you @lyang24 🙏
I studied this fairly carefully and it makes sense to me (and it made sense to codex too). I'll study the performance results and assuming those look good I think this one is good to merge in
cc @tustvold and @Dandandan in case you have time to help review this
| /// Returns `Ok(None)` if there are any nulls or packed data that prevents fast path. | ||
| /// | ||
| /// On success, advances decoder state. On failure, state is unchanged. | ||
| fn try_skip_all_valid(&mut self, len: usize) -> Result<Option<usize>> { |
There was a problem hiding this comment.
I found this name confusing as it doesn't really "skip" the levels (they are returned in the count)
What would you think about renaming this try_consume_all_valid instead?
There was a problem hiding this comment.
yep try_consume_all_valid fits better i have updated it in the second commit
| /// Returns `Ok(Some(count))` if successfully consumed `count` all-valid levels. | ||
| /// Returns `Ok(None)` if there are any nulls or packed data that prevents fast path. | ||
| /// | ||
| /// On success, advances decoder state. On failure, state is unchanged. |
There was a problem hiding this comment.
technically speaking the state is changed (block can be advanced) but that only happens if rle_left was zero (and hence it would have been loaded anyways)
There was a problem hiding this comment.
good call updated comments
|
🤖: Benchmark completed Details
|
alamb
left a comment
There was a problem hiding this comment.
I think this is a really nice find
|
Let's do it (I have been procrastinating because I have some nagging feeling that there is a corner case here or something) However, I spent quite a while with codex trying to tease a bug out and I could no Thanks again @lyang24 |
Which issue does this PR close?
Rationale for this change
parquet reading perf - if rle value is true and rle left have enough room for the current batch. lets skip the decode loop the overhead of count_set_bits for null bitmap.
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?