Fix reading encrypted Parquet pages when using the page index#7633
Fix reading encrypted Parquet pages when using the page index#7633adamreeve merged 10 commits intoapache:mainfrom
Conversation
rok
left a comment
There was a problem hiding this comment.
Nice that #[cfg(feature = "encryption")] macros are now enabling full functions. page_ordinal -> page_index is great for consistency too. I'm cringing a little bit that index and skipping was broken, sorry about that.
No comments for changes.
|
Thanks @adamreeve! This looks nice as is. I left you a question in the form of a PR against your branch. |
take a stab at further refactoring
etseidl
left a comment
There was a problem hiding this comment.
Looks good, thanks @adamreeve!
| // If the offset of the first page doesn't match the start of the column chunk | ||
| // then the preceding space must contain a dictionary page. |
|
Would you like to review this too @alamb or are you happy for me to merge? |
Unless you have something specific you would like me to review, I am very happy to have you merge it -- I think @etseidl 's review is more than adequate. Thank you for checking. BTW in this repo we just hit the green button on the github UI (no extra merge scripts, etc) |
OK great, thanks. I didn't have anything specific in mind but just wasn't sure about the process and how widely my Arrow committer privileges should be used :) |
Which issue does this PR close?
Closes #7629.
I also noticed that skipping pages in encrypted files was broken so have fixed that too.
What changes are included in this PR?
SerializedPageReaderto reduce the use of#[cfg(...)]inline. To work with the borrow checker, I created a newSerializedPageReaderContexttype to hold theCryptoContext.SerializedPageReader::get_next_pageso that page headers and page data are decrypted when page indexes are used.SerializedPageReader::skip_next_pageto update the page index so that encryption AADs are calculated correctly.Are there any user-facing changes?
Only bug fixes.