Right now, there is no way to represent a non-nullable, "degenerate" `FixedSizeListArray` with arbitrary length. By degenerate, I mean that the `size` of each list scalar is equal to 0, meaning the `FixedSizeList` does not actually store any data.
The `try_new` constructor uses the length of the null buffer to determine the length of the array in the degenerate case, but if the null buffer is `None` (indicating a non-nullable array), the length incorrectly defaults to 0.
arrow-rs/arrow-array/src/array/fixed_size_list_array.rs, lines 155 to 156 in 751b082:

```rust
let len = match s {
    0 => nulls.as_ref().map(|x| x.len()).unwrap_or_default(),
```
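
To make the failure concrete, here is a minimal, self-contained model of that length inference. It is a sketch rather than the arrow-rs code: `infer_len` is a hypothetical name, the null buffer is reduced to its optional length, and the non-zero branch is assumed to be a plain integer division.

```rust
/// Simplified model of how the array length is inferred.
/// `size` is the fixed list size, `values_len` is the length of the child array,
/// and `nulls_len` is the length of the null buffer (`None` if non-nullable).
fn infer_len(size: usize, values_len: usize, nulls_len: Option<usize>) -> usize {
    match size {
        // Degenerate case: the child array is empty, so only the null buffer
        // can carry the length. Without one, the length falls back to 0.
        0 => nulls_len.unwrap_or_default(),
        // Assumed for illustration: otherwise the length follows from the child array.
        _ => values_len / size,
    }
}
```

In this model, `infer_len(0, 0, Some(5))` yields 5, but `infer_len(0, 0, None)` yields 0 regardless of how many lists the caller intended.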
Possible Solutions
You cannot correctly handle this case by relying on the length of the null buffer, so I believe there are only 2 sane solutions. The first is to fix the `try_new` constructor and require it to take in a `len: usize`.
Since a breaking change like this might not be acceptable, the second is to add a `try_new_with_length` constructor that does take in a `len` parameter. The existing `try_new` can then call into `try_new_with_length`, with documentation stating that users who want this specific degenerate, non-nullable case must use `try_new_with_length`.
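
A rough sketch of how the second option could fit together, using plain lengths instead of real Arrow buffers. `FixedSizeListSketch`, the parameter lists, and the `String` error type are all assumptions made for illustration; they are not the arrow-rs signatures, and the sketch does not decide whether extra trailing values should be an error.

```rust
/// Hypothetical stand-in for the array: only the fields needed for length bookkeeping.
#[derive(Debug, PartialEq)]
struct FixedSizeListSketch {
    size: usize,
    len: usize,
    nullable: bool,
}

/// Sketch of the proposed constructor: the caller always states `len` explicitly.
fn try_new_with_length(
    size: usize,
    values_len: usize,
    nulls_len: Option<usize>,
    len: usize,
) -> Result<FixedSizeListSketch, String> {
    // The child array must cover `len` lists of `size` values each.
    let needed = len
        .checked_mul(size)
        .ok_or_else(|| "len * size overflows usize".to_string())?;
    if values_len < needed {
        return Err(format!("need {} values, got {}", needed, values_len));
    }
    // A null buffer, if present, must agree with the stated length.
    if let Some(n) = nulls_len {
        if n != len {
            return Err(format!("null buffer has length {}, expected {}", n, len));
        }
    }
    Ok(FixedSizeListSketch { size, len, nullable: nulls_len.is_some() })
}

/// Sketch of the existing constructor delegating to the new one.
fn try_new(
    size: usize,
    values_len: usize,
    nulls_len: Option<usize>,
) -> Result<FixedSizeListSketch, String> {
    let len = match size {
        // A non-nullable degenerate array cannot be expressed through this path;
        // callers who need one would use `try_new_with_length` instead.
        0 => nulls_len.unwrap_or_default(),
        _ => values_len / size,
    };
    try_new_with_length(size, values_len, nulls_len, len)
}
```

Under these assumptions, `try_new_with_length(0, 0, None, 5)` produces a non-nullable array of length 5, while `try_new(0, 0, None)` still produces length 0, which is the behavior the documentation would need to call out.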
Edit: I've made a first step in #8624
Edit: I also just realized that if a caller passes a `values` array whose length is not an exact multiple of `size` to `try_new`, we immediately have an incorrect `FixedSizeList`. So if I pass in an array of length 6, and I specify a list size of 4 and a single null in my null buffer, that creates an invalid(?) array (though technically you can never access those last two values?)
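
One possible way to reject such input is a remainder check before the length is derived. This is only a sketch: `checked_list_count` is a hypothetical helper, not an existing arrow-rs function, and whether trailing values should be an error at all is the open question raised above.

```rust
/// Hypothetical validation helper: returns the number of lists, or an error
/// if `values_len` is not an exact multiple of `size`.
fn checked_list_count(values_len: usize, size: usize) -> Result<usize, String> {
    if size == 0 {
        // Degenerate lists store no values, so the count cannot come from the child array.
        return Err("size is 0: the list count cannot be derived from the values".to_string());
    }
    let trailing = values_len % size;
    if trailing != 0 {
        return Err(format!(
            "{} values leave {} trailing value(s) for list size {}",
            values_len, trailing, size
        ));
    }
    Ok(values_len / size)
}
```

For the example above, `checked_list_count(6, 4)` returns an error reporting 2 trailing values, whereas `checked_list_count(8, 4)` returns `Ok(2)`.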
Related Issues