ARROW-13358, ARROW-14167, ARROW-13573, etc. add support for dictionary types in various generic 'selection'-style kernels. However, that support is fairly unoptimized: it repeatedly hashes and inserts the same values. In some cases we could implement a more optimized path instead.
For example, for if_else, we could first unify both dictionaries (in such a way that the first dictionary is unchanged), then copy-with-transpose the indices of the second argument, then copy the relevant indices of the first argument (which doesn't require transposition). However, this would require DictionaryUnifier to support nulls in dictionaries.
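The if_else idea can be sketched in plain Python, using lists in place of Arrow arrays. This is only an illustration of the proposed steps, not Arrow's implementation: the helper names `unify_keep_first` and `if_else_dictionary` are hypothetical, null indices and null conditions are modelled as `None`, and a `None` inside a dictionary is assumed to be treated as an ordinary value (which is the capability DictionaryUnifier would need).

```python
def unify_keep_first(dict_a, dict_b):
    """Unify two dictionaries so that dict_a is an unchanged prefix.

    Returns (unified, transpose_b), where transpose_b[i] is the position
    in `unified` of dict_b[i]. Hypothetical helper for illustration.
    """
    unified = list(dict_a)
    position = {}
    for i, value in enumerate(unified):
        # Keep the first occurrence in case dict_a contains duplicates.
        position.setdefault(value, i)
    transpose_b = []
    for value in dict_b:
        if value not in position:
            # Each value of dict_b is hashed once, not once per row.
            position[value] = len(unified)
            unified.append(value)
        transpose_b.append(position[value])
    return unified, transpose_b


def if_else_dictionary(cond, a_indices, dict_a, b_indices, dict_b):
    """Select a where cond is true, b where false, null where cond is null."""
    unified, transpose_b = unify_keep_first(dict_a, dict_b)
    # Step 1: copy-with-transpose the indices of the second argument.
    out = [None if ib is None else transpose_b[ib] for ib in b_indices]
    # Step 2: copy the relevant indices of the first argument as-is;
    # they are already valid in `unified`, so no transposition is needed.
    for i, c in enumerate(cond):
        if c is None:
            out[i] = None
        elif c:
            out[i] = a_indices[i]
    return out, unified
```

The point of the sketch is that hashing happens once per dictionary value in `unify_keep_first`, while the per-row work is a plain index lookup or copy.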
Reporter: David Li / @lidavidm
Note: This issue was originally created as ARROW-14177. Please see the migration documentation for further details.