New hash table #1186
Merged
zuiderkwast merged 3 commits into unstable on Dec 10, 2024
This PR implements a new hash table and uses it for keys, command lookup and more. There are multiple commits. Do not squash-merge.
Hash table design
The hash table is cache-line optimized and implemented as outlined in #169, but uses chaining instead of probing, after an idea by Madelyn. The key-value entry is user-defined, which allows the user to embed the key and value within a single allocation. The hash table supports incremental rehashing, scan, random key, etc., just like the dict, but is faster and uses less memory. Each bucket contains a few bits of metadata per entry. For details, see the comments in src/hashtable.{c,h}.
If a bucket is full, the last entry pointer in the bucket can be replaced by a child-bucket pointer, which gives a bucket chain.
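The bucket chaining described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the code in src/hashtable.c: the 7-slot layout, the names bucket, CHAINED, hash_str, bucket_find and bucket_insert, the FNV-1a hash, and the choice to store plain C strings as entries. It only shows the idea of per-entry metadata plus a last slot that turns into a child-bucket pointer when the bucket overflows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SLOTS 7      /* hypothetical: 1 + 7 metadata bytes + 7 pointers = 64 bytes on 64-bit */
#define CHAINED 0x80 /* bit 7 of meta: the last slot holds a child bucket, not an entry */

typedef struct bucket {
    uint8_t meta;            /* bits 0..6: slot i is occupied; bit 7: CHAINED */
    uint8_t hashbits[SLOTS]; /* one byte of each entry's hash, a cheap pre-filter */
    void *slots[SLOTS];      /* entry pointers; slots[SLOTS - 1] may be a child bucket */
} bucket;

/* Illustrative FNV-1a string hash; not the hash function used by the PR. */
static uint64_t hash_str(const char *s) {
    uint64_t h = 1469598103934665603ULL;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 1099511628211ULL;
    }
    return h;
}

/* Number of slots that hold entries (the last one is taken when chained). */
static int usable_slots(const bucket *b) {
    return (b->meta & CHAINED) ? SLOTS - 1 : SLOTS;
}

/* Walk the chain; compare the hash byte first, then the full key. */
static const char *bucket_find(const bucket *b, const char *key) {
    uint8_t h2 = (uint8_t)(hash_str(key) >> 56);
    while (b != NULL) {
        int n = usable_slots(b);
        for (int i = 0; i < n; i++) {
            if ((b->meta & (1u << i)) && b->hashbits[i] == h2 &&
                strcmp(b->slots[i], key) == 0) {
                return b->slots[i];
            }
        }
        b = (b->meta & CHAINED) ? b->slots[SLOTS - 1] : NULL;
    }
    return NULL;
}

/* Insert without duplicate checking. Returns false only if allocation fails. */
static bool bucket_insert(bucket *b, const char *entry) {
    assert(b != NULL && entry != NULL);
    uint8_t h2 = (uint8_t)(hash_str(entry) >> 56);
    for (;;) {
        int n = usable_slots(b);
        for (int i = 0; i < n; i++) {
            if (!(b->meta & (1u << i))) {
                b->slots[i] = (void *)entry;
                b->hashbits[i] = h2;
                b->meta |= (uint8_t)(1u << i);
                return true;
            }
        }
        if (!(b->meta & CHAINED)) {
            /* Bucket is full: move the last entry into a new child bucket
             * and reuse the last slot as the child-bucket pointer. */
            bucket *child = calloc(1, sizeof(*child));
            if (child == NULL) return false;
            child->slots[0] = b->slots[SLOTS - 1];
            child->hashbits[0] = b->hashbits[SLOTS - 1];
            child->meta = 1;
            b->slots[SLOTS - 1] = child;
            b->meta = (uint8_t)((b->meta & ~(1u << (SLOTS - 1))) | CHAINED);
        }
        b = b->slots[SLOTS - 1]; /* continue in the child bucket */
    }
}
```

In this sketch, a lookup that misses on the hash byte never dereferences the entry pointer, which is the kind of saving the per-entry metadata is meant to give. The sketch omits freeing child buckets, deletion and rehashing.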
Command lookup
The 2nd commit replaces dict with hashtable for command lookup. This was implemented by @SoftlyRaining.
Keys and expire
The 3rd commit replaces dict with the new hash table in kvstore.c and all code that uses it, such as db.c.
The hashtable entry in this case is the robj struct. The key and optionally an expire timestamp are embedded in the robj struct, i.e. the key is embedded in the value. Therefore, we can call this a valkey object, val + key. This design saves roughly 20 bytes per key for short string keys.
Some db.c functions like dbAdd, setKey and setExpire now reallocate the value object to embed the key and optional expire in it. setKey does not increment the reference counter, since it would require duplicating the object.
Fixes #991
Fixes #992
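The key embedding described under "Keys and expire" can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual robj definition: the struct obj, its fields, and the helpers obj_create, obj_embed_key, obj_key and obj_expire are all hypothetical names. It shows one way a value object can be reallocated so that an optional expire timestamp and the key live in the same allocation as the value header.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical value object; the real robj has a different layout. */
typedef struct obj {
    unsigned refcount;
    unsigned has_key : 1;    /* key bytes are stored in data[] */
    unsigned has_expire : 1; /* an int64 expire precedes the key in data[] */
    void *value;             /* the value payload */
    unsigned char data[];    /* [expire (optional)][key, NUL-terminated] */
} obj;

static obj *obj_create(void *value) {
    obj *o = calloc(1, sizeof(*o));
    if (o != NULL) {
        o->refcount = 1;
        o->value = value;
    }
    return o;
}

/* Reallocate o so that the key, and the expire if given, share its allocation.
 * The object may move, so the caller must use the returned pointer.
 * Returns NULL on allocation failure, in which case o is left untouched. */
static obj *obj_embed_key(obj *o, const char *key, const int64_t *expire) {
    assert(o != NULL && key != NULL);
    size_t keylen = strlen(key) + 1;
    size_t explen = expire ? sizeof(int64_t) : 0;
    obj *n = realloc(o, sizeof(*n) + explen + keylen);
    if (n == NULL) return NULL;
    if (expire) memcpy(n->data, expire, sizeof(int64_t));
    memcpy(n->data + explen, key, keylen);
    n->has_expire = expire != NULL;
    n->has_key = 1;
    return n;
}

/* Returns the embedded key, or NULL if no key has been embedded. */
static const char *obj_key(const obj *o) {
    if (!o->has_key) return NULL;
    return (const char *)(o->data + (o->has_expire ? sizeof(int64_t) : 0));
}

/* Returns the embedded expire timestamp, or -1 if there is none. */
static int64_t obj_expire(const obj *o) {
    int64_t e = -1;
    if (o->has_expire) memcpy(&e, o->data, sizeof(e));
    return e;
}
```

Because the reallocation may move the object, any other holder of the old pointer would be left with a stale reference. That is one plausible reading of why setKey takes over the object instead of incrementing its reference counter, though the sketch does not model reference counting itself.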