Specification
To optimise reading out the closest nodes to a target node we need to apply some improvements.
The getClosestNode function needs to take a nodeId and limit as parameters. The nodeId is the node we're calculating the distance relative to. The limit is how many nodes we wish to return. The limit defaults to the nodeBucketLimit as per the Kademlia spec.
We need to avoid reading out all of the buckets and iterating over empty buckets. This can be achieved by using a readStream over the nodeGraphBucketsDb level. This level contains a sub level for each bucket, and each sub level contains the nodeId:nodeInfo key:value pairs. Streaming over the nodeGraphBucketsDb level therefore lets us iterate over every stored node in bucket order in a single pass. Note that the gt or lt bound on the stream has to be set relative to the desired bucket: the bound is the start of the bucket 'above' the target bucket. The key we want to start from takes the form of a Buffer with <prefix><higherBucketId><prefix>. Iterating over keys less than this bound gives us the target bucket plus all lower buckets; iterating over keys above it gives us all of the higher buckets.
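The effect of that bound can be sketched with plain sorted string keys. This is only an illustration under assumptions: the key layout `!<zero-padded bucket index>!<nodeId>` and the helpers `bucketPrefix` and `splitAtBucket` are hypothetical, and an in-memory array stands in for the readStream.

```typescript
// Hypothetical key layout: "!<zero-padded bucket index>!<nodeId>".
// Fixed-width indices make lexicographic key order match numeric bucket order.
function bucketPrefix(index: number): string {
  return '!' + ('000' + index).slice(-3) + '!';
}

// Split an already sorted key list at the start of the bucket above the target.
// Keys below the bound belong to the target bucket and all lower buckets,
// keys at or above it belong to the higher buckets.
function splitAtBucket(
  sortedKeys: string[],
  targetBucket: number,
): { lower: string[]; higher: string[] } {
  const bound = bucketPrefix(targetBucket + 1);
  return {
    lower: sortedKeys.filter((key) => key < bound),
    higher: sortedKeys.filter((key) => key >= bound),
  };
}
```

Because only stored keys are ever visited, empty buckets cost nothing in this scheme; no per-bucket lookup is needed.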
When we run out of lower buckets we need to iterate over the higher buckets from where we started. If we reach the limit while iterating over the nodes we still need to read the whole of the last bucket we were reading. Since nodes within a bucket are not in distance order, we need whole buckets to ensure we obtain the closest nodes.
The resulting list is sorted by distance using nodesUtils.bucketSortByDistance and the list is truncated down to the provided limit.
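A minimal sketch of the sort-and-truncate step, assuming node IDs are equal-length byte arrays compared by XOR distance. The helpers `compareDistance` and `closestByDistance` are hypothetical stand-ins for illustration; they are not the actual nodesUtils.bucketSortByDistance, whose signature is not shown here.

```typescript
// Hypothetical stand-in: node IDs as equal-length byte arrays.
type NodeId = Uint8Array;

// Compare the XOR distances of a and b to target, treating each distance as a
// big-endian unsigned integer. Negative means a is closer, positive means b is.
function compareDistance(target: NodeId, a: NodeId, b: NodeId): number {
  for (let i = 0; i < target.length; i++) {
    const da = a[i] ^ target[i];
    const db = b[i] ^ target[i];
    if (da !== db) return da - db;
  }
  return 0;
}

// Sort a copy of the candidates by distance and keep the closest `limit`.
function closestByDistance<T>(
  target: NodeId,
  nodes: Array<[NodeId, T]>,
  limit: number,
): Array<[NodeId, T]> {
  return nodes
    .slice()
    .sort((x, y) => compareDistance(target, x[0], y[0]))
    .slice(0, limit);
}
```

Sorting a copy keeps the candidate list untouched, which is convenient if the caller still needs the bucket-ordered view.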
As implemented
We iterate over the nodes directly across the buckets. The nodes are read out in the following order:
- all nodes within the target bucket N
- nodes in order of bucket 0 to bucket N-1
- nodes in order of bucket N+1 to 255.
When we reach our specified limit we finish reading the whole of the last bucket we were reading and add it to the list. We then sort all of the nodes, truncate the list back down to the limit and return that.
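The read order and the whole-bucket rule described above can be sketched as follows. This is an illustration only: the `Buckets` record is a hypothetical in-memory stand-in for the database, and `bucketReadOrder` and `collectCandidates` are made-up names. Unlike the stream-based implementation, this sketch looks up every bucket index, including empty ones, purely to keep the ordering explicit.

```typescript
// Hypothetical in-memory stand-in: bucket index -> node IDs in no particular order.
type Buckets = Record<number, string[]>;

// Bucket indices in the read order: N, then 0..N-1, then N+1..bucketCount-1.
function bucketReadOrder(n: number, bucketCount: number = 256): number[] {
  const order: number[] = [n];
  for (let i = 0; i < n; i++) order.push(i);
  for (let i = n + 1; i < bucketCount; i++) order.push(i);
  return order;
}

// Gather candidates bucket by bucket. A bucket is always taken whole, and
// reading stops after the bucket in which the limit was reached.
function collectCandidates(buckets: Buckets, n: number, limit: number): string[] {
  const candidates: string[] = [];
  for (const index of bucketReadOrder(n)) {
    const bucket = buckets[index];
    if (bucket === undefined) continue;
    for (const nodeId of bucket) candidates.push(nodeId);
    if (candidates.length >= limit) break;
  }
  return candidates;
}
```

The returned list may be longer than the limit; it is the sort and truncation afterwards that bring it back down to the limit.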
Additional context
Tasks
- Update the getClosestNodes implementation to use a readStream to iterate over each node sequentially across all of the buckets.
- ... higher buckets.