Describe the enhancement requested
Optimisation to #38333
Child of #18014
Currently ObjectAppendStream::DoAppend calls block_blob_client_->StageBlock synchronously, so each call to ObjectAppendStream::DoAppend blocks until the data has been successfully written to blob storage. This is very inefficient for large numbers of small writes.
This performance problem is obvious even in small tests against azurite. The UploadLines function used to create test data uses std::accumulate and writes the data in one call for performance reasons.
With accumulate:
[ RUN ] TestAzuriteFileSystem.OpenInputFileMixedReadVsReadAt
[ OK ] TestAzuriteFileSystem.OpenInputFileMixedReadVsReadAt (1350 ms)
Without accumulate (4096 separate calls to ObjectAppendStream::DoAppend):
[ RUN ] TestAzuriteFileSystem.OpenInputFileMixedReadVsReadAt
[ OK ] TestAzuriteFileSystem.OpenInputFileMixedReadVsReadAt (25124 ms)
And this is when testing against azurite on localhost. Against real blob storage, where the latency is going to be much higher, the problem will be exacerbated.
By comparison, the GCS filesystem is able to handle the latter approach without performance issues.
Some options to optimise:
- Call block_blob_client_->StageBlock asynchronously and await all the futures in ObjectAppendStream::Flush.
- Buffer small writes in memory and make fewer, larger calls to block_blob_client_->StageBlock.
- Buffer small writes in memory and make batched calls to block_blob_client_->StageBlock.
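To make the second option concrete, here is a minimal sketch of buffering small writes and staging them in fixed-size blocks. It is not the Arrow implementation: BufferedAppender, StageBlockFn and the block_size parameter are hypothetical names, and the callback is only a stand-in for block_blob_client_->StageBlock so the example runs without the Azure SDK.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical stand-in for block_blob_client_->StageBlock:
// it receives the bytes of exactly one block.
using StageBlockFn = std::function<void(std::string_view)>;

// Hypothetical helper (not part of Arrow): accumulates small appends and
// only calls stage_block once block_size bytes have been collected.
class BufferedAppender {
 public:
  BufferedAppender(StageBlockFn stage_block, std::size_t block_size)
      : stage_block_(std::move(stage_block)), block_size_(block_size) {
    assert(block_size_ > 0);
    buffer_.reserve(block_size_);
  }

  // Copies data into the buffer, staging a block each time the buffer fills.
  void Append(std::string_view data) {
    while (!data.empty()) {
      const std::size_t room = block_size_ - buffer_.size();
      const std::size_t n = std::min(room, data.size());
      buffer_.append(data.substr(0, n));
      data.remove_prefix(n);
      if (buffer_.size() == block_size_) {
        StageBuffered();
      }
    }
  }

  // Stages whatever is left in the buffer, if anything.
  void Flush() {
    if (!buffer_.empty()) {
      StageBuffered();
    }
  }

 private:
  void StageBuffered() {
    stage_block_(buffer_);
    buffer_.clear();
  }

  StageBlockFn stage_block_;
  std::size_t block_size_;
  std::string buffer_;
};
```

With a 1024-byte block size, 4096 appends of a 5-byte line would result in 20 calls to the callback instead of 4096. The trade-off is one extra copy per write and data held in memory until the buffer fills or Flush is called, so a real implementation would still need to decide on a sensible block size and on how this interacts with the asynchronous option.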
Component(s)
C++