Brian Clozel opened SPR-17250 and commented
In a recent reactor netty issue, small apps are being used to benchmark different frameworks. While working on that, Violeta Georgieva found a few interesting points we could take a look at, as opportunities for performance improvements and micro-benchmark cases.
Hotspots
The benchmark has underlined a few hostpots in the Spring Framework codebase:
org.springframework.util.LinkedCaseInsensitiveMap.convertKey(String) (own time, 5th overall)
org.springframework.util.MimeType.<init>(String, String, Map) (own time, 10th overall)
LinkedCaseInsensitiveMap
The convertKey method is used evenly by get and put operations on the map itself. This is calling String.toLowerCase many times, which involves a complex logic (it depends on the Locale) and creates new String instances. In general, copying headers from the native request and looking them up in our data structure is not efficient.
Request Content-Type parsing
HttpHeaders.getContentType is the main offender for get operations, called in several places:
org.springframework.web.server.adapter.DefaultServerWebExchange.initFormData(ServerHttpRequest, ServerCodecConfigurer)
org.springframework.web.reactive.function.BodyExtractors.contentType(HttpMessage)
org.springframework.web.server.adapter.DefaultServerWebExchange.initMultipartData(ServerHttpRequest, ServerCodecConfigurer)
org.springframework.http.codec.DecoderHttpMessageReader.getContentType(HttpMessage)
- and more
In general, we're spending a lot of time on org.springframework.util.MimeType.<init>(String, String, Map) and org.springframework.http.MediaType.parseMediaType(String) when parsing the content type of the request.
Possible solutions
We could use a more optimized data structure behind HttpHeaders. A data structure that uses equalsIgnoreCase or even case insensitive hashcode should be better. Many server implementations don't even use Map implementations, given that requests often contain just a few headers and the cost of setting up a Map outweighs the lookup performance gains.
Benchmarks show that even a new TreeMap<String, String>(String.CASE_INSENSITIVE_ORDER) shows better performance.
Other implementations tend to wrap the String instances to store their case insensitive hash internally and use that for comparisons.
For Spring WebFlux, we could even wrap the native HTTP request headers and never copy those values in other data structures. This requires dedicated implementations for each server (Tomcat, Jetty, Reactor Netty and Undertow), but since most of those are using pooled resources for headers, we might improve a lot the GC pressure applied for each HTTP exchange handling.
Microbenchmark results
Here are microbenchmark results on HttpHeaders variants, no other Spring infrastructure piece involved.
The "GET requests" case is reading a few headers from the request; the "POST requests" case is reading multiple times the "Content-Type" and "Content-Length" headers, as Spring does.
The baseline is just performing those read operations from an existing map. the other variants are doing what Spring is supposed to do, treating that map as the native request headers. In the regular HttpHeaders case, we're copying headers into our instance and performing the lookup operations. Other variants are trying the same with other map implementations.
This shows that the copying/allocations are using resources and changing the underlying implementation only slightly improves performance. This means we should instead look into leveraging the native headers directly and avoid copying in the first place.
Benchmark Mode Cnt Score Error Units
1. GET requests
MyBenchmark.baselineGetRequest thrpt 5 890408.644 ± 25882.489 ops/s
1. current HttpHeaders implementation
MyBenchmark.httpHeadersGetRequest thrpt 5 632354.108 ± 9054.069 ops/s
1. current implementation, but with a shortcut for linked map lookups
MyBenchmark.fixedCaseInsensitiveHeadersGetRequest thrpt 5 671559.477 ± 22871.242 ops/s
1. same thing, but backed by a case insensitive TreeMap
MyBenchmark.treeMapHttpHeadersGetRequest thrpt 5 734720.620 ± 17370.468 ops/s
1. same thing, but backed by a dedicated map implementation
MyBenchmark.headersMapHttpHeadersGetRequest thrpt 5 695399.245 ± 24594.272 ops/s
1. POST requests
1. baseline benchmark
MyBenchmark.baselinePostRequest thrpt 5 8672650.691 ± 135878.532 ops/s
1. current HttpHeaders implementation
MyBenchmark.httpHeadersPostRequest thrpt 5 994670.287 ± 19372.729 ops/s
1. current implementation, but with a shortcut for linked map lookups
MyBenchmark.fixedCaseInsensitiveHttpHeadersPostRequest thrpt 5 1058808.241 ± 131617.683 ops/s
1. same thing, but backed by a case insensitive TreeMap
MyBenchmark.treeMapHttpHeadersPostRequest thrpt 5 1525237.824 ± 27530.672 ops/s
1. current implementation, but caching the content type once resolved
MyBenchmark.httpHeadersContentTypeCachePostRequest thrpt 5 1584504.796 ± 65863.946 ops/s
1. caching the content type + backed by a dedicated map implementation
MyBenchmark.optimizedHttpHeadersPostRequest thrpt 5 2310686.814 ± 93787.128 ops/s
Affects: 5.1 RC2
Reference URL: reactor/reactor-netty#392
Issue Links:
Referenced from: commits 10d5de7, 58b3af9, f12c28e, ce7278a
0 votes, 7 watchers
Brian Clozel opened SPR-17250 and commented
In a recent reactor netty issue, small apps are being used to benchmark different frameworks. While working on that, Violeta Georgieva found a few interesting points we could take a look at, as opportunities for performance improvements and micro-benchmark cases.
Hotspots
The benchmark has underlined a few hostpots in the Spring Framework codebase:
org.springframework.util.LinkedCaseInsensitiveMap.convertKey(String)(own time, 5th overall)org.springframework.util.MimeType.<init>(String, String, Map)(own time, 10th overall)LinkedCaseInsensitiveMap
The
convertKeymethod is used evenly bygetandputoperations on the map itself. This is callingString.toLowerCasemany times, which involves a complex logic (it depends on theLocale) and creates newStringinstances. In general, copying headers from the native request and looking them up in our data structure is not efficient.Request Content-Type parsing
HttpHeaders.getContentTypeis the main offender forgetoperations, called in several places:org.springframework.web.server.adapter.DefaultServerWebExchange.initFormData(ServerHttpRequest, ServerCodecConfigurer)org.springframework.web.reactive.function.BodyExtractors.contentType(HttpMessage)org.springframework.web.server.adapter.DefaultServerWebExchange.initMultipartData(ServerHttpRequest, ServerCodecConfigurer)org.springframework.http.codec.DecoderHttpMessageReader.getContentType(HttpMessage)In general, we're spending a lot of time on
org.springframework.util.MimeType.<init>(String, String, Map)andorg.springframework.http.MediaType.parseMediaType(String)when parsing the content type of the request.Possible solutions
We could use a more optimized data structure behind
HttpHeaders. A data structure that usesequalsIgnoreCaseor even case insensitivehashcodeshould be better. Many server implementations don't even useMapimplementations, given that requests often contain just a few headers and the cost of setting up aMapoutweighs the lookup performance gains.Benchmarks show that even a
new TreeMap<String, String>(String.CASE_INSENSITIVE_ORDER)shows better performance.Other implementations tend to wrap the
Stringinstances to store their case insensitive hash internally and use that for comparisons.For Spring WebFlux, we could even wrap the native HTTP request headers and never copy those values in other data structures. This requires dedicated implementations for each server (Tomcat, Jetty, Reactor Netty and Undertow), but since most of those are using pooled resources for headers, we might improve a lot the GC pressure applied for each HTTP exchange handling.
Microbenchmark results
Here are microbenchmark results on
HttpHeadersvariants, no other Spring infrastructure piece involved.The "GET requests" case is reading a few headers from the request; the "POST requests" case is reading multiple times the "Content-Type" and "Content-Length" headers, as Spring does.
The baseline is just performing those read operations from an existing map. the other variants are doing what Spring is supposed to do, treating that map as the native request headers. In the regular
HttpHeaderscase, we're copying headers into our instance and performing the lookup operations. Other variants are trying the same with other map implementations.This shows that the copying/allocations are using resources and changing the underlying implementation only slightly improves performance. This means we should instead look into leveraging the native headers directly and avoid copying in the first place.
Affects: 5.1 RC2
Reference URL: reactor/reactor-netty#392
Issue Links:
Referenced from: commits 10d5de7, 58b3af9, f12c28e, ce7278a
0 votes, 7 watchers