These days, there are more tools to play with benchmarks. No doubts, its really hard to do especially if they are micro. Let's cheers up an JMH. Really pretty good and easy to use framework for a writing the micro benchmarks for a JVM. To play with, I decided to come back to my old investigation about the Spring proxying, I did some time ago. But now, the list of targets are extended and tests are more representative.
So, the purpose was to compare the cost of proxying the POJOs by different instruments widely used nova days. The list from the original was slightly extended so its looks like:
- Direct call - just an explicit call, for comparison purposes, o.k.e.CatBenchmark.t1_direct,
- Generated bytecode (JavAssist) - "manually" generated proxy with aids of JavAssist, o.k.e.CatBenchmark.t2_javassist,
- Generated bytecode (CGLib) - the same, except CGlib, o.k.e.CatBenchmark.t3_cglib,
- JDK's Dynamic proxy - proxy provided by default JDK's dynamic proxy, o.k.e.CatBenchmark.t4_dynamic,
- Hibernate proxy - proxy provided by Hibernate, o.k.e.CatBenchmark.t5_hibernate,
- Spring proxy (CGLib) - proxy, provided by Spring, o.k.e.CatBenchmark.t6_springOpt,
- Spring proxy (Dynamic) - "optimized" proxy, provided by Spring, o.k.e.CatBenchmark.t7_spring,
Here we need to make a couple remarks: a) Used framework allows achieve pretty stable results that, I think, could be trusted. Measurements were made for different "load", from simplest one, execution of which was comparable to the cost of proxying, to more complex case, where cost was amortized by weight of calculations (i.e. calculations took significantly more time then time spent on pure proxying). b) The version of competitors where up to dated to be more actual.
The code is available for inspecting here. Any notes / comments are welcome. :)
So, the 1st experiment was about proxying the hashing of the "I said meow!" string.
The result on my old laptop under JDK 1.8 u20 looks like:
Benchmark Mode Samples Score Score error Units o.k.e.CatBenchmark.t1_direct avgt 30 9.762 0.047 ns/op o.k.e.CatBenchmark.t2_javassist avgt 30 10.268 0.035 ns/op o.k.e.CatBenchmark.t3_cglib avgt 30 10.643 0.050 ns/op o.k.e.CatBenchmark.t4_dynamic avgt 30 16.760 0.136 ns/op o.k.e.CatBenchmark.t5_hibernate avgt 30 25.991 0.167 ns/op o.k.e.CatBenchmark.t6_springOpt avgt 30 27.275 0.150 ns/op o.k.e.CatBenchmark.t7_spring avgt 30 53.929 0.423 ns/op
For a self check, lets look at the border case, the Spring's dynamic proxy. Its cost, as we may see is 53.929 - 9.762 = 44.176±0.47 ns. Unexpectedly good looking the CGLib, its stands not so far from explicit invocation.
The hash calculation is pretty straight forward. Just iterating over character with some magic of prime numbers. So, the next step was to increase the number of iterations in 10 times:
Benchmark Mode Samples Score Score error Units o.k.e.CatBenchmark.t1_direct avgt 30 137.933 0.417 ns/op o.k.e.CatBenchmark.t2_javassist avgt 30 138.760 0.508 ns/op o.k.e.CatBenchmark.t3_cglib avgt 30 139.164 0.389 ns/op o.k.e.CatBenchmark.t4_dynamic avgt 30 147.382 0.655 ns/op o.k.e.CatBenchmark.t5_hibernate avgt 30 157.001 0.828 ns/op o.k.e.CatBenchmark.t6_springOpt avgt 30 158.625 0.759 ns/op o.k.e.CatBenchmark.t7_spring avgt 30 184.689 0.988 ns/op
And we may see, the time per operations is really about 10 times more. And observed cost is still reasonable: 46.756±1.405 ns.
And last case is about relatively heavyweight calculation of CRC32 of string from previous experiment:
Benchmark Mode Samples Score Score error Units o.k.e.CatBenchmark.t1_direct avgt 30 207.325 1.435 ns/op o.k.e.CatBenchmark.t2_javassist avgt 30 206.799 1.581 ns/op o.k.e.CatBenchmark.t3_cglib avgt 30 208.403 1.069 ns/op o.k.e.CatBenchmark.t4_dynamic avgt 30 217.890 1.144 ns/op o.k.e.CatBenchmark.t5_hibernate avgt 30 233.030 2.577 ns/op o.k.e.CatBenchmark.t6_springOpt avgt 30 231.929 1.956 ns/op o.k.e.CatBenchmark.t7_spring avgt 30 259.991 2.291 ns/op
With cost: 52.666±3.726 ns.
So, the worst case, Spring's proxy based on dynamic proxy costs ~50 ns. No so bad, right? Last time it was in twice as bigger, there is definite progress between Spring v. 2.5.6 and v. 4.0.6. But that is just a proxying a call... We didn't consider losses on bunch of JIT optimisations that just impossible to do on semi-reflective call wrapping implied with CGLib usage. And this is only 1 call. Modern applications uses tons of Spring config files, where interconnections may build the whole tree of proxy instances.
P.S. Was pointed to the nice explanation of "black hole". Definitely more "realistic" tool to measure such kind of experiment.
No comments:
Post a Comment