26 October 2015

Big-Oh: Constant vs Logarithmic

- Do you know what the quickest way to look up a string is?

- Mmmm. A hash table, apparently.

- Nope. Dude, it's well known that binary search is quicker!

- ?!

A conversation I had with one guy went something like that. (Cheers, Noel!) No doubt it sounds crazy, to put it mildly: a method with logarithmic asymptotic complexity outperforming a constant-time one. But that is only at first glance. The devil is in the details, as always...

From "Big Oh" notation that is commonly encountered to classify algorithms by their execution time, we know that hash tables gives the constant complexity (O(1)) in lookup data by their referential key. And also it's clear that binary search won't give you better performance due it's logarithmic (O(logN)) complexity. It's true, for sure, but it mostly forgotten that all these formulas based on certain computational model, and that model may do a certain corrections that can bring everything up side down. But lets see the real story, it may be more clearer than looks like.

Once I was lucky enough to take part in an interesting project: optimising a trading platform. At the moment we started, latencies were in the milliseconds. The application was designed poorly and the implementation was not much better, so it was not a big deal to bring it down to hundreds of microseconds. The real fun began after that. I should admit I had no real practical experience with optimising such demanding Java applications. Yes, I had studied and read something about it before, and I had some experience with minor tuning of various things, but nothing that stood beside what we faced then. In other words, as for many people in those days, this matter was a kind of "black magic" for me.

By its nature a trading platform is, give or take, an application that manages buy and sell orders. But to place an order there must be referential data, such as price and volume, about the products you are trading, right? That data may be fed at a crazy frequency and is very sensitive to latency; it must be processed as fast as possible. For that a hash map was chosen, holding buckets keyed by product name, i.e. a currency pair like "EUR/USD". Quite a reasonable decision, I would say. (Here I should admit a shortcut in the story, as the original solution was based on j.u.c.ConcurrentHashMap with the table built dynamically. Not such a smart decision when the product list does not change over time, right?) And we got stuck right there trying to make it better.
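To give an idea of the shape of that lookup, here is an illustrative sketch only; the class and field names are my assumptions, not the platform's real code:

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical price/volume snapshot keyed by currency pair name.
    final class Quote {
        final double price;
        final long volume;
        Quote(double price, long volume) { this.price = price; this.volume = volume; }
    }

    final class ReferenceData {
        // The product list is fixed, so the map can be built once up front.
        private final Map<String, Quote> byPair = new HashMap<>();

        void update(String pair, Quote quote) { byPair.put(pair, quote); }

        Quote lookup(String pair) {
            return byPair.get(pair);   // O(1)... plus the cost of hashing the key
        }
    }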

And here is where the conversation mentioned in the teaser happened. I do not know what played the key role: that the suggestion came from a really smart guy, or that I was simply stuck and happy to pick up any idea, even a crazy one... No matter, we tried it and it really was faster. We were too excited that day to analyse why. I did that a bit later.

The hash table really does have constant lookup complexity... Let's even imagine we have a perfect hash function, to take a shortcut. Still, it does not cost "1" as the O(1) suggests. To do a lookup in a hash table we must calculate the hash of the key we are looking for, and when the key is a string, that means a loop over all the characters of that string. The bigger the key, the more time we spend on the hash calculation. But if we look at the key we are using, i.e. a currency pair name, we see that in most cases the first couple of characters are enough to tell which bucket we need. Most liquidity providers, unless it is something exotic, trade a limited set of currency pairs, so it won't be a really big list. Leveraging this fact we can make each key comparison cheap enough that a binary search over the key list does better than a lookup in the hash table. Simple, isn't it?
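Here is a minimal sketch of the alternative (my reconstruction of the idea, not the project's actual code): the currency pairs sit in a small sorted array, and String.compareTo usually decides after the first differing character, so each comparison is far cheaper than hashing the whole key.

    import java.util.Arrays;

    // Sketch: lookup over a small, fixed, sorted set of currency pairs.
    // String.compareTo stops at the first differing character, so comparing
    // "EUR/USD" with "GBP/USD" touches only a character or two, while hashing
    // would always walk the whole key.
    final class PairTable<V> {
        private final String[] pairs;   // sorted, e.g. {"AUD/USD", "EUR/USD", ...}
        private final V[] values;       // values[i] belongs to pairs[i]

        PairTable(String[] sortedPairs, V[] values) {
            this.pairs = sortedPairs;
            this.values = values;
        }

        V lookup(String pair) {
            int i = Arrays.binarySearch(pairs, pair);   // O(log N) cheap comparisons
            return i >= 0 ? values[i] : null;
        }
    }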

Recently I was told that these days only algorithmic complexity matters: nobody cares how much this or that operation costs, and we have so much computing power that we can forget about such small things. Maybe so... I am too old and have missed that train. :) But look: we gained up to 18% just by varying the cost of the lookup operation, just by paying attention to the nature of the data we process. A few numbers below:

Benchmark                                (size)  Mode  Cnt   Score   Error  Units
ConstantVsLogarithmicBenchmark.forArray       1  avgt   30  54.385 ± 0.564  ns/op
ConstantVsLogarithmicBenchmark.forArray       3  avgt   30  57.259 ± 0.540  ns/op
ConstantVsLogarithmicBenchmark.forArray       5  avgt   30  61.407 ± 0.825  ns/op
ConstantVsLogarithmicBenchmark.forArray       9  avgt   30  62.473 ± 0.571  ns/op
ConstantVsLogarithmicBenchmark.forMap         1  avgt   30  65.126 ± 0.526  ns/op
ConstantVsLogarithmicBenchmark.forMap         3  avgt   30  65.515 ± 0.668  ns/op
ConstantVsLogarithmicBenchmark.forMap         5  avgt   30  64.683 ± 0.695  ns/op
ConstantVsLogarithmicBenchmark.forMap         9  avgt   30  64.431 ± 0.687  ns/op

where "size" is a size of currency pair list. And this is average time across all keys, in practice that is not an equal distribution. (Details see here). Algorithm is important for sure, but until we correctly understand the computational model it based on. What is good in one case, may be a bit improved in another. :)
