Sunday, June 29, 2008

The Fibonacci showdown: F(a+b) vs Dijkstra

The comment thread in this post has over 30 entries now! Something that did come up in those discussions is why not cache the Dijkstra recursion so it goes really fast, and how does that compare to the F(a+b) calculator.

For a fixed F(k), it seemed that caching Dijkstra made it run faster than F(a+b), even when F(a+b) was allowed to cache everything. So, to determine whether this is true in general, I did a random F(k) test.

The task is to calculate F(k) Fibonacci numbers with k in this array, and in the order shown:

#(1702587 1848601 135328 1144978 183508 1421520 734333 200747 1750499 1795081 307556 1726299 1911101 1997485 521440 1954962 979243 1793538 1633868 323412 1866722 612481 1155724 441908 1135563 1338921 815458 520627 873849 432030)

These numbers, between 0 and 2049151, were devised using the MinimumStandardRandom (Park-Miller) RNG in VW 7.4.1. So here we go. The first run time numbers are:

  • Non cached Dijkstra: 191.4 seconds.
  • Cached Dijkstra: 162.5 seconds.
  • Cached F(a+b): 158.2 seconds.
  • Cached and Greedy F(a+b): 156.2 seconds.
Since the cached Dijkstra and the Greedy F(a+b) calculators will take no time if asked to do this again because they cache every F(k), and since the non cached Dijkstra time will not change because it's not cached, it only makes sense to run this a second time for the cached F(a+b) calculator which is only caching F(k) triplets around powers of two indices.
  • Cached F(a+b): 148.2 seconds.
So, performance wise, if one is going to be calculating random F(k) numbers, the best choice is to go with the greedy F(a+b) calculator. The next best choice is to use the cached F(a+b) calculator. After that comes the cached Dijkstra calculator, and finally the non cached Dijkstra calculator.

At this point, it may be useful to examine how many entries the cached calculators have in their caches.
  • Cached Dijkstra: 1116 entries.
  • Cached and Greedy F(a+b): 1117 entries.
  • Cached F(a+b): 107 entries.
So, the cached F(a+b) calculator beats Dijkstra's cached calculator using 10x less cache entries. While doing so, it also manages to be very close to the greedy F(a+b) calculator.

As such I'd be inclined to say that, for general F(k) calculation use, the cached F(a+b) is a good first choice.

Update: I also tried Hakmem item #12, but it seems considerably slower than F(a+b) and Dijkstra's method.

10 comments:

Suspect said...

I wrote a version of the calculator which uses the closed form phi-based expression for fib(n).

I ran into two problems:
(1) Obtaining the value of sqrt(5) to the required degree of precision takes a long, long time. The expression only works, then, for values of n lesser than the one that sqrt(5) was calculated for; at N > n, the results are wrong, which means a recalculation of sqrt(5). I guess a fast spigot-based algo could be written for this part, but I'm too lazy :)

(2) Arbitrary precision floating point math is slow. Raising phi, with 81000 significant digits, to 100000, takes a long time. My greedy cached F(a+b) calculator gives that result instantaneously. Again, I did not time the results, but there was a noticeable speed difference.

Admittedly, I was using Python's standard Decimal library, and perhaps in Mathematica, etc., the results would have been better.

Andres said...

Suspect,

Yeah... I was thinking that those problems would crop up. In those cases, it seems to me (at least at first sight) that going the integer route is better.

Sooner or later, though, I will have to take a look at the VisualWorks primitives and see what Knuth algorithm suits the requirements better.

Andres.

Anonymous said...

Why don't you try Gosper&Salamin's method (HAKMEM Item 12)?
I tried it and got the result for that list of numbers in 9s.

Andres said...

Anonymous,

> Why don't you try Gosper&Salamin's method (HAKMEM Item 12)?

Because I wasn't aware of it :). I tried to find such things with Google, but for some reason I do not remember encountering this one.

> I tried it and got the result for that list of numbers in 9s.

Sure, I'll try and see how it compares to the others.

Andres.

Andres said...

Anonymous,

I just implemented Hakmem 12. I had to stop it at number 17 because it had already spent something like 10 minutes running.

This makes me wonder if I am not doing something stupid performance wise, I find it hard to believe that large integer multiplication is THAT different, performance wise. In fact, I am getting the feeling that the issue is somewhere else.

In general, however, it seems that the F(a+b) or Dijkstra's method are faster than Hakmem 12.

Andres.

Andres said...

Now I think I know what's going on with big integer multiplication: the VM knows nothing of Karatsuba's method much less of the FFT method for truly huge numbers. So really, these times are grossly affected by the poor algorithm currently used.

Sigh... more homework :).

Andres.

Anonymous said...

First, when using Gosper&Salamin you should use fast exponentiation (you probably already did), i.e., repeated squaring and multiplying.

And yes, you need fast bignums. I was using GHC which uses GMP.

nicolas cellier said...

See also Fibonacci-nice on public store.

I would expect MatrixIteration to perform better, but that's not the case.

I guess if some large integers are:
a > b > c > d ...

it then would be better to evaluate:
(a*d) * (b*c)

rather than:
((a * b) * c) * d

This can explain this relative lack of performance (globally efficiency comparable with yours, sometimes outperforms with lucky bit patterns...).

Andres said...

Anonymous,

> First, when using Gosper&Salamin you should use fast exponentiation (you probably already did), i.e., repeated squaring and multiplying.

Yep, that's what I did.

> And yes, you need fast bignums. I was using GHC which uses GMP.

Ah I see... I found out the VM yet does not know about Karatsuba much less FFT... yet :). Sooner or later I'll try to get this addressed.

Andres.

Andres said...

Nicolas,

Note that (although I do not remember exactly which, and I am in a bit of a hurry because I need to check out from this hotel in 25 minutes) one of the recursions already shown in this post is the simplified version of the matrix method --- only without the matrix :).

Andres.