Friday, September 23, 2005

Very nice methods

Sometimes you just want to do something for all subsets of a particular collection (including the "empty" and "all" subsets). The typical approach that comes to mind is to count from 0 to 2^(collection size) - 1 (heh, or collection size mersenne), then sift through the bits of each counter value, including the element at a given index whenever the corresponding bit is set, and so on. You get the idea.

The other day I needed to take the products of all possible subsets of a collection of integers. I had already implemented the bit-sifting approach when I realized I had sinned against my own principles by being too clever when it was not necessary.

So I dumped all that bit-sifting code and instead followed what we actually do by hand. Consider the collection $A to: $Z, and all its possible subsequences (e.g.: 'ABCZ', 'JRTXY', 'F', etc). Clearly, the subsequences of $A to: $Z are exactly the subsequences of $B to: $Z, taken once prefixed with $A and once not prefixed with $A.

The same argument applies all the way down to $Z to: $Z. So all the number iteration becomes process stack recursion in terms of sending messages, without any of the bit-sifting arithmetic nonsense.

So, for my product problem, I implemented these methods in SequenceableCollection:

    allProductsTimes: aNumber
    do: aBlock
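      "Evaluate aBlock once with each possible product of the receiver's elements, multiplied by aNumber."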

      self
        allProductsTimes: aNumber
        startingAt: 1
        do: aBlock

    allProductsTimes: aNumber
    startingAt: anIndex
    do: aBlock
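      "Evaluate aBlock once with aNumber multiplied by each possible product of the elements from anIndex to the end."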

      anIndex > self size
        ifTrue: [aBlock value: aNumber]
        ifFalse:
          [self
            allProductsTimes: aNumber
            startingAt: anIndex + 1
            do: aBlock.
          self
            allProductsTimes: aNumber * (self at: anIndex)
            startingAt: anIndex + 1
            do: aBlock]


Isn't that absolutely beautiful?
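For instance, a quick workspace check (the literal array and the starting factor 1 are just for illustration):

    #(2 3 5)
      allProductsTimes: 1
      do: [:each | Transcript show: each printString; cr]

which prints the products of all eight subsets: 1, 5, 3, 15, 2, 10, 6 and 30.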

Since the elements of the collection could be polynomials or matrices, I can't assume the empty product is 1. An earlier version of the first method had to spell out the value of the empty product. But since the second method multiplies everything that follows by a value that is given, these new methods ended up having more functionality for free: whatever identity (or starting factor) makes sense just gets passed in as aNumber. Fantastic, fantastic.

Thursday, September 22, 2005

4 to the millionth power --- behold!

Please take a look at this. Come on, man... 4 to the millionth power isn't a challenge for a base-2 computer:

4^1,000,000 = (2^2)^1,000,000 = 2^2,000,000

or, equivalently,

1 bitShift: 2000000
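A quick check in a workspace, if you want to see the identity hold:

    (1 bitShift: 2000000) = (4 raisedTo: 1000000)   "answers true"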

Suggestion: try 1023^200,000 instead.

Sunday, September 18, 2005

Newton-Raphson common inefficiencies

If you need to calculate sqrt floor (the integer square root), the Newton-Raphson algorithm will typically give you a proper answer very quickly, provided you keep the following advice in mind.

Typically, the first value for the recursion is set to something like self // 2. But, my friend... that is not very insightful. If self = 2^1000, then the initial guess is set to 2^999 instead of something much more reasonable like 2^501. In that case, poor Newton-Raphson will have to burn through unnecessarily large integer multiplications and additions just to get to 2^501, from which point the real work begins... sigh... so please, use 1 bitShift: self highBit // 2 + 1.

I didn't suggest 2^500 as the initial guess because Newton-Raphson works by approximating from above, not from below.
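Put together, a minimal sketch of the whole thing (the selector name and the guard for receivers below 2 are my own choices):

    sqrtFloor
      "Answer the largest integer whose square does not exceed the receiver,
      using Newton-Raphson seeded from just above the true root."

      | guess next |
      self < 2 ifTrue: [^self].
      guess := 1 bitShift: self highBit // 2 + 1.
      [next := (guess + (self // guess)) // 2.
      next < guess] whileTrue: [guess := next].
      ^guess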

Interestingly, for smallish LargeIntegers, in VisualWorks it's faster to find the bits of the square root one at a time by using the fact that, for h = 2^n,

(r+h)^2 = r^2 + 2rh + h^2

can be calculated without any multiplications whatsoever, because both 2rh and h^2 are just bit shifts (you can initialize r = 1 bitShift: self highBit // 2, r^2 is obviously 1 bitShift: self highBit - (self highBit \\ 2), and then you can remember the new r^2 = (r+h)^2 from the previous step). Eventually, however, Newton-Raphson becomes more efficient.
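In sketch form, that shift-and-add approach looks roughly like this (I start from r = 0 and test every candidate bit from the top, which is slightly more conservative than the initialization described above):

    bitSqrtFloor
      "Answer the integer square root by deciding one bit at a time,
      keeping r and r squared up to date using only shifts and additions."

      | r rSquared n candidate |
      self = 0 ifTrue: [^0].
      r := 0.
      rSquared := 0.
      n := self highBit // 2.
      [n >= 0] whileTrue:
        [candidate := rSquared + (r bitShift: n + 1) + (1 bitShift: 2 * n).
        candidate <= self ifTrue: [r := r + (1 bitShift: n). rSquared := candidate].
        n := n - 1].
      ^r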

But if you are in a real hurry, then by all means use that floating point processor you have. Take the 50-whatever most significant bits of the large integer, convert them to a double precision floating point number, obtain a square root for that, then shift the answer back into place and use that value + 1 for the initial guess in your lovely Newton-Raphson implementation. Then Newton-Raphson won't have to burn through large integer arithmetic just to obtain the first 25-plus correct bits of the answer --- and since each iteration roughly doubles the number of correct bits...
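That seeding step might look like this (the 50-bit cut and the rounding up are deliberately conservative so the guess stays above the true root; the selector name is mine):

    newtonSeed
      "Answer an initial guess for the integer square root of the receiver,
      derived from a double precision approximation of its leading bits."

      | shift |
      self highBit <= 50 ifTrue: [^self asFloat sqrt ceiling + 1].
      shift := self highBit - 50.
      shift odd ifTrue: [shift := shift + 1].    "keep the shift even so halving it is exact"
      ^(((self bitShift: shift negated) + 1) asFloat sqrt ceiling + 1) bitShift: shift // 2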

This means that since SmallIntegers fit in a double precision floating point number, Newton-Raphson should be implemented for LargeIntegers only.

Friday, September 16, 2005

Creative (at least for me) use of exceptions

So... number crunching... one of the things I got done before I got this ugly flu was a sieved Fermat factoring algorithm. Essentially, if N is odd,

N = pq = (x+y) (x-y) = x^2 - y^2

And if you set m = N sqrt floor, then

N = m^2 + r

If you require that gcd(m, r) = 1 (a common factor of m and r would divide N, handing you a factor for free), then 0 < r < 2m because of m's maximality. Since we need to find pairs of x^2, y^2, we can increase m by j and see what remains. Thus,

N = (m+j)^2 - (j^2 + 2jm - r)

Getting bounds for j is not hard based on the difference between (x+y) and (x-y) --- it can't be just anything. So the task is, within the range of possible values of j, to find the values of j such that, for some k,

f(j) = j^2 + 2jm - r = k^2

Clearly, you look at f(j) modulo some well-chosen integers and determine which values of j cause f(j) to be a quadratic residue. The choice of moduli is too much to write about now, but there is a closed formula for the number of quadratic residues mod x based on the factorization of x, so you can see which moduli are most effective based on the ratio between the number of quadratic residues and the modulus.
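For example, the quadratic residues mod 16 can be tabulated directly in a workspace, just by squaring every residue:

    ((0 to: 15) collect: [:each | each squared \\ 16]) asSet
      "a Set holding 0, 1, 4 and 9: only 4 of the 16 residues are squares"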

Anyway. Let's say you start with 16, and you find that j = 0, 3 and 6 mod 16 cause f(j) to be a quadratic residue mod 16. You have eliminated 13/16 of the total candidates. But there are still so many values of j that you need to sieve further. So you choose, say, 27, and try again. But now you only try numbers mod 16*27 which are 0, 3 or 6 mod 16 to begin with. And so on, until each surviving residue class has just a few possible values of j left, at which point you brute-force them by checking whether f(j) really is a square.

Before, I had chosen to reuse my SchedulingSetQueue, so the thing would generate tons of modIntegers representing possible values of j. But the size of the queue got out of control, and the possibility of duplicates forced me to keep every modInteger seen by the queue in a Set. A ton of work!

So, since I was caching all the modIntegerRings with their quadratic residues anyway, I thought I could replace all that recursion flattening with a stack recursion strategy and make it go faster.

Faster indeed. But I had to overcome a problem. Once the sieve finds a factor, it needs to kill the whole recursion and exit. And of course, the recursion was implemented in terms of several methods. There was no single checkpoint where I could write the fast-out code. And passing around the necessary information so that the recursion could unwind itself felt disgusting.

Solution? Start the recursion inside on:do:, and watch for SievedFermatShouldStopEvent notifications. If one is signaled, ex return: nil and that's it.
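In sketch form (the notification class name is the real one; the other selectors and the factor instance variable are illustrative):

    findFactor
      "One handler around the whole recursion; none of the methods below
      need to know anything about how to unwind when a factor shows up."

      [self sieveAllCandidates]
        on: SievedFermatShouldStopEvent
        do: [:ex | ex return: nil].
      ^factor

And deep inside whichever sieving method stumbles onto a divisor:

      factor := aDivisor.
      SievedFermatShouldStopEvent new signal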

Fantastic! So I can just write code as if the normal case is all that happens, and break right out of it by means of signaling an exceptional event such as "factor found --- stop!". So clean, so nice!

Thursday, September 15, 2005

asm in Smalltalk

I'm back into number crunching. One of the things that I thought would be useful for some situations would be the ability to write pseudo-asm or C inside a method, and let some GCC-like compiler behind the scenes do the dirty work of producing native code out of that.

I don't mean Slang. I mean something like {c code statements;} or {assembler statements} right inside the method, with the compiler figuring out all the name references.

I don't like the separation between C-world and Smalltalk-world. I don't want to switch to another editor, write some stuff, then try to make a VM plugin that doesn't blow the whole thing away because of some call protocol that has a bug, etc. I do not want to see the compile-try-crash-debug cycle again. Why can't I just write the C code I need in a method, test it with SUnit, and if it works then all is good and that's it?

Sunday, September 04, 2005

NOLA --- open your eyes once and for all!

Does it take writing stuff like this to make others kick the dope of arrogance? Seriously guys, it's very very late and it's time to wake up already, ok?

I don't think people have realized what this disaster is yet. It reminds me of the large groups of dumb mammals which watch their kin being eaten by the lions from a safe distance thinking it's fine because it wasn't them.

How disgusting.

Large folders in NTFS, part 2

A while ago I commented that, on my fast hard drives, reading an unfragmented folder with about 55k files took over a minute the first time, and 17 seconds the second time after everything was cached.

I subsequently burned all those files to a DVD. Reading that folder from the DVD for the first time without any caching whatsoever took less than 10 seconds.

There must be some scenario in which the current NTFS strategy wins, but... hard drive vs. DVD and the optical drive wins, even when the hard drive gets the advantage of everything cached against a first read? Is NTFS so featureful that it has to be slow?

So my friend... NTFS for file server work? You have to be kidding me.