Friday, September 23, 2005

Very nice methods

Sometimes you just want to do something for all subsets of a particular collection (including the "empty" and "all" subsets). The typical stuff that comes to mind is to count from 0 to 2^collection size - 1 (heh, or collection size mersenne), then sift through the bits and add stuff at the proper index when the proper bit is 1, etc. You get the idea.

The other day I needed to take the products of all possible subsets of a collection of integers. I had already implemented the bit-sifting approach when I realized I had sinned against my own principles by having been too clever when it had not been necessary.

So I dumped all that bit-sifting code and, instead, followed what we actually do. Consider the collection $A to: $Z, and all its possible substrings (e.g.: 'ABCZ', 'JRTXY', 'F', etc). Clearly, all the substrings of $A to: $Z are all the possible substrings of $B to: $Z, once prefixed with $A and once not prefixed with $A.

The same argument applies all the way down to $Z to: $Z. So all the number iteration becomes process stack recursion in terms of sending messages, without any of the bit-sifting arithmetic nonsense.

So, for my product problem, I implemented these methods in SequenceableCollection:

    allProductsTimes: aNumber
    do: aBlock

      self
        allProductsTimes: aNumber
        startingAt: 1
        do: aBlock

    allProductsTimes: aNumber
    startingAt: anIndex
    do: aBlock

      anIndex > self size
        ifTrue: [aBlock value: aNumber]
        ifFalse:
          [self
            allProductsTimes: aNumber
            startingAt: anIndex + 1
            do: aBlock.
          self
            allProductsTimes: aNumber * (self at: anIndex)
            startingAt: anIndex + 1
            do: aBlock]


Isn't that absolutely beautiful?

Since the elements of the collection could be polynomials or matrices, then I can't assume the empty product is 1. The first method used to refer to the value of the empty product. But since the second method multiplies everything that follows by something that is given, these new methods ended having more functionality for free. Fantastic, fantastic.

3 comments:

michaelw said...

Wouldn't the code be more beautiful if the code would be factored a little more?

See my powerset solution.

Andres said...

I took a look at it. I had a hard time understanding the meaning behind the names a, b, x, s, and what seemed to me as operators of some sort: ->, =>, [a].

It seems to me that I could use very short names such as a, b, x, s, and that this would produce much tighter looking code. However, I am concerned that this would be at the expense of clarity which I consider very valuable.

Does the Haskell version need to be modified to accept the value of the empty product as an argument? What would you need to write to iterate through the products with an arbitrary piece of code (the aBlocks in Smalltalk)? Would it work if the collection had stuff other than integers (such as modular integers, polynomials, functions or matrices)? Sorry about my ignorance about Haskell, please let me know!

Incidentally, the powerset solution seems to build all the possible subsets of the collection in question. With a collection of size 20, it would create (I am no Haskell expert!) 2^20 collections. Is that so? This was something I explicitly wanted to avoid.

Regarding the lines of code, I would suggest that overall we're pretty close.

This is the first time someone takes something I wrote in Smalltalk and translates it to another language... it's exciting! :). Please let me know about the questions, I am curious!

michaelw said...

Because of the size I try to address your questions on my site.