Sunday, June 29, 2008

Printing very large integers faster

The Fibonacci comment thread included some comments about how much time it takes to print very large integers in base 10. I went through the existing implementation, and using F(1953125) as a test case for the profiler, I managed to get the printString about 8% faster than the base VW code. The same happens with F(390625).

The mechanism I used was basically the same than the existing one: divide the receiver in half and try again until you hit small integers. What I did however was to tune the integer division process, hold all the powers of 10 involved, use the stack as I did with CharacterArray>>match: in the mentoring course book, and abuse the heck out of polymorphism.

I think it's great, but certainly it's not your "margin note" implementation. Here we go...

    Integer>>printBaseTenOn: aStream

      self < 0
        ifTrue:
          [
            aStream nextPut: $-.
            0 - self printPositiveBaseTenOn: aStream
          ]
        ifFalse: [self printPositiveBaseTenOn: aStream]


    Integer>>printBaseTenString

      | answer |
      answer := (String new: 16) writeStream.
      self printBaseTenOn: answer.
      ^answer contents


    LargeNegativeInteger>>printBaseTenOn: aStream

      aStream nextPut: $-.
      0 - self printPositiveBaseTenOn: aStream


    LargePositiveInteger>>printBaseTenOn: aStream

      self printPositiveBaseTenOn: aStream


    LargePositiveInteger>>printPositiveBaseTenOn: aStream

      | initialChunkSize initialChunks |
      initialChunkSize := 10 raisedTo:
        (SmallInteger maxVal log: 10) truncated // 2.
      initialChunks := OrderedCollection
        with: initialChunkSize
        with: initialChunkSize squared.
      self
        printFirstChunkPositiveBaseTenOn: aStream
        chunkSizes: initialChunks
        chunkIndex: 2


    LargePositiveInteger>>
      printFirstChunkPositiveBaseTenOn: aStream
      chunkSizes: chunks
      chunkIndex: chunkIndex

      | chunkSize |
      chunkSize := chunks at: chunkIndex.
      chunkSize highBit * 4 < self highBit
        ifTrue:
          [
            chunks size = chunkIndex
              ifTrue: [chunks add: chunkSize squared].
            self
              printFirstChunkPositiveBaseTenOn: aStream
              chunkSizes: chunks
              chunkIndex: chunkIndex + 1
          ]
        ifFalse:
          [
            | firstChunk |
            firstChunk := self // chunkSize.
            firstChunk
              printFirstChunkPositiveBaseTenOn: aStream
              chunkSizes: chunks
              chunkIndex: chunkIndex - 1.
            self - (firstChunk * chunkSize)
              printPositiveBaseTenOn: aStream
              chunkSizes: chunks
              chunkIndex: chunkIndex - 1
          ]


    LargePositiveInteger>>
      printPositiveBaseTenOn: aStream
      chunkSizes: chunks
      chunkIndex: chunkIndex

      | chunkSize |
      chunkSize := chunks at: chunkIndex.
      chunkSize highBit * 4 < self highBit
        ifTrue:
          [
            chunks size = chunkIndex
              ifTrue: [chunks add: chunkSize squared].
            self
              printPositiveBaseTenOn: aStream
              chunkSizes: chunks
              chunkIndex: chunkIndex + 1
          ]
        ifFalse:
          [
            | firstChunk |
            firstChunk := self // chunkSize.
            firstChunk
              printPositiveBaseTenOn: aStream
              chunkSizes: chunks
              chunkIndex: chunkIndex - 1.
            self - (firstChunk * chunkSize)
              printPositiveBaseTenOn: aStream
              chunkSizes: chunks
              chunkIndex: chunkIndex - 1
          ]


    SmallInteger>>printPositiveBaseTenOn: aStream

      self < 10
        ifTrue: [aStream nextPut: (Character digitValue: self)]
        ifFalse:
          [
            self // 10 printPositiveBaseTenOn: aStream.
            self \\ 10 printPositiveBaseTenOn: aStream
          ]


    SmallInteger>>
      printFirstChunkPositiveBaseTenOn: aStream
      chunkSizes: chunks
      chunkIndex: chunkIndex

      self printPositiveBaseTenOn: aStream


    SmallInteger>>
      printPositiveBaseTenOn: aStream
      chunkSizes: chunks
      chunkIndex: chunkIndex

      | chunk toGo |
      chunk := chunks at: chunkIndex + 1.
      toGo := self.
      [chunk = 1] whileFalse:
        [
          chunk := chunk // 10.
          aStream nextPut: (Character digitValue: toGo // chunk).
          toGo := toGo \\ chunk
        ]


For some reason, the switch from printFirstChunk to normal printing makes me think about electron energy levels.

No comments: