Saturday, March 23, 2013

HPS source size over time

When I joined Cincom in 2007, source code cleanup for our VM was one of the engineers' first priorities because cruft was getting in our way too much.  This cleanup might sound easy, but it takes a great deal of effort precisely because so much cruft had accumulated.  What tends to happen is that one thing leads to another, and all of a sudden what was meant as the removal of an obsolete feature involves investigating how every compiler in use reacts to various code constructs.  Then you also find bugs that had been masked by the code you're trying to remove, and these take even more time to sort out.

From a features point of view, it's unrewarding work because at the end of the day you have what you had before.  The key difference, which becomes observable over time, is that when you put in this kind of work, support call volume starts going down, random crashes stop happening, and code cruft no longer gets in your way, so you don't even have to research it.  As a result, the fraction of time you can spend adding new features goes up.

We've been at it for over 6 years now, and we can clearly see the difference in our everyday work.  Take a look:

  • VW 7.4 (2005): 337776 LOC.
  • VW 7.4 (2005): 338139 LOC.
  • VW 7.4a (2006): 358636 LOC.
  • VW 7.4b (2006): 358957 LOC.
  • VW 7.4c (2006): 359419 LOC.
  • VW 7.4d (2006): 358782 LOC.
  • VW 7.5 (2007): 358921 LOC.
  • VW 7.6 (2007): 357264 LOC.
  • VW 7.6a (2008): 350093 LOC.
  • VW 7.7 (2009): 345618 LOC.
  • VW 7.7a (2010): 270093 LOC.
  • VW 7.7.1 (2010): 270124 LOC.
  • VW 7.7.1a (2010): 270119 LOC.
  • VW 7.8 (2011): 261580 LOC.
  • VW 7.8a (2011): 261611 LOC.
  • VW 7.8b (2011): 261739 LOC.
  • VW 7.8.1a (2011): 261748 LOC.
  • VW 7.9 (2012): 252309 LOC.
  • VW 7.10 (2013): 240880 LOC (March).
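For what it's worth, LOC figures like these can be reproduced with a simple physical line count over a source tree.  The sketch below is only illustrative: the exact counting rules behind the HPS numbers aren't stated here, and the file extensions are an assumption.

```python
import os

def count_loc(root, exts=(".c", ".h")):
    """Count physical lines in matching files under root.

    A rough LOC measure: every line counts, including blanks and
    comments.  Tools that skip those (e.g. cloc) report lower numbers.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(exts):
                path = os.path.join(dirpath, name)
                with open(path, "rb") as f:
                    total += sum(1 for _ in f)
    return total

if __name__ == "__main__":
    print(count_loc("."))
```

Run at the root of a source tree, it prints a single total; the absolute number depends heavily on which extensions you include and whether blanks and comments count.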
As you can see, from the high water mark of 359419 LOC to today's 240880 LOC, we have removed 33% of the VM's source code.  Think of it: wouldn't it be nice to drop a third of your source code while killing a ton of bugs and adding new features?  Also, using the standard measuring stick of "400 page book" = "20k LOC", we can see HPS went from requiring 18 books to 12.
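The arithmetic behind those figures is easy to check, using the 20k-LOC-per-book yardstick from the paragraph above:

```python
# Sanity-check the reduction figures quoted above.
peak, current = 359419, 240880  # LOC at the 7.4c high water mark and in March 2013
loc_per_book = 20000            # the "400 page book" yardstick

reduction = (peak - current) / peak
print(f"removed {reduction:.0%} of the VM source")  # removed 33% of the VM source
print(f"{peak / loc_per_book:.0f} books -> {current / loc_per_book:.0f} books")  # 18 books -> 12 books
```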

We still have more code deletions queued up for 7.10, which are associated with various optimizations and bug fixes.  With a bit of luck, we'll reach 12k LOC (that is, about 240 printed pages) deleted in this release cycle.

Update: we went into code freeze in preparation for 7.10.  Here's an update on the code deletion.
  • VW 7.10 (2013): 240368 LOC (April code freeze).
Another 500 LOC bit the dust since March, and the VM executable became a couple of kilobytes smaller too.  That puts us at basically 12k LOC deleted for the whole release.

Update 2: we gained about 500 LOC for 7.10, but only in exchange for IPv6 functionality.  I'll update the LOC count later since we know there are a few more fixes that will go in.

Comments:

John Dougan said...

What does the compiled size look like (stripped binary)?

FDominicus said...

That really is a very good job. I guess I know what I'm talking about. I bet the code in my old system could be cut down even more aggressively, with that much code duplication I have. But it's always a problem to improve if you do not have tests.

Yes, I know that should not be. But let's be honest: in how many old software packages do you really have tests?

Andrés said...

@John Dougan: it's hard to measure because some of the changes include cleaning up makefiles (compiler options), or changing compilers, or changing compilation platforms (SDKs, libraries)...

But let me give you an example that is representative of at least some of these code deletions. The latest set of changes we put in cleans up and optimizes various FFI-related primitives and similar things. The Windows VM (in my compilation environment) is about 2 KB smaller as a result.

Of course, some other code deletions cannot really be measured because, for example, if you remove the Motorola 68000 code generator, then there's nothing to measure binary-wise on non-68k platforms.

Andrés said...

@FDominicus: funny that you mention tests... I wrote the vast majority of the VM tests we run with every build. The test count is between 3800 and 4800 right now, depending on the platform. Don't tell anyone, but I'm about to integrate a set of new tests that will significantly increase that test count... ;).

I wouldn't say everything has tests. However, a lot of the new functionality we introduce does have tests (some bits just can't be tested mechanically in a reasonable manner). We also write tests that expose existing bugs so that they don't happen again.

John Dougan said...

I was interested from the point of view of memory bandwidth consumption. Sometimes small code in the source blows up into large code in the binary. Obviously this is compiler and optimization setting dependent. Occasionally, I have had code run faster with compiler flags set to optimize for size instead of speed.

Andrés said...

@John Dougan: in some cases, the code deletion does result in less executable code, which in turn runs faster. However, sometimes it's complicated to measure the effect in specific cases. Does the resulting smaller code run very frequently? Does making it go away bring other code sections closer together? Even if the code doesn't run, does the resulting binary need fewer cache line reads?

In general, we tend to follow the advice in Intel's manuals: less code size is generally better, and fewer instructions (by count) are generally better. In specific cases in which we can see the code will get hot, we measure the specific improvement (e.g. ~30% speedup for large integer primitives because of better instructions on big endian platforms). We might also accept a 5% performance loss in a very infrequent execution path if we can delete a large enough amount of code. At the same time, I have added about 20 KB of source code to the VM (with its associated few kilobytes more of executable code) to get up to 40% faster GC in the common case. It depends.

Andrés said...

@FDominicus: speaking of tests, we basically doubled the number of VM tests from ~4500 to over 9000 in this release. This is up from zero stringent tests at the time I joined. It's amazing what you can find when you write rigorous tests...

FDominicus said...

"It's amazing what you can find when you write rigorous tests..."

That's something one probably doesn't have to just take on faith ;-)

It's obvious that you find more with better tests, and yet it is still frustrating to find bugs not covered by tests, isn't it?

Andrés said...

I'd say I've seen quite a bit of not *wanting* to believe that rigorous tests will find more bugs, for a variety of reasons... oh well, reality will catch up with that sooner or later :).

When I write a new set of tests and find bugs, I like having the opportunity to fix them, especially before customers find out the bugs were there. The new tests also serve as extra insurance against inadvertently introducing new bugs.

I suspect a lot of the above has to do with the fever to add new features irrespective of the long term cost. It seems fun, and it seems like you're making a lot of progress at first. For longer term projects, though, paying little attention to the details in the beginning can have a disastrous effect.