Sunday, August 09, 2009

Super size that software!

Could anybody explain to me why shortcut icons for things like c:\program files\whatever\executable.exe are set by referencing c:\windows\installer\{e7e8eb0e90e-fidvihbd-jdsgvdsjhs}blah.exe, as opposed to something more reasonable such as c:\program files\whatever\executable.exe? Why would you want to keep the installer cache for today's huge programs around? If you delete those cached files, your shortcuts are damaged and need fixing, even though the icons are perfectly available at the shortcut's target. Skype does it one better and provides read-only shortcuts, so it's not clear how you can fix them. This seems nonsensical to me. What's the point of all this "protection"?
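
For what it's worth, a shortcut whose icon points at a deleted installer file can be repaired by aiming the icon back at the shortcut's own target. The C sketch below does this through the standard IShellLink and IPersistFile COM interfaces; the .lnk path is a made-up placeholder and the error handling is skeletal, so treat it as an illustration rather than a finished tool.

    /* fixicon.c - repoint a shortcut's icon at the shortcut's own target.
       Build (MinGW): gcc fixicon.c -o fixicon -lole32 -luuid
       The .lnk path below is a placeholder; substitute your own. */
    #define COBJMACROS
    #include <windows.h>
    #include <shobjidl.h>
    #include <stdio.h>

    int main(void)
    {
        const wchar_t *lnk = L"C:\\Users\\Public\\Desktop\\Whatever.lnk";  /* placeholder */

        CoInitialize(NULL);

        IShellLinkW *link = NULL;
        if (SUCCEEDED(CoCreateInstance(&CLSID_ShellLink, NULL, CLSCTX_INPROC_SERVER,
                                       &IID_IShellLinkW, (void **)&link))) {
            IPersistFile *file = NULL;
            if (SUCCEEDED(IShellLinkW_QueryInterface(link, &IID_IPersistFile,
                                                     (void **)&file))) {
                /* Read-only shortcuts (the Skype case) must be made writable first. */
                SetFileAttributesW(lnk, FILE_ATTRIBUTE_NORMAL);

                if (SUCCEEDED(IPersistFile_Load(file, lnk, STGM_READWRITE))) {
                    wchar_t target[MAX_PATH];
                    /* Ask the shortcut for its real target... */
                    if (SUCCEEDED(IShellLinkW_GetPath(link, target, MAX_PATH,
                                                      NULL, SLGP_RAWPATH))) {
                        /* ...and use that executable's first icon instead of the
                           copy cached under c:\windows\installer. */
                        IShellLinkW_SetIconLocation(link, target, 0);
                        IPersistFile_Save(file, lnk, TRUE);
                        wprintf(L"icon now points at %ls\n", target);
                    }
                }
                IPersistFile_Release(file);
            }
            IShellLinkW_Release(link);
        }
        CoUninitialize();
        return 0;
    }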

Uninstalling Acrobat 6 left about 60 MB worth of .bak files in Program Files (copies of executables, DLLs, etc.). Uninstalling Acrobat 7 left numerous empty folders around. And the latest incarnation, Acrobat 9, weighs in at 244 MB. At the same time, there are standalone PDF viewers that require on the order of 10 MB and conveniently ignore Acrobat's irritating limitation that its reader will only annotate PDF files created by Acrobat.

With this kind of software, of course we need a 1 TB laptop hard drive and 8 execution cores just to keep up with the mess. But note that memory speeds have not kept pace. So, when you look at performance charts, adding a 4th core does not give you the ideal 33% boost over three cores but rather 25% at best, and even that only after more complicated software has been written to make use of all the computation capacity.

And even then, the improvements in software do not necessarily matter much. Past about 8 cores, memory-intensive applications actually suffer performance degradation because the memory subsystem cannot cope with the bandwidth demanded by the CPUs. With today's data sets and application sizes, essentially every app is memory intensive. For example, why do Apple's Dashboard applets need to reserve on the order of 370 MB of memory?
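
That bandwidth wall is easy to see with a toy benchmark. The sketch below (plain C with OpenMP; the array size and the doubling thread counts are arbitrary choices for illustration) times a triad-style loop at increasing thread counts. On typical desktop hardware the measured GB/s stops improving well before the core count runs out, because every core is waiting on the same memory bus.

    /* triad.c - a crude look at how a memory-bound loop scales with threads.
       Build: gcc -O2 -fopenmp triad.c -o triad
       N and the thread counts are arbitrary, for illustration only. */
    #include <omp.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define N (32 * 1024 * 1024)   /* 32M doubles = 256 MB per array */

    int main(void)
    {
        double *a = malloc(N * sizeof *a);
        double *b = malloc(N * sizeof *b);
        double *c = malloc(N * sizeof *c);
        if (!a || !b || !c) return 1;

        for (long i = 0; i < N; i++) { b[i] = 1.0; c[i] = 2.0; }

        int max_threads = omp_get_max_threads();
        for (int threads = 1; threads <= max_threads; threads *= 2) {
            omp_set_num_threads(threads);
            double t0 = omp_get_wtime();

            /* Very little arithmetic per byte moved: this loop is limited by
               memory bandwidth, not by how many cores are doing the math. */
            #pragma omp parallel for
            for (long i = 0; i < N; i++)
                a[i] = b[i] + 3.0 * c[i];

            double gbytes = 3.0 * N * sizeof(double) / 1e9;  /* read b, c; write a */
            printf("%2d threads: %6.2f GB/s\n",
                   threads, gbytes / (omp_get_wtime() - t0));
        }

        free(a); free(b); free(c);
        return 0;
    }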

At times it seems weird to me that we refuse to make good use of the free, massively parallel computation device called a brain, and instead keep inventing technologies in search of a problem to solve. In the same vein, I have anecdotal evidence that a major corporation funds software projects only when they can be completed in 3 months and pay for themselves in 6 months, after which it is perfectly fine to throw the shiny new app away because it is no longer an investment loss. Consequently, a non-trivial amount of code is periodically rewritten from scratch. So much for the actual value of software these days...

On a more positive note, this situation should not last forever. Eventually, we'll hit the on-chip cache limit past which CPUs draw too much electricity. Eventually, we'll hit the miniaturization limit (we're about 10 years away at Moore's Law speeds). Eventually, we'll hit the areal density limit for hard drives (did you know the hard drive industry aimed for a doubling every year or so?). Eventually, we will find out that Facebook consumes too much electricity for the valuable(?) service it provides. Eventually, it all comes down to the costs of managing unnecessary complexity. Sooner or later it will be cheaper to think, which is what we should have been doing all along.

3 comments:

gulik said...

I couldn't agree more!

I think that the sweet spot for computers is around a 1 GHz CPU and 256 MB of RAM. That's where the hardware can be cheap, fanless and not too power-hungry. If standard desktop software needs more than that, then there's some serious bloat going on.

For storage, 2 GB is a huge amount if you're talking about application data only. Years ago, I could fit all my university work in 800 MB, including Debian, gcc, emacs, latex, etc.

Sure, there are some applications that need a massive computer, but word processors, email clients, PDF viewers, heck... even compilers and image editors shouldn't need that much.

I've had an idea that you could make a 16-bit Smalltalk VM that uses 64 KB blocks and inter-block pointers. The reasoning is that the x86 architecture does 16-bit arithmetic at the same speed as 32- or 64-bit, and by halving the word size you effectively double the usable memory bandwidth and cache capacity. It would only work on x86 though; ARM and MIPS are 32-bit only (I think).
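
A rough C sketch of what such block-relative references might look like (the 64 KB blocks come from the idea above; the names, the block table size and the bump allocator are invented purely for illustration):

    /* Sketch of 16-bit, block-relative object references: a reference is a
       16-bit offset inside a 64 KB block; a wider block id is only needed
       when following a pointer into another block. */
    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    #define BLOCK_SIZE (64 * 1024)

    typedef struct {
        uint8_t *base;        /* 64 KB of storage                   */
        uint16_t next_free;   /* bump-allocation offset in the block */
    } Block;

    typedef struct {
        uint16_t block;       /* index into the block table         */
        uint16_t offset;      /* 16-bit "near pointer" in the block */
    } FarRef;

    static Block blocks[256]; /* toy block table: up to 16 MB of heap */

    /* Resolve a block-relative reference to a real machine address. */
    static void *deref(FarRef r)
    {
        return blocks[r.block].base + r.offset;
    }

    /* Bump-allocate nbytes inside a given block; a real allocator would
       spill into a new block instead of just wrapping around. */
    static FarRef alloc_in_block(uint16_t block, uint16_t nbytes)
    {
        Block *b = &blocks[block];
        if (b->base == NULL && (b->base = malloc(BLOCK_SIZE)) == NULL) abort();
        FarRef r = { block, b->next_free };
        b->next_free += nbytes;
        return r;
    }

    int main(void)
    {
        /* Two "objects" in block 0; intra-block links could be stored as
           bare 16-bit offsets, i.e. half the size of 32-bit pointers. */
        FarRef a = alloc_in_block(0, 16);
        FarRef b = alloc_in_block(0, 16);
        memcpy(deref(a), "hello", 6);
        memcpy(deref(b), deref(a), 6);
        return 0;
    }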

Andrés said...

That's interesting about the 16-bit VM. I am not sure what would happen. On the one hand, anything that fits within 16 bits would be faster. On the other hand, anything that does not fit would need extra pointer indirection. For smaller programs it might be a good idea to at least use a 32-bit model as opposed to, say, 64-bit pointers just to count from 1 to 10. See here under "A Flame About 64-bit Pointers":

http://www-cs-faculty.stanford.edu/~knuth/news08.html
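
As a small illustration of the point Knuth makes there, the same linked node shrinks to half its size when 64-bit pointers are replaced by 32-bit indices into a pool; the node layout below is invented just for the size comparison.

    /* Size of a toy binary tree node: machine pointers vs. 32-bit indices.
       The payload and field names are made up for the comparison. */
    #include <stdint.h>
    #include <stdio.h>

    struct node_ptr {              /* links as 64-bit pointers    */
        int32_t value;
        struct node_ptr *left;
        struct node_ptr *right;
    };                             /* 4 + pad + 8 + 8 = 24 bytes  */

    struct node_idx {              /* links as 32-bit pool indices */
        int32_t value;
        uint32_t left;
        uint32_t right;
    };                             /* 4 + 4 + 4 = 12 bytes        */

    int main(void)
    {
        printf("pointer node: %zu bytes\n", sizeof(struct node_ptr));
        printf("index node:   %zu bytes\n", sizeof(struct node_idx));
        /* On a typical 64-bit machine this prints 24 vs 12: half the space
           per node, so more nodes fit in each cache line. */
        return 0;
    }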

criticalhippo said...

Many ARM CPUs have Thumb (and/or Thumb-2) mode, which is a bit limited in terms of supported opcodes, but gets you your 16-bit instructions.