Wednesday, October 25, 2006

Follow up on types

I got a comment that started by quoting a question I asked in the original post, "And why do type systems cause so much grief?" What I meant by that question was why talking about types causes so much friction.

The comment went on to say that a lack of precision when talking about type systems can be misleading (my own paraphrasing). However, the rest of the original post talks about specific characteristics of type systems. For example, if they explicitly appear in your development work, they cause grief because it's more stuff that you need to take care of.

I wondered last night about the fraction of the developer time that is spent dealing with explicit type systems as opposed to doing the work. What could it be in languages like Java?

Let's say it's 10%, for illustration purposes. Now, go to a manager and tell her/him that 10% of the development resources are going into a system that addresses a particular kind of simple mistake. Not even the hard ones, not even most of the simple ones, but just a particular variety of the simple ones. So, out of 10 months of work, 1 goes into (essentially) fluff.

I do not think the realization would be well received. Does anybody know the actual time percentage?

12 comments:

Greg Buchholz said...

I wondered last night about the fraction of the developer time that is spent dealing with explicit type systems as opposed to doing the work.

Good luck with that. I think the advocates of static typing are going to ask a similar question of dynamic typers, "I wonder what fraction of developer time is spent dealing with bugs that could have been caught with a good type system and the lower productivity that comes from dynamically typed languages? ;-)"

Part of the problem might have to do with terminology. I think dynamic typers think of type errors as those exceptions you get when you try to take the square root of a string. Static typers have a broader notion of type error, up to and including almost any undesirable behavior. Of course, you'll need a language with a pretty expressive type system to take full advantage of this (Haskell might be one example).

My own opinion is that much of the static/dynamic divide is bogus. Type systems are just another metaprogramming system for describing, proscribing, and analyzing program behavior. A static program is just a refactored dynamic program.

Now if we're talking about something more specific, like whether Smalltalk is better than Java, then we might have something to work with, but just talking about static vs. dynamic still seems a little nebulous.
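To make the "broader notion of type error" mentioned above a bit more concrete, here is a minimal sketch. The names `Meters`, `Feet`, `addMeters`, `feetToMeters` and `total` are hypothetical and chosen only for illustration. Mixing up units is not a type error in the "square root of a string" sense, yet a static type system can be made to reject it.

```haskell
-- Hypothetical unit wrappers: both hold a Double, but they are distinct types.
newtype Meters = Meters Double deriving (Show, Eq)
newtype Feet   = Feet Double   deriving (Show, Eq)

-- Only two lengths in meters can be added.
addMeters :: Meters -> Meters -> Meters
addMeters (Meters a) (Meters b) = Meters (a + b)

-- Explicit conversion is the only way to turn feet into meters.
feetToMeters :: Feet -> Meters
feetToMeters (Feet f) = Meters (f * 0.3048)

-- Accepted: the feet are converted before being added.
total :: Meters
total = addMeters (Meters 100) (feetToMeters (Feet 10))

-- Rejected at compile time if uncommented, because Feet is not Meters:
-- wrong = addMeters (Meters 100) (Feet 10)
```

Whether the extra wrapping and unwrapping is worth it is exactly the cost/benefit question being argued in this thread.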

Andres said...

Regarding the answer to the analogous question for dynamic languages, sure... it would be great to have proper answers that can be reproduced consistently. Alas, I do not know of such things.

As far as I understand them, explicit type systems are meant to help you take care of bugs like the one you describe, 'abc' sqrt. Most blunt errors may be like these, but in my experience they require significantly less time than figuring out, e.g., why the implementation of Newton-Raphson needs to subtract one at the end most of the time.

Overall, and again in my experience, explicit type systems make me spend much more time dealing with the mandatory rituals than actually doing work --- to the point that it affects my productivity.

Perhaps they help more when you do not follow the Once And Only Once rule, but... hey maybe the problem does not have to do with type systems in that situation.

Regardless of which language we're talking about, I think the point still stands: if a type system makes me spend time on it while I work, or causes work that becomes obsolete over time, or distracts me by explicitly appearing in my development work, then it is my opinion that it causes more damage than it helps.

Greg Buchholz said...

if a type system makes me spend time on it while I work, or causes work that becomes obsolete over time, or distracts me by explicitly appearing in my development work, then it is my opinion that it causes more damage than it helps.

Sure, that's a pretty uncontroversial statement. No one wants to use a bad language. That's why you should use a good statically typed language, so you don't have those problems.

Andres said...

"No one wants to use a bad language".

I'd suggest that evidence contradicts that assertion :).

Now seriously, which good typed languages would you recommend (other than Smalltalk)?

Thanks,
Andres.

Greg Buchholz said...

Well, if you really want to twist your mind a little, I'd recommend trying Haskell. Grab a copy of The Haskell School of Expression (ask your librarian for an inter-library loan, or buy it used on Amazon, and sell it back if you don't like it). I found that the language had enough different ways of doing things (besides being a type-inferred, polymorphic, static language, it is a higher-order, lazily evaluated, and purely functional language) that a book was the better way to learn (rather than on-line tutorials). I guess I should also mention the two line quick sort, since that seems to be a favorite introductory program.

Andres said...

Greg,

I took a look at the two line quicksort. Will the Haskell version create new temporary collections to concatenate at each invocation of qsort?

Thanks,
Andres.

Greg Buchholz said...

Will the Haskell version create new temporary collections to concatenate at each invocation of qsort?

Yep. It's a demonstration of brevity, not efficiency. From the same "Introduction" page...

It isn't all roses, of course. The C quicksort uses an extremely ingenious technique, invented by Hoare, whereby it sorts the array in place; that is, without using any extra storage. As a result, it runs quickly, and in a small amount of memory. In contrast, the Haskell program allocates quite a lot of extra memory behind the scenes, and runs rather slower than the C program.

In effect, the C quicksort does some very ingenious storage management, trading this algorithmic complexity for a reduction in run-time storage management costs.
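For readers who have not seen it, a short list-based quicksort of the kind being discussed can be sketched as follows. This is written from memory and is not a quotation of the linked page; the name `qsort` simply follows the question above, and `smaller` and `larger` are illustrative names. It shows where the temporary collections come from: each call builds two new filtered lists and then concatenates them.

```haskell
-- A short list-based quicksort, in the style of the introductory example.
-- It is not in-place: every call allocates new lists.
qsort :: Ord a => [a] -> [a]
qsort []     = []
qsort (p:xs) = qsort smaller ++ [p] ++ qsort larger
  where
    smaller = filter (< p) xs   -- new list: elements below the pivot
    larger  = filter (>= p) xs  -- new list: elements at or above the pivot
```

For example, `qsort [3, 1, 2]` evaluates to `[1, 2, 3]`.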


Of course, you could always use the built-in sort, which has the usual time and space bounds. But another interesting thing is that since it is lazy, "sort" can be used to find the minimum element of a list in O(n) time, instead of O(n*log(n))

min = head . sort
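As a compilable sketch of the one-liner above: the function is named `minViaSort` here (a hypothetical name) so that it does not clash with the Prelude's own `min`, and it is given an explicit type signature. The O(n) behaviour is an assumption about how the library's `sort` is implemented (a lazy merge sort, as in GHC's `Data.List`); it is not something the type or the language guarantees.

```haskell
import Data.List (sort)

-- Minimum of a non-empty list, obtained by demanding only the first
-- element of the lazily sorted result.
-- Assumption: sort is lazy enough that the rest of the sorted list
-- is never fully computed.
-- Note: like head, this fails on an empty list.
minViaSort :: Ord a => [a] -> a
minViaSort = head . sort
```

For example, `minViaSort [3, 1, 2]` evaluates to `1`. In practice the Prelude's `minimum` does the same job directly.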

Calum said...

I wondered last night about the fraction of the developer time that is spent dealing with manual refactoring as opposed to doing the work. What could it be in languages like Ruby?

Let's say it's 10%, for illustration purposes. Now, go to a manager and tell her/him that 10% of the development resources are going into a system that addresses a common kind of development activity. Not even all development activity, not even most development activity, but just a particular variety of development activity. So, out of 10 months of work, 1 goes into (essentially) fluff.

Andres said...

Calum,

I thought a lot about your comment, and I couldn't get around a problem. Most development work is refactoring. Or, in my opinion, could be seen as refactoring / improvement work.

I suspect that you have seen other software development practices where refactoring is homework you have to do after you are "done". I heard that about 10 years ago, and decided it was silly to knowingly leave yourself homework because of inattention or omission.

Hence, I set out to learn to refactor by refactoring everything I write as I write it, with the idea that I would become proficient and writing refactored code would become second nature.

I am happy I decided to do that 10 years ago.

Thanks,
Andres.

Calum said...

Hi Andres

It's the same with static typing - it's something you do as you go along, not just at a particular point in the process.

Anyway, my original point was supposed to be that you only looked at the cost of static typing (more code, less flexibility etc.) without considering the benefits (better static code manipulation - refactoring etc.) - which can help reduce the time for other parts of the development.

"1 goes into (essentially) fluff" - but you may get that back in other ways, such as far more efficient (ongoing) refactoring.

Note I'm not necessarily arguing that static typing is better - just saying that the particular argument used against static typing was fallacious.

Andres said...

Calum,

My original post says that the 10% would be going into addressing some of the simple mistakes we all make when we write programs.

However, I do not think that this trade is beneficial, as it would prevent me from spending 10% more time on designing something better that would have less of a tendency to contain simple mistakes.

One could, for instance, allocate 8% more time to design, and 2% of the rest to find the simple errors via functional testing.

Thanks,
Andres.

Calum said...

Hi Andres

Your original post talked about going to your manager saying that 10% of the time was spent addressing simple mistakes, and therefore that this work shouldn't be done.

However, I think this would be misleading the manager, because you haven't told him/her about the benefits of static typing.
For example, maybe you could tell him/her that you save your 2% from simple mistakes, another 4% from more complicated mistakes (as in in-line warnings from modern Java IDEs), another 4% from automated refactoring and another 2% because it helps self-document the code when you come back to it in 6 months. (All made-up figures, of course).

I'm sure the manager wouldn't appreciate being told only the disadvantages and not advantages - which is what your original posting suggested to me.

So, it doesn't directly matter that 10% of your time is spent dealing with explicit type systems.
What matters is weighing the amount of time you save by using the explicit type system against the cost of using it.