Friday, December 19, 2008

Of Madoff and Ponzi schemes

So, Madoff apparently ran a Ponzi scheme quite successfully for about 48 years. It is widely reported that he offered a 10% return regardless of the market conditions. In other words, assuming he ran this from the beginning, he would have managed to increase capital by about 100x.

I have a feeling that he might have designed the return rate so the scheme would blow up after he was dead, but that this latest crisis was too much for the cushion he had. He was reported to be having trouble coming up with $7 billion to cover redemptions a few days before he allegedly confessed. Had he not been forced by redemption requests, he'd still be going at it.

But what if he had offered a less attractive rate instead of 10%? Then the theoretical pile of capital would have grown more slowly, and so he might have had more of a chance to weather the storm. What rate would that be? In particular, what is the rate that would allow a modern Madoff to run a Ponzi scheme until after its creator's death, thus managing to walk off with a clean Louis XIV move?

Well, let's see. 48 years at 10% gives us a factor of 97x or so. Now let's say we want to spread that factor over 60 years instead of 48. That gives us our Ponzi scheme return rate: 7.9%. In other words, if we start when we're, say, 35 years old, then running a Madoff-style Ponzi scheme at 7.9% basically guarantees that the thing will explode after we die.
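For the skeptical, this back-of-the-envelope calculation is easy to check with a few lines of Python. This is just a sketch; the 48-year, 10% starting point is the figure discussed above.

```python
# Growth factor produced by 48 years at 10%, as in the Madoff figures above.
factor = 1.1 ** 48  # about 97x

def ponzi_rate(years):
    """Rate that spreads the same total growth factor over a longer horizon."""
    return factor ** (1.0 / years) - 1

print(round(ponzi_rate(60) * 100, 1))   # about 7.9 (%)
print(round(ponzi_rate(70) * 100, 2))   # about 6.75 (%)
```

The two printed figures match the 60-year and 70-year rates used in this post.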

Now, Ponzi schemes are usually spotted by their ridiculous returns. But what if they just don't offer one? How could you tell the difference between a Ponzi scheme and a legitimate investment operation if you do not know first hand how the money is invested? And how many of you actually go through that due diligence? Even professionals such as Banco Santander clearly did not, and their customers (meaning us) lost upwards of $2 billion.

This means that any investment that offers us essentially 7.9% annualized return could, in theory, be a Ponzi scheme too --- simply because in practice we will not be able to tell the difference by looking at the return rate alone.

So let's be even more conservative and run our imaginary Ponzi scheme for 70 years. Our calculations show we should offer 6.75%. So basically, we conclude that the most conservative of Louis XIV Ponzi schemes could go ahead and attract investors with a return of 6.75%.

Of course, this assumes the Ponzi scheme behaves suspiciously well, pretending to have gains all the time to prevent redemptions. But how could you tell the difference if the Ponzi scheme claimed losses no worse than everybody else's? In other words, how do you know for a fact that things like your retirement funds are not in a Ponzi scheme designed so that you do not take your money out? For example, such a scheme could a) claim to lose money when others do, b) put stiff penalties on redemptions (like the usual 10% on a 401k), c) make you wait until you're 59.5 years old...

... oh...

All of a sudden such things do not look good at all, do they? And it only gets worse. How much is the stock market usually said to return over time? 11%, right? Well, this means that, using Madoff's scheme as a ruler, we should borderline expect a full blown stock market crash at least once every 43 years, if not more frequently.

And this is where we're told to put our retirement in? You have to be kidding. This means that basically everybody's retirement will be affected by at least one major stock market crash.
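The 43-year figure above comes from inverting the growth formula: how long until 11% compounding reaches the same roughly 97x factor that 48 years at 10% produced? A quick sketch:

```python
import math

# Years until 11% yearly growth reaches the ~97x factor that
# 48 years at 10% produced (the Madoff yardstick from above).
factor = 1.1 ** 48
years = math.log(factor) / math.log(1.11)
print(round(years, 1))  # about 43.8
```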

Then you realize: banks also lend out more than what they have because we have a fractional reserve system. The idea is that this Ponzi scheme will not blow up because not everybody will withdraw their money at once. So why is this Ponzi scheme legal now? Says who?

But regardless. How many years could we expect the banks' Ponzi scheme to stay alive? If we assume the answer is 80 years (great depression to now), then the annualized rate of growth we get is about 5.9%. How does that compare to other investments?

Now we know that our setup is doomed to fail periodically. Guess who will bail out the banks when the loans cannot be repaid: that's right, us again. And if we cannot print money to pay for the mess because nobody cares for that, guess what: we pay with titles to assets such as real estate, companies, etc.

The only winners in this game are those very few that set it up in the first place. Are we clever or what...

Saturday, December 13, 2008

Fundamentals update

Chapter 4 reached 100 pages. The draft is 216 pages long.

On debt

Some time ago I met a guy who tried to sell me a debt reduction program. Basically, the deal was that I had to pay $3750 for a computer program that would tell me how to allocate payments to service any debt I could have. Furthermore, he claimed this program was able to compress debt. He put himself up as an example: he had 19 years to go on his mortgage, but he paid it off in 6.

And you know why this works? Because the program has an algorithm. He even asked me if I knew what an algorithm was.

Now of course, since this works, I can recoup my $3750 by selling the program to a few other people that I know, and since I get $1000 every time I sell it... and so on... well, you get the idea. A typical multi-level marketing scheme.

He must have observed I wasn't too moved by his arguments, because then he suggested that I could buy, for $175, the program he used to run evaluations on people, and still get the $1000 per customer I dupe into this.

So let's set the record straight, algorithm and everything.

The most important rule for debt reduction is simple. You can do whatever you want with the money you earn, but you are not allowed to increase your debt.

Let's say you want a camera today but you only have 40% of its price in available cash. Well, that's too bad: you get to save money month by month until you can buy it with cash.

Or let's say that you want to buy a camera today, but you will have all the cash available in a few days. Fine, use the credit card. But when the statement comes, you are not allowed to let it go higher than before.

Now that things are not getting worse, we go to the second rule. This one is about scheduling payments. The debt dollars that cost you the most in interest are those attached to the highest interest rates. Therefore, you must pay those first. And yes, that means that if you owe $100 at 20% and $10k at 19%, you pay off the $100 at 20% before you go after the chunk at 19%.

How do you pay the higher interest debt first? Simple. By allocating minimum payments to everything else, and then using as much as you can to go after the high interest debt.

For the sake of illustration, imagine you have $10k at 8%, $15k at 7%, and $20k at 9%. Assume you have $1000 to pay bills. Then you make the minimum payments on the $10k and $15k; let's say those add up to $190. So you have $810 left, and that's what you pay on the $20k.
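To make the allocation concrete, here is a sketch in Python of this rule applied to the illustration. The individual minimum payments ($100, $90, and $200) are made-up numbers for the example; only their effect on the leftover matters. No $3750 algorithm required.

```python
def allocate(debts, budget):
    """Pay every minimum, then throw the rest at the highest rate.

    debts: list of (balance, annual_rate, minimum_payment) tuples.
    Returns the payment for each debt, in the same order.
    """
    payments = [minimum for _, _, minimum in debts]
    remainder = budget - sum(payments)
    # Rule 2: the leftover money goes to the highest interest rate.
    target = max(range(len(debts)), key=lambda i: debts[i][1])
    payments[target] += remainder
    return payments

# The illustration: $10k at 8%, $15k at 7%, $20k at 9%, $1000 to spend.
# Minimums of $100, $90, and $200 are assumed for the example.
print(allocate([(10000, 0.08, 100), (15000, 0.07, 90), (20000, 0.09, 200)], 1000))
```

With these assumed minimums, the 9% debt ends up receiving $810, matching the illustration.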

So now, rule 1 says you will not get worse. Rule 2 says you're paying in the most efficient manner. The third and last rule says that you must include every debt into this arrangement. In other words, your car and mortgage payments also count.

If you follow this strategy, then you will be out of debt in no time. And please, don't do stupid things like paying $3750 to have a program tell you how to allocate payments. Besides the fact that this would violate the first rule, you do not really need help to pay bills yourself. Also, you will avoid paying for the cruises these salespeople take.

Pfft... algorithm...

Finally, what do you do when you pay all your bills and have zero debt? Assume that your savings account needs money, and periodically drain any gross excess out of your checking account. You will be amazed at how quickly you can build up savings in this way. Then get yourself some goals and go after them.

PS: The fine folks at Debt Consolidation Connection left me the following comment.

Charge the $3750 to your credit cards. That would be a good start!

Not only is this spam: it is nonsense. Gee, you're so deep in debt that you're looking for help, so how about you pile another $3750 on your credit card to start getting rid of your obligations? You have to be kidding --- or scamming people into giving you money for no good reason.

Oh, and by the way, multi-level marketing schemes are a whisker away from being a pyramid / Ponzi scheme, which would make them illegal. And don't just take my word for it, see here.

Friday, December 12, 2008

Say hello to e^x: illustration

Look at this lady talking about her situation.

Ms. Leavitt said she recently discussed her investment with a friend who told her he was suspicious about the firm's ability to generate such profits amid the economic crisis. "I thought, 'He's probably just jealous,' " said Ms. Leavitt. "We've been with [Mr. Madoff] for 15 years, and it's grown every year at 10%."

Really. Mr. Madoff is currently out on bail because he apparently confessed to running a Ponzi scheme worth $50 billion.

Ten percent yearly growth is clearly e^((ln 1.1)x), where x is time measured in years. You know that can't keep happening forever --- even if you wish with all your heart that it could. So sorry, but no.

Susan Leavitt of Tampa Bay, Fla., said she had several million dollars of inherited money invested in the firm and added $500,000 earlier this year. A stay-at-home mother with two children, the 46-year-old Ms. Leavitt says she is considering going back to work. "That was my nest egg for the children, and my future. I'll never see much back, I'm sure," she said.

What to say, right? Well, my friend, there you go. Next!

Thursday, December 11, 2008

Say hello to e^x

I had an interesting conversation today. We were discussing the auto industry mess, and what was the root cause of all of it.

Now, to me, in the end all of this is due to thinking that there will be exponential growth forever. It just doesn't work that way. We love it when we choose to believe it does, because we do well at the expense of somebody else that suffers the consequences. We never see those people, so everything is cool. But when we (meaning the people that happen to live in one of the countries that usually do well) do not do well, chances are nobody is doing well, and therefore we say things like "oh this is global", "everybody is in the same boat" etc. In other words, we have no real incentive to look at our behavior.

But unfortunately I couldn't advance this argument with the guy, because I was sure he'd look at me funny if I invoked the presence of one of my dear friends, e^x. So what to do?

I decided to put it in these approximate terms. I told him I was the Federal Reserve, and that he represented all the people in this country. Furthermore, I set the rules saying that I was the only entity able to create money. Then I told him that at the beginning of the game, he had no money. So what would he do?

Of course he asked me for money. "Ok", I said, "I am going to give you $1 trillion". And, giving him four packets of sugar, I told him that I would charge him interest and so he would eventually have to pay me back five packets of sugar.

At this point he said "well but if I do not make money, then I do not pay you back and then you lose money". My answer: "and even if you do well, where do you think you're going to get the money from?... remember, I am the only person that can give you money".

In other words, I could not care less if everybody goes bankrupt. They still owe me, and I know because a) I gave them the money in the first place, b) I am the only person that can do that, c) I charged them interest so they owe me more than what they have. Besides, in the real world, he guarantees the money I decide to print, so I know for a fact that this cannot go wrong.

He agreed to these ridiculous terms.

Eventually he realized he'd have to come back to me to get more money so he could pay me back the interest on the $1 trillion. I told him I'd charge him 5%, and so he needs $50 billion. Then I picked up two more packets of sugar and gave those to him. Of course I'd charge him interest on that too, and so I decided that although he had 6 packets of sugar, he owed me more like 8.

At this point he realized the whole exercise was absurd, and gave me back all the sugar saying "here, take your money back, I do not want to play this game". I answered with "sure, but you still owe me, so what I am going to do after you run out of money is to print myself a little more money and buy everything you have for $1". After all, he owes me, he doesn't have the money to pay, and so I can wait any amount of time necessary for him to get so indebted that I can pay him whatever I feel like --- even $1.

He didn't like the consequences. So after a moment of reflection he goes "but well, in the end all this money is worthless... why do you (meaning the Fed) play this game?". So I said "it gives me power to do whatever I want --- for example, since I managed to get you to believe these pieces of paper are worth something, I can give these to you and then I can have whatever I want in exchange". Puzzled face. "In other words, since I own everything, then you are my slave because remember, you still owe me". Silence. "Have you ever seen a bank manager such as myself go to jail, do badly, or suffer any kind of hardship, here or in any country?". More silence. "See?".

Now, back to my tea.

Wednesday, December 10, 2008

Smalltalks 2008 --- Trip to Tigre

On Sunday, after the conference, the organizing committee treated Alan Knight, James Foster, June and Monty Williams, and Victor Koosh to a rowing trip. James Foster just posted his set of photos of the event at the Buenos Aires Rowing Club. Enjoy!

Sunday, December 07, 2008

Implicit receivers, v3.0

It seems to me that what I said in my original post has taken a life of its own and is now evolving into something that I do not think I ever meant to say. So I thought I'd offer the following clarification.

What I do not like about implicit receivers, as implemented in a number of languages, is that their design is guided by a desire for economy of writing. This dislike, personal and subjective as it is, comes from my own experience: pushing too hard to write less leads to code that becomes unnecessarily hard to read as time goes by. So, over the years I've changed my style and choice of wording with reading speed in mind.

In other words, given the usual statistic that programmers spend something like 90% of their time reading code, I'd rather optimize that than the fraction of a second it takes to type "self" every so often.

This is the whole point of what I wrote in the first place. A matter of writing code designed with some specific reading goals in mind, and nothing else.

Now, given this, what I have read and heard gives me the impression that I am perceived to have gone after the whole idea of having a receiver other than self, or something like that. If I understand correctly, Newspeak uses implicit receivers other than self to access behavior from an enclosing class. As far as I can see, that would be roughly equivalent to using super, but since the idea of super does not seem totally adequate for the purposes of Newspeak (or Self... what does super mean when you don't have a class hierarchy?), implicit receivers let messages flow naturally to either self or outer.

Again, as far as I can see, this arrangement is used for the sake of modularity in Newspeak. I certainly sympathize with Gilad's take on modularity. Globals, singletons and the whole lot can be an invitation to poorly written code, and I've seen more than my fair share of that. This became clear to me back at the keynote presentation at Smalltalk Solutions 2008, when I heard Gilad's opinions first hand, so I think any perceived disagreement about goals on this matter should be put to rest.

Nevertheless, I still think that the current implicit receiver arrangement favors economy of writing over what I'd describe as a more explicit expression of intention. This may be a matter of taste, and I also hope it is clear that it may just amount to a personal opinion.

So, now the question becomes: is this assessment not reasonable? Is it unfair to state that Newspeak's implicit receiver grammar is designed with code size in mind? I think not, because Vassili's latest post points to a paper by Gilad in which the matter of implicit receiver design is explained. And as it turns out, the way in which implicit receivers are implemented has a lot to do with brevity.

Page 6 of Gilad's paper has a section entitled Unambiguous Interpretations, and in it there is a list of four choices for implicit receiver design. The first item describes the pros and cons of using Smalltalk's approach, which is to require an explicit receiver for every message send. This is said to solve the problem of ambiguity, however,

[...] Newspeak (unlike Smalltalk) is an entirely message-based language. It would be unduly burdensome to have to specify an explicit receiver for every message send in a language where everything is a message send. While mandating an explicit receiver may be a reasonable approach in some languages, it is not appropriate for Newspeak.

It is clear that this decision is at least partly motivated by terseness. Not that I dislike terseness per se, as I also try to write code that is not unnecessarily long. Nevertheless, my observation is that implicit receivers (probably) make determining the receiver of a message more difficult, because now the first "word" in a "statement" has to be scanned to the end, to see whether it ends with $: or '::', before one can decide whether it is a receiver or not.

It so happens that I prefer prefixes to suffixes in this case; hence my concern with making the grammar more complex for the sake of writing less.

Now, of course a prefix/suffix problem also occurs in Smalltalk when one has to determine the meaning of a string of "words". Is it a unary message chain or a keyword message? However, at this point the receiver has already been read because its position is fixed in a sentence, so what is being seen past that can be assumed to be a message of some sort.

Thus, since my experience has been that aggressively optimizing how programs are written so they are shorter is counterproductive in the long run, I tend not to like this much, at least at first sight.

But what is the definition of aggressively, and what constitutes too much of it? Good question. I wish I knew how to unequivocally quantify this in general, or at least for this case in particular. Alas, other than a somewhat mild suggestion to consider the consequences of terseness from the point of view of the 7 +/- 2 guideline, I do not have a good answer to offer.

Finally, I would like to make it clear that I do not have experience with Newspeak nor Self. Please point out any technical inaccuracies in the discussion above so I can fix them. Thanks in advance.

Wednesday, December 03, 2008

Class name size in large scale application

This is a data set I ended up not using, but I might as well share it in case it helps others. It's a set of code metrics for a large scale financial application (no, not that one... the data comes from XTrade). In this case, the information provided refers to the sizes of class names. Here are the results*.

  • 13318 classes.
  • Arithmetic mean name size:
  • Harmonic mean name size: 10.04.
* Full data in array format. An integer j at position k means a class name of size k occurred j times. #(9 18 2370 1072 1708 1488 1333 1490 1862 1906 2053 2317 2429 2330 2456 2361 2399 2347 2206 2093 2041 1960 1924 1792 1564 1592 1393 1374 1160 1055 1001 957 840 761 748 615 634 563 467 448 421 396 318 306 289 231 246 206 187 156 168 141 115 96 83 85 75 70 60 61 40 44 38 44 27 36 32 22 21 12 17 11 15 13 10 6 7 7 4 4 5 3 3 5 3 2 0 3 1 0 1 0 0 1 0 0 0 2 0 2 1 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 2)

Tuesday, December 02, 2008

Keyword size in large scale application

Ah, here we go. Here are the results for a large scale financial application. No, not the one you are thinking about. This one has 12.5k classes and 750k LOC. The results are as follows*.

  1. 191466 keywords in total. Arithmetic mean keyword size 18.08, harmonic mean 12.05.
  2. 96490 keywords, about 50%, occur with these 12 most frequent sizes (in order): 11, 9, 13, 14, 12, 15, 16, 17, 10, 18, 5 and 19. For this subset, arithmetic mean keyword size 13.11, harmonic mean 11.64.
The glass seems half full on the side of more than 10 characters...

* Full data in array format. The array has 133 entries. The first one corresponds to size zero. #(1 45 349 4881 3125 7181 5133 5519 6140 9286 7472 9678 8331 8616 8406 8086 7807 7652 7255 6720 6290 5910 5397 4966 4517 3985 3806 3304 3075 2690 2401 2199 2073 1761 1660 1559 1297 1310 1144 978 945 883 782 705 652 614 514 523 414 379 330 341 283 259 206 200 181 158 135 135 118 99 92 83 77 42 43 37 29 31 15 20 15 21 14 13 7 8 8 5 4 6 3 3 6 3 2 0 3 1 1 1 0 0 1 0 0 0 2 0 2 1 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 2)
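As a side note, the arithmetic and harmonic means reported in these posts can be recomputed from a raw frequency array with a short script. This is a sketch; note that the size-zero entry has to be excluded, since a size of zero would make the harmonic mean blow up.

```python
def means(freqs, first_size=0):
    """Arithmetic and harmonic means from a frequency array,
    where freqs[i] counts items of size first_size + i."""
    n = total = reciprocal_total = 0
    for i, count in enumerate(freqs):
        size = first_size + i
        if count == 0 or size == 0:
            continue  # sizes with no items, and size zero, are skipped
        n += count
        total += count * size
        reciprocal_total += count / size
    return total / n, n / reciprocal_total

# Tiny example: two items of size 1 and one of size 2.
print(means([2, 1], first_size=1))  # (1.333..., 1.2)
```

Feeding it the full array above should reproduce the reported figures, modulo how the size-zero entry is handled.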

Comment moderation policy

It has come to my attention that at least one of you posted a comment for which I received no notification. Furthermore, I cannot find the comment provided, so I cannot approve it for moderation purposes.

However, the email I got regarding this missing comment was written under the assumption that I had rejected the comment during moderation. Clearly, that is the impression the apparent behavior of the blog gives. So I thought I'd outline the policy I use to moderate comments.

  1. comment isSpam ifTrue: [self reject: comment. ^self].
  2. comment isGrosslyOffensive ifTrue: [self reject: comment. ^self].
  3. self publish: comment.
Therefore, if you have not seen one of your comments, and you do not think it was spam or grossly offensive, then chances are I didn't even see it. Hopefully this makes things a bit more clear.

Monday, December 01, 2008

On implicit self, v2.1 --- size of keywords

Vassili commented that most of the time, things like keywords are small enough so that the $: (or '::') fits within the bounds of our fovea and so we recognize words at a glance as opposed to scanning them.

A word is taken in at once in one saccade, and a terminating colon is perceived immediately. You don't have to "scan to the end" to see if a word is an identifier or a keyword, you just see it. Even if you don't believe it. Exception are words too long to fit in the fovea (more than 10 characters or so for the usual reading conditions), but even so a colon falls in the right parafoveal area and is still recognizable thanks to its distinctive shape (some typefaces can make it easier or harder than others).

So I decided to run an experiment. Evaluating the following code (sorry for the nasty variable names)

(ByteSymbol allInstances collect: [:x | x keywords]) inject: Bag new into: [:t :x | x do: [:y | t add: y size]. t yourself]

and then digging through the bag's dictionary reveals the following.

  • There were 73408 keywords in 62107 symbols.
  • Sorting the keywords by size frequency (to see which keyword sizes occur most often) reveals that 36789 keywords --- about 50% of the total, spread over the 11 most frequent sizes: 11, 10, 12, 14, 9, 13, 15, 16, 17, 5 and 8, in that order --- have an arithmetic mean size of 11.85 characters. The corresponding harmonic mean keyword size is 10.63.
  • The arithmetic mean keyword size for all keywords is 16.94 characters. The corresponding harmonic mean keyword size is 10.79 characters.
So it seems that at least half of the time we're pushing keyword sizes past 10 characters. Even taking out outliers with the harmonic mean results in 10.79 characters per keyword*.

It seems reasonable to suspect that most of the words we use when programming do not fit the fovea (usual reading conditions)**. What are the implications for reading the code in which these words appear? What should be concluded here?

* In the sampled image. Your mileage may vary. I wish I could run the stats in large scale applications to see what numbers occur in the field. From my direct experience, I would not be surprised to see harmonic mean values of 15 or larger.

** Full data in array format. An integer j at index k says keyword size k occurred j times. All sizes from 1 to 78 occurred at least once. Consequently, the array has 78 integers. #(261 388 2112 1389 2992 2210 2536 2944 3471 3561 3756 3536 3410 3510 3372 3185 3052 2740 2436 2423 2016 1733 1652 1319 1154 1050 855 842 721 631 607 576 562 547 468 445 466 439 382 381 352 323 334 266 277 238 210 181 182 136 122 112 84 65 39 54 35 32 33 35 27 16 16 15 9 16 15 12 7 6 11 6 2 4 2 1 2 1)

Sunday, November 30, 2008

On implicit self, v2.0

Vassili Bykov posted an answer to my previous post regarding the use of implicit self in his blog. So, I will answer to his answer in my blog. I usually reread what I write and make corrections, so I have a feeling this post will be edited heavily. Please make sure you get the latest version.

Regarding implicit self in Newspeak, Vassili writes:

As far Newspeak is concerned, what we have is not “implicit self”, and its purpose is not saving keystrokes. What Newspeak has are implicit receivers. Because of class nesting, a message with an implicit receiver may really be sent to an object different from the “real” self (the receiver of the current message). This feature is very important in supporting the minimalist module system of Newspeak. Thus, an implicit receiver is not simply an omitted self, and inserting “self” into a message send with an implicit receiver is not a behavior-preserving transformation.

Ok, I did not understand implicit receivers in Newspeak, and therefore what I wrote regarding implicit self does not apply to them. However, does the argument for consistency to which Vassili replied with the text above --- namely this, which I wrote ---

I find that the consistency offered by a few keystrokes makes it easier for me to read and understand code faster and more accurately. Therefore, since we read code much more often than we write it, I think that favoring reading speed over typing speed is the right decision to make.

apply to implicit receivers in general? Doesn't the fact that the receiver is implicit become confusing over time? I think that, as Vassili says near the end of his post, perhaps experience will tell and unfortunately there is not a whole lot of it yet. On the other hand, maybe an example of what Vassili is referring to is in order.

The next piece from my post that Vassili quotes is this.

I’d rather see self than having to assume it by scanning the first token until the first occurrence of $: (or ‘::’) to only then be able to disambiguate between a receiver and a keyword.

In short: I prefer the work of my internal parser to be made easier by the use of prefixes, rather than to have to keep a stack that only goes down in the presence of a suffix.

Vassili's answer is the following.

This aurgmnet is falwed for the smiple raeson that our percpetion dose’nt wrok this way. We do’nt hvae an intarenl parser. What we rellay hvae culod be desrcbeid as a comlepx adpative, preditcive and bakcptaching paettrn recogznier. This is why we can still read the above even though most of the words are messed up.

Well, there is something worth noting. All the words that are misspelled have something in common: their first letter, a prefix, is always correct!

I've always felt disappointed when VW's spell checker tries to offer suggestions assuming the first letter is not at fault, and that the first letter is always present. When that does not happen, lewl, higstn moceeb rmoe tlufcidif ot dratsnedun. Tralinen erspra ro ton, het klac fo trecocr gliaden sortacindi edos esem ot ucaes tiandadiol sharphid*.

Jokes aside, I would suggest that although our idea of a parser, as implemented in a computer, may be limited, we are quite able to draw distinctions in text based on a number of criteria. My observation was simply a matter of personal preference.

I prefer the work of my internal parser to be made easier by the use of prefixes, rather than to have to keep a stack that only goes down in the presence of a suffix.

I still think it's worthy of consideration though, particularly because I am not sure Vassili's argument holds in every case:

We don’t scan the text linearly one character and one token at a time. Words are pictures, not character arrays.

Sure, however sentences are read left to right and precedence of certain words does matter. My thing with $: (or '::') is that they are a suffix added on to a word, and that the presence of this suffix has the ability to change the sentence being read quite strongly: it controls whether the first word is a receiver or not.

To put it differently, when it comes to sentences that represent a message send, perhaps the issue here is that Smalltalk basically imposed that the most important thing is the receiver, and that is why it comes first in Smalltalk sentences. In fact, since receivers come first, it is not necessary to mark them with suffixes or anything else.

But in Newspeak this is not so. I was under the impression that implicit self (in Self) or implicit receivers (in Newspeak) were a matter of economy of typing. To some extent I get the same impression from Vassili's comment here:

On the other hand, there are situations when they improve readability by eliminating noise. A good example are DSLs embedded in Newspeak. So far we have two such languages widely used in the system: Gilad’s parser combinators and my UI combinators in Hopscotch. The feature common to both are definitions written in a declarative style combining smaller things into larger ones. Compare an example of such a definition the way it’s commonly written:
heading: (
row: {
image: fileIcon.
label: fileName.
[column: folderContents]

with the same definition with explicit receivers:

self heading: (
self row: {
self image: self fileIcon.
self label: self fileName.
[self column: self folderContents]

The first example has nothing but the structure it defines. It’s important what the expressions say. The fact that they are message sends is an implementation detail. The second example leaks this implementation, and it takes some effort to see what it really says in between all the “self”s.

The effect I can't help seeing though is that the receiver appears to stop being the most important element of a sentence, so much so that sometimes it is implicit and it is not equivalent to self --- even though in the code above the implicit receiver is self.

How does Newspeak disambiguate between an implicit receiver of "self" and some other implicit receiver? Is the disambiguation expense cheap? Perhaps part of the answer is in Vassili's comment:

Those left unconvinced should also consider that modern IDEs, Newspeak’s Hopscotch included, do the parsing for you by colorizing the source code.

However, I find this particular argument unconvincing because, even though I did work on more than one project that had syntax coloring, I found it most useful when the code was convoluted. So is the coloring good because of itself, or does it become valuable when there are other things to consider such as the inherent entropy of each symbol being read?**

But I digress. Personally, I would prefer the receiver to always be explicit, or at least the indication of whether there is a receiver first or not be a prefix, but what can I say... that's my biased preference today. I do not have a good record: 13 years ago I thought that programming assembler on my 386 was the greatest thing since sliced bread, and yet here I am writing books about Smalltalk... nevertheless, I hope that this is not seen in these terms:

There’s much to be said about the human nature and the tendency to instinctively resist change to something familiar while trying to rationalize that resistance.

Resistance to what? I am not observing the alleged change in my environment, so I cannot possibly be resisting it. The more interesting bit though is this.

It takes some time and experimenting to see a change for what it is and get a feel of the new tradeoffs.

So, I also hope that it is clear that some of the tradeoffs seen in Newspeak seem a bit strange to me at first sight. Not wrong, not incorrect, nor anything like that. Just not something I'd naturally think of today because my preferences are currently somewhere else, that's all.

Now, Gilad says in his presentations that one of the goals of Newspeak is to improve what was achieved with Smalltalk (and other languages such as Self). Well... perhaps the arguments are a bit too long to fit in 45 minutes or 2 hours, and so the essence behind them is missed. However, using implicit receivers for the sake of modularity (and to type less as a side effect)... it just makes me curious. What other alternatives were considered? What tradeoffs were attractive for this one as compared to the ones that were discarded?

To summarize: I think explicit receivers are better because sentences are less ambiguous, and because a key element of a sentence, the receiver of the message, is always present in the same place; it also seems fitting that it comes first given its importance. On the other hand, Newspeak's use of implicit receivers has the advantage of making it easier to implement a minimalist modularity scheme, and as a side effect you type considerably less in some cases.

Is that a fair assessment? Where do we go from here?

*: When that does not happen, well, things become more difficult to understand. Internal parser or not, the lack of correct leading indicators does seem to cause additional hardship.

**: Now talking exclusively about Smalltalk for a moment: if coloring is there and I can manage the namespace of a rather complex method better, does that end up helping me? Or does it simply make it easier for poorly written code to live on, thus making syntax coloring necessary and apparently useful? If methods are short and no more than 5 lines long, like we always say they should be, do we really need syntax coloring? Would we even care much about formatting? Which one is the egg and which one is the chicken?

And now talking about C: coloring really helps, but I think the existence of large files with lots of code and little to no visual cues as to where the boundaries between each of the pieces are is what makes coloring helpful in the first place. Nevertheless, I'd rather have a browser.

On implicit self

I didn't quite like the implicit self in Self and Newspeak, but I couldn't quite put my finger on why. I just realized that the argument can be made quite concisely.

The implicit self is a special case of the grammar of the language which does not need to be there. In languages that strongly rely on message sends as the mechanism to allow behavior to occur, message sends should be expressed as unequivocally as possible. At least from my Smalltalk-biased POV, the fact that all sentences are of the form

receiver message: withArguments

is a benefit because it clearly states what the receiver is. As such, I'd rather see self than have to assume it by scanning the first token up to the first occurrence of $: (or '::'), and only then be able to disambiguate between a receiver and a keyword.

In short: I prefer the work of my internal parser to be made easier by the use of prefixes, rather than to have to keep a stack that only goes down in the presence of a suffix.
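To make that concrete, here is a toy sketch of the disambiguation (in Python rather than Smalltalk, purely for illustration; the function name and the whitespace-based tokenization are my own simplifications):

```python
def classify_first_token(statement):
    """Toy classifier for a Smalltalk-like statement.

    With explicit receivers, the first token is always the receiver.
    With implicit receivers, 'foo: bar' makes the first token a keyword,
    so the reader must scan ahead for ':' before deciding what 'foo' is.
    """
    tokens = statement.split()
    first = tokens[0]
    if first.endswith(':'):
        return 'keyword'   # implicit receiver: message with no stated receiver
    return 'receiver'      # explicit receiver comes first

print(classify_first_token('self label: fileName'))  # receiver
print(classify_first_token('label: fileName'))       # keyword
```

With an explicit receiver the first token is classified immediately; with implicit receivers the classification hinges on finding the colon.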

Similar arguments dictate my preference to write [nil] instead of [], and to add ^self at the end of empty methods rather than seeing a "blank" text pane.

I find that the consistency offered by a few keystrokes makes it easier for me to read and understand code faster and more accurately. Therefore, since we read code much more often than we write it, I think that favoring reading speed over typing speed is the right decision to make.

Saturday, November 29, 2008

Looking for the source of a quote

Again, I must turn to the human Google for this one. What is the source of this quote?

In most languages where inheritance is singular, it's a card you only get to play once, so you'd better play it wisely.

Thanks in advance!

Friday, November 28, 2008

About the Fundamentals book

I just ran a page estimate for the Fundamentals book. It just reached 206 pages. Chapter 4 looks like it needs another 20. The rest of the material, from chapter 5 (on polymorphism) to chapter 9 (on optimization) seems to be enough to push the page count close to 800.

This is a huge problem for two reasons. First, Lulu only binds volumes of up to 740 pages. Second, it leaves zero room to write about Assessments!

So the plan of action is as follows: split off the material on Assessments into its own volume, and hope that I can shoehorn all the other material into 740 pages or less.

I feel better already. Eliminating the divisions that were cutting the Fundamentals book into two parts, one for the techniques and one for Assessments, gave me 4 more precious pages of space.

It seems that I have enough stuff to write books for the next 5 to 10 years now. Well, I better get going with it, or I won't be able to finish in time.

Smalltalks 2008 Coding Contest writeup

As in other years, this Smalltalk coding contest consisted of writing a program that would play a game. The ranking of each player was determined by comparing their corresponding scores in the game. The qualifier round, held before the conference, served the purpose of bringing all the contestants to more or less the same level of proficiency, i.e.: they all had a program that played and obtained some score. The finals changed the rules of the game the participants had to play, but the nature of the changes was not communicated to them. It was up to those competing for a prize to determine what had changed, how to adapt their program to the perceived changes, and to do so under time pressure. The idea of having to make changes within something like two hours is that, if a program is well designed, then changing it will not require an extraordinary amount of time and effort. If, on the other hand, a participant comes to the final round with a program that is too tightly coupled with the problem at hand, then changing it will be costly. The assumption is that, since business requirements change continuously in real life, this is an appropriate way to measure the quality of a contestant's submission.

The problem for the Smalltalks 2008 Coding Contest was again to play a game. The particular game was defined in terms of what happens in a software development team. In other words, participants would have to create a program that would behave like a software developer in a game in which an application is being built. Some of the factors that contestants had to keep in mind are things that we all know very well. For example, a little stress is a good thing because it may give us a sense of urgency that may allow us to finish our tasks quicker. On the other hand, too much stress will eat into our productivity, and no matter how much we work we will not make much progress.

Moreover, interaction with other team members is a key factor for a project's success. Because of this, the game gave an advantage to those players that collaborate with each other by making it less likely that completed work units would cause bugs when integrated into the final delivery. However, this is not so easy to achieve all the time. It is for this reason that the game came with six autonomous players with different personalities, in order to stress the capabilities and flexibility of each of the competing programs. The autonomous players were played by the game server itself, and so the contestants did not have control over who they had to play with in a particular game.

In order to make it interesting and politically correct, the autonomous players were modeled after six Dilbert characters: Dilbert, Alice, Asok, Dogbert, Ratbert and Wally (which finally explains this earlier post). The implementation of the game itself was rich enough so that it was possible to implement a single strategy shared by all these different characters. What made Dilbert different from Dogbert was a personality object that provided 17 tuning parameters for aspects such as the amount of stress the player would tolerate, how much work it would be willing to accept at any one time, and so on.

Something that was also modeled was how quickly players would behave in a counterproductive way to retaliate for a perceived lack of collaboration by other players. For example, Dogbert is not usually very willing to help others. Since it is in his personality to put himself first, he may decide to accept work from others just to delete it and thus get rid of it. He does not care about the consequences, because he will delete this work as long as the boss did not assign it to him in the first place. What happens then is that if Dogbert deletes work sent by Alice, the boss will complain to Alice for letting work drop on the floor. This increases Alice's stress, so it is in her best interest to note that the behavior complaint was related to a work unit sent to Dogbert. In turn, this will make it less likely for Alice to ask Dogbert for help later on in the project.
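A toy version of that grudge bookkeeping (in Python, with invented names, and a made-up tolerance parameter standing in for one of the real personality knobs) might look like this:

```python
class Player:
    def __init__(self, name):
        self.name = name
        self.grudges = {}   # peer name -> count of dropped work units

    def note_dropped_work(self, peer):
        """Called when a boss complaint traces back to work sent to this peer."""
        self.grudges[peer] = self.grudges.get(peer, 0) + 1

    def willing_to_ask(self, peer, tolerance=2):
        """Stop asking a peer for help after too many grudges.
        'tolerance' is a hypothetical stand-in for a personality parameter."""
        return self.grudges.get(peer, 0) < tolerance

alice = Player('Alice')
alice.note_dropped_work('Dogbert')
print(alice.willing_to_ask('Dogbert'))  # True: one grudge, tolerance 2
alice.note_dropped_work('Dogbert')
print(alice.willing_to_ask('Dogbert'))  # False: Alice stops asking Dogbert
```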

Contestants had to deal with this counterproductive behavior as well. However, they were not told which player had which personality. All they saw were developers called names such as Smith, Jones, or Taylor. Part of the challenge was to make programs that would learn from their experience as projects progressed to completion.

You can see the official rules and qualifier server for the contest by selecting the coding contest section on the left here.

The winners of this year's contest were Guillermo Amaral and Guido Chari. The second best score was obtained by Hernán Wilkinson, one of the organizers; unfortunately for Hernán, that meant he was barred from receiving prizes at the finals. Diego Geffner finished in third place at the final round. Prizes included an iPod Touch for each of the winners, courtesy of Instantiations and Caesar Systems. Diego Geffner obtained an MP4 player courtesy of GeoAgris. Snoop Consulting also provided bookstore gift cards.

Wednesday, November 26, 2008

Some sensible observations regarding multicore CPUs

Finally somebody calls never-ending exponential growth what it is. Now we should recognize that exponential growth in core count is just as doomed as the GHz race and many other things (see also here).

One way or the other, the future is not yet clear.

Tuesday, November 25, 2008

Smalltalks 2008 makes the newspaper

The Smalltalks 2008 conference was mentioned in the newspaper La Razón. Here is the pdf with the scanned page. The caption on the photo states there was a large audience at the conference :).

Saturday, November 22, 2008

Smalltalks 2008 photos

Here are some photos I took while at the conference. Also, here's an album of photos taken at the social dinner event (courtesy of James Foster).

You do not want to miss out next year now, do you?

First prize of the Smalltalks 2008 Coding Contest

Well, Instantiations had generously provided us with an iPod Touch for the Smalltalks 2008 Coding Contest first prize. But then we had the situation in which a pair won the finals, so rather than an iPod to share, it became an iPod to divide. Nevertheless, another one of our sponsors came to the rescue: CaesarSystems will provide a second iPod Touch so that each member of the winning pair gets one.

Thank you Victor Koosh and CaesarSystems!

Sunday, November 16, 2008

Smalltalks 2008 video: A Reflective Reporting Tool

Gabriel Cotelli posted footage of his talk at Smalltalks 2008 here. Enjoy!

Saturday, November 15, 2008

Smalltalks 2008, Saturday notes

I just got back from the conference's last day. Here are the notes for today.

First, I did my presentation on the implementation of the Coding Contest. For quite a while I had wanted to reveal that the numerical model behind all this work, including the behavior of Dilbert, Alice, Asok, Dogbert, Ratbert and Wally, was...

... y = arctan(x)...

This function is used to model the progress of work, the quality of the perception of how much a work unit is done, the inspiration of programmers, the stress of programmers, the irritation of programmers... everything. I even found out later that it has been used to model characteristics of bipolar behavior disorder.
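As a rough illustration of why a single saturating curve can model all of these quantities, here is a sketch (in Python; the normalization and the scale parameter are my own additions, not necessarily what the contest code did):

```python
import math

def progress(effort, scale=1.0):
    """Map non-negative accumulated effort to progress in [0, 1).

    Uses y = arctan(x): near-linear returns at first, diminishing
    returns later, which is the shape described for work progress,
    stress, inspiration, and so on. 'scale' is a hypothetical tuning
    knob of the kind a personality object might carry.
    """
    return math.atan(effort * scale) / (math.pi / 2)

assert progress(0) == 0.0
assert 0 < progress(1) < progress(10) < 1  # monotonic and saturating
```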

But well, enough of that. The code will be made available shortly, and so you will be able to play with it. After that came Leandro Caniglia's presentation on instance behavior. I missed most of it because I had to leave the room after my presentation to talk to some people, so I cannot comment much on it. However, I did see that he got lots of questions regarding the applications, which shows there was plenty of interest.

After the break came a presentation by Gabriel Honoré, who wrote a Commodore 64 emulator in VisualWorks. Not only does it work --- it runs at 100% speed on a Core 2 Duo @ 2GHz (his machine). In fact, not only does it work... it works correctly!!! He played some games on screen; he even brought up the game Truco (for an explanation of how to play the game, see here). Yep, the one that uses the SID chip for speech synthesis. And it worked, and sounded, perfectly. At this point, Gabriel decided to open inspectors on the components of the C64. He brought up the VIC-II video chip and, with a simple message send (IIRC, self color: 6), changed the border of the screen to green while the game continued to run. A suggestion to spy on the computer's cards was made. In my opinion, the emulator was so good that the emulated C64 even reset itself faithfully... the particular way in which the video chip behaved while the machine rebooted was reproduced correctly on the screen, as far as I could tell. Gabriel commented that the code will be available, in free form, at the Cincom public Store repository in a few months.

Finally, it was the turn of Gerardo Richarte to show SqueakNOS. After booting from a USB device, he went on to show how hardware devices are programmed in simple terms. Although I do not remember the exact figures for each, Smalltalk device drivers for things like network cards, the mouse, the keyboard and so on were at most 100 methods and 300 LOC, and most methods were one-liners. Hardware interrupts are served by the image. Even the slides for the presentation ran in a SqueakNOS image running with no OS under it. What is more, at the end of the presentation Gerardo told us he was going to write a device driver for the IDE hard drive controller at port 0x1F0 and read data from the hard drive. In about 5 minutes he was done typing something like 16 accessor-like methods, and then invoked a read of sector 0 from drive 0, head 0, with command 0x20. 256 16-bit shorts came back, and the last 2 bytes were 0xAA55, which is what is expected of a boot sector.
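That trailing 0xAA55 is the standard boot sector signature. A minimal sketch of the check Gerardo's read made possible (in Python, on fabricated data rather than a real disk read):

```python
def is_boot_sector(sector):
    """A sector read via ATA command 0x20 comes back as 256 16-bit
    words, i.e. 512 bytes. The last two bytes of a valid boot sector
    are 0x55 then 0xAA (the little-endian word 0xAA55)."""
    return len(sector) == 512 and sector[510] == 0x55 and sector[511] == 0xAA

# Fabricated example: 510 zero bytes followed by the signature.
fake = bytes(510) + bytes([0x55, 0xAA])
print(is_boot_sector(fake))   # True
print(is_boot_sector(bytes(512)))  # False: no signature
```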

Something like 15 years ago I wrote a program to detect and mark bad and nearly-bad clusters on FAT partitions. I did it in Pascal, and I can't tell you the trouble I went through to get it working right. Here, we have a device driver for the controller written in 5 minutes...

Then came the closing ceremony. While the best talk votes were tallied, we ran a prize draw amongst the registered people still present. Lo and behold, the first person that came up was my sister! Sure enough, she was registered, but this was a ~1/200 chance... oh well, so we skipped her (besides, she wasn't there at the moment) and continued on. In this way we handed out 9 books: 3 books given by Pragma (one of our sponsors), and a set of 6 of my books (3 hashing books, 3 mentoring course books). As I stated at the conference, I'd like to thank ESUG for purchasing several of my books for their conference in Amsterdam.

Then came the prizes for the coding contest. Guillermo Amaral and Guido Chari claimed the iPod Touch given by Instantiations, and Diego Geffner claimed the MP4 player given by GeoAgris. Since the winners would have had to divide (I mean share) the iPod Touch, the winning pair also claimed some of the bookstore gift certificates given by Snoop Consulting.

Finally came the prizes to the best two talks of the conference. In second place came SqueakNOS by Gerardo Richarte, which was awarded half of the remaining bookstore gift certificates. In first place came Gabriel Honoré's Commodore 64 emulator, and he received the remainder of the bookstore gift certificates plus the original August 1981 Byte magazine donated by Diego Roig-Seigneur.

Well, we in the organization committee think this year's conference went quite well, but we are also sure there are things to improve. If you have any comments to make, please send them our way at smalltalks2008 at gmail dot com. We look forward to hearing from you.

See you next year at Smalltalks 2009!

Smalltalks 2008, Friday notes

Here are the notes for Friday at the conference. The day began with Hernán Wilkinson's Key Design Decisions presentation, in which he provocatively addressed a number of issues we are all very familiar with. He made the case for immutability of domain objects, full initialization of objects before they can be used (so e.g.: by the instance creation method as opposed to by the users of the class), and a number of others. This made such an impact that the presentation was heavily discussed over lunch.

Then we saw Claudio Acciaresi and Nicolás Butarelli's work on a thorough refactoring of the Collection hierarchy using Traits. It is interesting that while they saw several advantages to this (such as the elimination of code duplication and the possibility of creating more diverse collection classes easily), in the end they commented it was not a slam dunk, as Traits do come with their own complexity.

After the break, Carlos Ferro showed how ExpertCare (initially described by Dan Rozenfarb on Thursday) manages to make good question suggestions for telephone operators receiving health related phone calls. For example, it would be good if the system helped determine, in as few questions as possible, when to send an ambulance because of an emergency. It is not obvious how to do this because, as soon as one examines symptoms, the body systems they affect, and the syndromes they may imply, the choices are neither clear cut nor evident. Nevertheless, the strategies shown by Carlos allowed ExpertCare to detect an emergency with a median of 1 question, and a maximum averaging slightly over 2 questions.

Then came Guillermo Amaral's talk on percolation. He did not just implement a few algorithms. Rather, he built a tool to model solids as a lattice of points connected by arbitrary edge patterns, and then used several algorithms and procedures to determine the probability with which the material thus defined would allow liquids to pass through. Most impressive. To begin with, the tool was graphical and included visual representations of the lattices, the connecting edges, and the probability graphs (including choosing the color of the curves and combining graphs)... a lot of serious work, which led to the verification of possibly original conjectures in certain scenarios.

After lunch we had a persistence block. We started with Esteban Lorenzano, Mariano M. Peck and Germán Palacios' talk on SqueakDBX, an interface to the open source database library OpenDBX. The idea of OpenDBX is to allow access to a multitude of relational databases via a common interface. SqueakDBX is the Squeak interface to OpenDBX, and as such Squeak can now talk to Oracle, MySQL, PostgreSQL, and so on. This works so well that, for example, SqueakDBX is faster than Squeak's own native driver for PostgreSQL.

Something that will be added to SqueakDBX is support for Glorp, which was quite fitting because the presentation naturally blended with Alan Knight's talk on Glorp. We saw many of the features that make Glorp nice. For example, the mapping model allows mapping objects to rows, inlining objects in the row of another object (e.g.: for speed), or saving an object across many tables. Glorp can build queries by examining blocks such as [:person | person name = 'Alan'], and much more complicated block expressions are possible. On top of that, Alan described Hyperactive Records, which are used in Cincom's WebVelocity product.
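The underlying technique of block-based querying can be mimicked in other languages too. Here is a minimal sketch (in Python, with invented class names; Glorp's actual machinery is far richer): a proxy object records the comparisons applied to it instead of evaluating them, and the recording becomes SQL text.

```python
class Field:
    """Proxy that records a comparison instead of performing it."""
    def __init__(self, name):
        self.name = name

    def __eq__(self, value):
        # Returning SQL text instead of a boolean is the whole trick.
        return f"{self.name} = '{value}'"

class PersonQuery:
    name = Field('name')

# The Smalltalk block [:person | person name = 'Alan'] becomes:
where = PersonQuery.name == 'Alan'
print(where)  # name = 'Alan'
```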

After that, I ran the Coding Contest's final round. It went very well because this time, unlike last year, I didn't have trouble with multiple active HTTP servers in the same image. Maybe that's because the lesson I learned in 2007 made me put in a number of automatic measures to prevent it from happening...

  • Image packaging stops any existing HTTP server forcefully. If after a GC there are still instances of them, image building fails.
  • The packaged image startup sequence kills any existing HTTP server forcefully again.
Also, I was quite happy that the participants did not find any bug in the contest. This makes the finals stressful for the organizer as well: basically the finals are a software release, and the thing has to work. If that means you get to fix the bug right then and there, too bad. Fortunately it went smoothly.

And the participants? Their reactions to the changes in the finals were varied and interesting. Some were getting positive scores within 15 minutes. One finally got a positive score in the last minute. One cursed in frustration :). The results are as follows.
  1. Guillermo Amaral and Guido Chari, with over 26 million points.
  2. Hernán Wilkinson, with over 2 million points.
  3. Diego Geffner, with no certificate.
Congratulations to all of them!... although note that Hernán Wilkinson cannot get a prize due to being in the Organization Committee :). Therefore, the 2nd prize will be awarded to Diego Geffner.

See you in a bit for the last day of the conference!

Thursday, November 13, 2008

Smalltalks 2008, Thursday notes

Whoa, it's been an almost 20 hour day already, and I had not been able to sleep last night anyway. Organizer jitters, most likely. Here are some notes on Thursday's happenings at Smalltalks 2008.

The conference's opening was again in the hands of Hernán Wilkinson. The main point was that this event happens because our community shares the enjoyment of doing what we do. This goes from the UAI offering the conference's venue, to the sponsors offering the prizes for the contest (the finals are tomorrow!), to Diego Roig-Seigneur donating an original August 1981 Byte magazine to be given to the best presentation of the conference. There were plenty of jokes, and I was the receiver of one :)... since I will be the referee in the coding contest's finals, I became the infamous William Boo! We shall see if justice is served on the final round now...

Monty's keynote showcased a long list of successful applications (where successful is defined as 10+ years in production) written in Smalltalk that have a profound effect on our lives whether we are aware of them or not. Besides OOCL's container shipping application, Progressive's auto insurance rating, Adventa's chip manufacturing application, Key Technologies' food sorting machinery, the call center of Florida's power utility (running in a state that goes through hurricanes every year), and many others that I do not recall, there was again mention of one that I remember fondly... JP Morgan's Kapital.

I really enjoyed Dan Rozenfarb's talk on his expert system to handle patients at a medical call center. He went through many of the attempts that did not work, and that made the final achievements of e.g.: 99.3% correct seriousness evaluation all the more impressive (if the call is incorrectly assessed, then the software can suggest not to send an ambulance when one is definitely needed).

Next was a follow up on Zafiro by Andrés Poncelas. Zafiro is InfOil's application framework, which was presented at Smalltalks 2007. With the new improvements, InfOil uses Zafiro as a means to easily express domain objects and their relationships in their applications, which are used to manage a significant fraction of all the oil and gas produced in Argentina.

After lunch, we saw Gabriela Arévalo's presentation on Moose, an application designed to enhance the way in which developers can obtain a high level view of software they do not yet know intimately. It was interesting to see how the 7 +/- 2 rule applies everywhere, even to the diagrams Moose produces --- for example, one can use colors in Moose to represent different metrics obtained from the code, but after 5 colors they become difficult to understand because it is difficult to concentrate on so many colors at the same time.

Gabriel Cotelli showed Mercap's reflective report tool, which is used in XTrade to allow power users to produce ad-hoc reports in a controlled way without having to write Smalltalk scripts.

Bruno Brassesco came from Uruguay to show how he used Dolphin to deal with a really obtuse XML, WebServices, .NET and C# development environment that produced applications for banks in Central America. Basically the problem they were having was that they had to use a C# framework that executed a WebServices stack. The WebServices stack was structured in 3 layers (the presentation layer, the business layer, and the system primitives layer) on top of a set of .NET libraries that called the back end. Each layer of the WebServices stack called services in the same layer or a lower layer, starting from an original invocation from HTML. Eventually the back end was called and the results were transformed via XML transformation rules until the last transformation produced HTML from XML. The issue came when there was a problem somewhere. Let's say we know the presentation level service didn't finish. So they would insert debug steps before and after each service invocation to determine which service failed in the presentation layer. The debugger action of these debug steps? Send the developer an email.

Yes, you read that right.

No, I am not kidding.

So if you were debugging a web service with 3 service sub invocations, you'd add 4 email sending debug steps. Let's say you find that service number 2 is broken, because you only get two emails: one before step 1 and one before step 2, and then the thing crashes. That's great, because now you have to go to the business model XML file, find the service definition for service number 2, and add more email debugging steps to it to see where the problem was.

Eventually you have 3 huge XML files open, with lots of copy/paste going on, missing service definitions, unsent services, broken XML, etc etc etc. Egad. Bruno showed us a definition of a service with well in excess of 100 sub invocations. And each time you make a change, you have to kill the server, recompile all the files, upload the files, restart the whole thing, and try your test case again by hand hoping you'll be able to reproduce the failure. And, oh by the way, without proper file locking in an environment with 150-200 developers.

What he did was to use Dolphin to create an IDE for all this mess. Since the technology could not be changed, at least he was able to make the work far more bearable. His IDE brought things like senders and implementors to the XML files, automatic email probe management, and detection of problems before they actually happened, such as missing service declarations, broken files, etc. It was a sorry state of affairs for those in the development project... attrition was horrendous, and developers lasted 1.5 projects. However, the sheer insanity of the development process they had been forced to use had us laughing out loud many times during Bruno's presentation.

Then, Fernando Olivero and Juan Matías Burella showed us the master's thesis work they are preparing: using Croquet to assist in teaching object oriented programming. To do this, they have designed a language which is actually a subset of Self. They plan to introduce programming students to this language first, and then progressively introduce additional concepts and techniques, ending with the students learning something like Smalltalk. Most interesting!

After the last break, we saw roadmap presentations for Cincom and GemStone, given by Alan Knight and James Foster respectively. Not everybody is aware of the fact that Alan is an actual soccer referee, and since he knows his soccer he added several photos and videos to his presentation --- one such video played in an ActiveX control embedded via an innocent looking windowSpec inside a VisualWorks window!

James Foster followed on with an impeccable roadmap presentation including several demos of working technology such as Seaside and a new scaffolding framework (see here). I write impeccable quite purposefully, as GemStone presentations are always flawless. This, despite the fact that there was a nasty snafu and James' laptop ended up being left behind in the US by accident! But no worries: with little to no time to recover, there was essentially no evidence that anything had gone wrong. Such nice guys, these GemStone folks :).

After that, we had dinner in San Telmo and now finally this long day has come to an end. To be continued tomorrow...

Wednesday, November 12, 2008

Round numbers

Well, we're just a few hours away from the Smalltalks 2008 conference, and we have reached 230 registrations. This is a good number (sure, just as 10, haha). See you tomorrow!

Tuesday, November 11, 2008

Smalltalks 2008 coming up

The Smalltalks 2008 conference will begin this Thursday. The schedule is packed with high quality talks and very interesting topics. Just as last year, the response from the community has been amazing. We're above 220 registrations now, and they are still coming in.

One thing that is different from last year is that registration is needed to guarantee entry. Registration at the conference's website is now closed, so if you have not registered yet, please do so by sending an email to smalltalks2008 at gmail dot com. The rest of the information, such as the schedule and the details for the social dinner event on Thursday night, is available from the conference's website.

See you on Thursday!

Sunday, November 09, 2008

Smalltalks 2008 Coding Contest Qualifier Round Deadline

The qualifier round of the Smalltalks 2008 Coding Contest, with final round prizes including an iPod Touch, ends in a little less than 10 hours. If you have not done so yet, this would be a good time to send your score certificate to the contest's mailing list.

Good luck!

Tuesday, November 04, 2008

Comment about design

Recently I saw this post, which says that the ctrl+3 shortcut in Eclipse is a display of its good design. All I saw was a drop down with options in a screenshot. Since I could not leave a comment there (the comment form didn't work), I am commenting here.

It is not enough to say "this is good". One also has to be able to say why. For example, why is it that ctrl+3 is good design? It is good compared to what?

Without any rationale such propositions become unfalsifiable, and then it is not rational discourse anymore.

So, could anyone tell me why ctrl+3 resulting in that screenshot is an example of good design? Maybe there's something to learn, and it will be easier to do so if the rationale is spelled out explicitly.

Friday, October 31, 2008

Smalltalks 2008 Social Dinner Event

We will have a social dinner event during the Smalltalks 2008 conference. Since we will be making a reservation, it would be best if you add your name to the list. The menu will most likely include meat, although we will make sure the place is also vegetarian friendly.

See you there!

Wednesday, October 29, 2008

Smalltalks 2008 organization committee interview

Club Smalltalk just published an interview with the organizing committee behind the Smalltalks 2008 conference. Enjoy!

Tuesday, October 28, 2008

New prices for the books in Argentina

Well, Lulu just raised the prices of my books, and they are now $45 dollars each. Therefore, the new price for books bought from Argentina under the previously announced offer is $29 dollars (paying in dollars, not in pesos).

Tuesday, October 21, 2008

Another improvement for k-nucleotide

I had been thinking about k-nucleotide, one of the benchmarks in the Computer Language Shootout. Finally I couldn't stand it anymore and I had to see if I could make it go faster.

First I tried replacing the dictionary with a finite state model. Nice idea, and nice code too. But, alas, quite a bit slower, because now basically all the objects created were dumped into old space. Before, a scavenge would just get rid of the unnecessary strings. Also, regular objects store oop pointers while byte strings store plain bytes, so strings use new space more efficiently. Grrr...

Then I thought perhaps I could cause less GC activity. I did that by shamelessly reusing a buffer for the frequency counter. Then I looked at the time profiler and squished all the hashed collection growth away by presizing the hashed collections properly.
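For context, the heart of k-nucleotide is tallying the frequency of every length-k substring of a DNA sequence; the hashed collection being presized above is exactly this frequency table. Here is a rough Python sketch of that counting loop (an illustration only, not the VisualWorks submission):

```python
from collections import defaultdict

def kmer_frequencies(sequence, k):
    """Count every length-k substring (k-mer) of a DNA sequence."""
    counts = defaultdict(int)
    for i in range(len(sequence) - k + 1):
        counts[sequence[i:i + k]] += 1
    return counts

seq = "GGTATTTTAATTTATAGT"
freqs = kmer_frequencies(seq, 2)
total = len(seq) - 2 + 1  # number of 2-mers in the sequence
# report each 2-mer as a percentage of all 2-mers, most frequent first
for kmer, n in sorted(freqs.items(), key=lambda p: (-p[1], p[0])):
    print(f"{kmer} {100.0 * n / total:.3f}")
```

The benchmark's performance lives and dies by that dictionary: every iteration hashes a fresh substring, which is why GC pressure and hashed collection growth dominated the profile.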

Finally, I also checked out VisualWorks' string hash function performance with DNA sequences. The behavior was very good indeed, with proper chi^2 mod p values as the sample size increased.
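That chi^2 mod p check can be sketched in a few lines: reduce each string's hash modulo a prime table size and compare the observed bucket counts against the uniform expectation. A hypothetical illustration using Python's built-in hash rather than VisualWorks' string hash function:

```python
import random

def chi_squared_mod_p(items, p):
    """Chi-squared statistic for hash values reduced mod a prime p.
    For a well-behaved hash, the statistic stays near p - 1
    (the degrees of freedom) as the sample size grows."""
    buckets = [0] * p
    for item in items:
        buckets[hash(item) % p] += 1
    expected = len(items) / p
    return sum((n - expected) ** 2 / expected for n in buckets)

random.seed(42)
# random DNA-like strings as a stand-in for the benchmark's sequences
samples = ["".join(random.choice("ACGT") for _ in range(20))
           for _ in range(10000)]
stat = chi_squared_mod_p(samples, 101)
print(stat)  # should hover around 100 for a uniform hash
```

A statistic far above the degrees of freedom would signal clustering, which is exactly what slows hashed collections down.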

The bottom line is that the new k-nucleotide program runs 38% faster than the old one. Take that! :)

Update: alas, it only improved by 10% with the official load...

Sunday, October 19, 2008

This weekend

Sigh, my writing is still rusty in that I get tired after not too much time. However, production of new pages continues, and I do like that what does come out is pretty much in a finalized state.

This weekend I have written 14 new beautiful pages so far, and so the Fundamentals book draft is 196 pages long. I still have two subsections to go before I finish chapter 4 (On Inheritance), which is 75 pages right now. It looks like it should be 90-100 pages by the time I am done with it.

And then, chapter 5: On Polymorphism. Now I am worried. If inheritance merits ~100 pages, what will happen with polymorphism, one of my favorite techniques? Hmmm...

Sunday, October 12, 2008

Writing again

So now that some of my other projects have finally exited my work queue, I have time for writing. Today I wrote 6 new pages in about an hour. It feels good to see that I am not too rusty, given that I have not written since May or so. The fundamentals book draft is now 182 pages.

Friday, October 10, 2008

A game of chicken

It seems to me today's economic woes are the result of a game of chicken gone bad. In the first place, let's make the assumption that money represents a trust that an obligation will be repaid in the future. It follows that the value of money in circulation represents our expectation of our future output.

Usually this value is represented well by the amount of money flowing around. This is because with sane currencies, you hardly ever have to worry about 10% depreciation happening overnight, for example.

So let's say investor A thinks the future output will be higher than the currently predicted future output. So he invests in things, thus using money that has to be replenished. To some extent, this causes demand for more money. So, by his actions, investor A makes his prophecy a reality.

Investor B sees that and thinks that he cannot possibly be outdone. He also invests, and predicts future output higher than that of investor A. So now investor A is underinvested, and an arbitrage process keeps everybody on the same page.

Now, as long as everybody is investing, they cause more demand for money. How much is this demand? Well, by how much do the investors want to see growth? Invariably, the return is expressed as a minimum yearly percentage growth target. This is an exponential function, and therefore within a relatively short period of time their prediction will effectively outstrip future output.
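The gap that opens between a compounding target and more modest output growth is easy to demonstrate with made-up numbers. A small sketch, assuming output grows linearly while the return target compounds:

```python
def years_until_overtaken(output, output_growth, target, rate):
    """Years until a compounding return target outstrips linearly
    growing output. All figures are hypothetical illustration units."""
    years = 0
    while target <= output:
        output += output_growth   # linear: a fixed increment per year
        target *= 1 + rate        # exponential: a fixed percentage per year
        years += 1
    return years

# output starts at 100 and adds 5 a year; the return target starts
# at half that size and compounds at 10% a year
print(years_until_overtaken(100, 5, 50, 0.10))
```

Even starting from half the size of output, the 10% target overtakes it in well under two decades; raising the rate only shortens the fuse.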

Then we enter fantasy land. Because tell me my friend, when everybody is investing, do you have it in you to pull out? Few people do. The rest play a game of chicken, with ever rising stakes.

Eventually they are forced into the realization that it's impossible to keep investing, and they all crash.

So now we see the reverse game being played. Investor A sells, which leaves investor B overinvested. Investor B sells, leaving investor C overinvested, etc.

If we think the current market behavior is nonsense, so is the growing phase that precedes the crash. The issue is our desire for exponential growth. If we do not stop this ridiculous expectation, then today's problems will happen again.

PS: of course this was my blog's post number 666.

Thursday, October 09, 2008

News for Argentina

Well, a few months ago I said I was testing a mechanism to send my books to Argentina at a reduced price. It took longer than I thought, but a few days ago I received confirmation that the procedure works.

What you have to do is send the reduced price to my paypal account:

I also need the complete postal address to send the package to, including the recipient's full name. Since I cannot make corrections or check that everything is in order, this information has to be exactly as it should appear on the package.

The price of the two books that currently sell for 40 dollars + shipping is, for Argentina only, $25 dollars. This figure includes the shipping cost.

When sending money via Paypal, keep in mind that Paypal charges 3% if the payment is in the same currency, plus something like 5% if a conversion is needed, for example from pesos to dollars. Therefore,
  • Paying in dollars: send $25.75 dollars.
  • Paying in pesos: send the corresponding amount of pesos at the day's exchange rate, plus 8.15%. Needless to say, paying in dollars is recommended...
In the coming days Lulu will adjust their printing costs, so on or about October 28th prices may go up a bit. In principle Lulu will raise the current prices automatically according to how the printing cost changes, so I do not have to do anything on my part to keep things as they are. When this happens, however, I will adjust the price for Argentina proportionally.

I hope this serves, as far as possible, as a token of gratitude for the free education I received in Argentina.

PS1: if the address is not in Argentina, I reserve the right to return the money and not send anything.

PS2: I offer no guarantee that this opportunity will be available in the future, or for other books. It is best not to assume this will be so forever. I reserve the right to end this offer at any time, for any or no reason.

Tuesday, October 07, 2008

A thought on the side

Blogs, facebook, and all these sites that allow us to create a digital representation of ourselves can become our own socially acceptable tamagotchi.

Eww. Gross!

Monday, October 06, 2008

RBUIPainterWindowSpecTemplates 1.0

I just published an RB extension to the public Store repository. The package comment reads the following.

Without this extension, to add a windowSpec to an application model one has to either copy and paste an existing windowSpec method, or go to the canvas painter, look up the class already selected in the browser, and so on, in order to install a newly created canvas. So this extension adds an entry to the refactoring browser's class menu so that the RB can add blank windowSpec templates to application models. Once the template is there, one can go ahead and use the visual tab for application model window specs.


It is just an illusion

Can we stop our pain and suffering over something we never had and that never existed in the first place? Apparently not.

  • On top of the $700B bailout, "the Fed signaled it could increase the amount available through those loans to $900 billion by the end of the year, increasing the amount the Fed will loan through the program by $750 billion above its previous limit". Gee, good thing we really needed Congress to act when the Fed can do whatever it pleases. It makes me feel proud.
  • This article begins to point the finger in the right direction when it puts this problem in terms of psychology, but fails to carry the argument to its last consequences. Why does it have to be a psychological factor? Because there is nothing tangible behind currency, that is why. At the very least, there is certainly not enough to back the economic activity we see today. So, since this does not resist formal analysis, let's pretend and feel good about it at least. Except of course that, as happens with other alkaloids, there is always that crappy feeling after a high. Well, there you have it. Our economic dopamine receptors themselves have become fully resistant to more money. Piling on higher overdoses does not address the issue.
But what do I know, right? Nevertheless, in the end, what one does not manage to understand by comprehension, one is forced to understand by suffering. It seems to me there is a lot of learning being forced down everybody's throat right now. If we only dared to look at things differently...

Meanwhile, here's a bit of a reality check.

Saturday, October 04, 2008

Assessments 1.6

I just added a spawning feature to the checklist evaluators, acting on a suggestion by Stefan Schmiedl. Now you can get evaluators to open on any subset of the checklist hierarchy. Enjoy!

Thursday, October 02, 2008

Book price advisory

Lulu just circulated a message detailing an upcoming change in their pricing structure. This is mostly about an adjustment to account for the increase in the cost of raw materials:

  • price of printing per page
  • cost of binding a book
Furthermore, there is the distinction between small and large books, for some page limit that was not immediately obvious to me.

While the new prices are still subject to modification, the changes will become effective on October 28th. That means that on or about this date, I may decide to adjust the book prices too in order to match the new printing costs.

Wednesday, October 01, 2008

Smalltalks 2008 Coding Contest has begun

Ok, the coding contest for Smalltalks 2008 has begun. You can get the problem description and the materials to participate from the conference's web page.

... did we mention we have rewards this year?...

That's right. Thanks to our sponsors, we have very interesting prizes.

  • 1st prize: iPod Touch, courtesy of Instantiations.
  • 2nd prize: MP4/MP5 2GB Nexxtech K107, courtesy of GeoAgris.
  • 3rd prize: Gift card at Yenny Bookstores, courtesy of Snoop Consulting.
Enticing? Feeling a bit antsy perhaps? Then participate! You can play at the finals remotely this year, so really... there are no excuses. Do it!

Finally, the contest's mailing list is the Smalltalks 2008 Coding Contest group at Google Groups.

Tuesday, September 30, 2008

Smalltalks 2008 Coding Contest about to begin

In about 12 hours or so, the Smalltalks 2008 Coding Contest will become available at the Smalltalks 2008 conference web site. Good luck!

The new to do list

Some months ago I wrote that my to do list had STS 2008, then ESUG, then the conference in Argentina, plus Assessments and on top of that I had to add the coding contest for Smalltalks 2008.

I am happy to say that all those items have been taken care of. So now I have the following list of items...

  • Host the Smalltalks 2008 Conference, including the coding contest which starts tomorrow.
  • Prepare a new talk for Smalltalks 2008.
  • Schedule talks at UNLP while I am in Argentina.
  • Write books.
Now, on the part regarding writing books... what's in the queue?
  • The Fundamentals book draft is still at 170 or so pages. Clearly it needs to grow.
  • Start preparing the second edition of the hash book --- yes, thanks to the valuable work of some of my readers, there needs to be a second edition with even more material. This will probably take at least a year of calendar time.
  • I need to review the mentoring course book because apparently the baseline implementation of SUnit Based Validation has drifted from the one referenced in the book.
  • I need to start preparing a book about all this coding contest activity.
Finally, I should spend some time and get my ANSI Smalltalk SEPs going.

It never ends, really. But it's so much fun!

100% yearly gains, riiiight...

See in The Register: hard drive manufacturers said to want to stay on track with a 100% yearly increase in areal densities.

Of course, another exponential function, like compound interest. As if either of them were sustainable... but apparently that is not yet clear to everybody. So, how many years until we can store the Universe on a hard drive? Can we be realistic, please?
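Just to illustrate how fast a 100% yearly rate compounds, here is a toy calculation with assumed figures:

```python
def years_to_reach(start_tb, target_tb, yearly_growth=1.0):
    """Years of compounding before a drive of start_tb terabytes
    reaches target_tb, at the stated 100% (doubling) yearly growth.
    All figures are illustrative assumptions."""
    years = 0
    capacity = start_tb
    while capacity < target_tb:
        capacity *= 1 + yearly_growth
        years += 1
    return years

# starting from a hypothetical 1 TB drive, how long until a single
# drive could hold a zettabyte (10^9 TB)?
print(years_to_reach(1, 10**9))
```

Doubling gets from one terabyte to a zettabyte in about 30 years, which is exactly why a permanent 100% yearly rate should raise eyebrows.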

Monday, September 29, 2008

Assessments 1.5

This one has a fix so that trying to browse a bridged class opens the bridged class as opposed to the associated metaclass bridge class.

Also, the references to SUnitVM's base classes have been updated.

Assessments 1.4

The following changes have been made based on feedback from Stefan Schmiedl.

  • Overhauled how results are browsed from the result windows.
  • Added more tests for the checklist evaluators (to verify that the code from the previous item works).
  • Fixed an unintended feature where it was possible to browse the class of prerequisite failures from the result windows.
  • Brought back the [Refresh] button for checklist evaluators, while leaving automatic refresh enabled.
  • Improved how prerequisites print themselves.

Smalltalks 2008 Coding Contest Advisory

The Smalltalks 2008 Coding Contest starts in 48 hours.

Good luck!

No more $700 billion for you

Now that the bill failed, I have an idea of what might work. How about, instead of giving the bankers the $700 billion, we just go out to the people holding the bogus mortgages and use those $700 billion to make payments on them? Certainly that would drive up the value, no?

But then bankers benefit, and they should incur losses nonetheless. So here is a slight modification.

  1. We let individual houses go bankrupt, in a business as usual (BAU) manner.
  2. When the houses go through the repossession process, we insert a clause saying a certain fund has priority in buying the property at a fair market value. For example, it goes through auction, and if the sale price does not meet the reserve price, then the fund gets the property and issues a loan for the sale price to the people previously living in the house.
I am sure this needs some adjustment to cope with ill intent, but if we are going to spend $700 billion we do not have, we might as well spend it on ourselves. Eventually, that $700 billion will end up at a deposit window somewhere, so banks cannot really complain one way or the other.

Sunday, September 28, 2008

Another POV

So now we're set to give banks $700 billion. In exchange, we're getting paper not worth the $700 billion. But we do not have any savings account with $700 billion, so we will pay interest on that. To whom? If I understand things right, to the Federal Reserve, a private corporation whose board is composed of the bankers receiving the $700 billion.

Sounds bad, right? And even if I were wrong and it is some other third party receiving the interest on the $700 billion, one thing is for sure. We simply cannot pay ourselves back, because we did not have any of that money to begin with.

Really. In this context, I go around and read quotes such as the following.

"We sent a message to Wall Street - the party is over"

"People have to know that this isn't about a bailout of Wall Street. It's a buy-in so we can turn our economy around"

"Nobody wants to have to support this bill, but it's a bill that we believe will avert the crisis that's out there"

"We begin with a very important task, a task to stabilize the markets, to protect all Americans - and do it in a way that protects the taxpayer to the maximum extent possible"

Is it me, or do these quotes have nothing to do with the situation? Exactly who is calling the shots here? Let me quickly go over something. The people that owe money do exactly what the people that lent the money tell them to do. And it is us who owe tons of money. $700 billion is nothing compared to the national debt, so sorry, it is not us who get to say what happens.

No my friend. To me, these quotes above sound like what government types in Argentina would say. I know. I used to hear them all the time, and saw the consequences first hand. So, given that we know what happened to them...

... oh...

Land of the brave, sure. But the free? I do not think so.

Saturday, September 27, 2008

Assessments 1.3

I just fixed a problem where the results would get sorted over and over again in the results UI. A particular benchmark with 538 results went from 237 time profiler samples to 1 sample. Enjoy!

Assessments 1.2

Stefan removed the Refresh button in the checklist evaluator. Good riddance!

Assessments 1.1

Stefan Schmiedl contributed Announcement support for Assessments. Enjoy!

Wednesday, September 24, 2008

More contrast

Comparisons are great, check this out.

The original.

What could be said about it.

Really. All the copyright persecution for this???...

Monday, September 22, 2008

No more betas for Assessments

I now declare Assessments to have reached version 1.0.

Assessments 1.0 beta 17

Stefan asked for result windows to propagate evaluation result updates back to the evaluator windows. Done!

Sunday, September 21, 2008

About money and the current state of affairs

We cannot pay our loans if we do not produce enough goods of any kind that maintain the trust of the people from whom we borrow. Because we are so short sighted that we only look at quarterly profits and so on, what happens is that there is a strong motivation to have people consume. The issue should be obvious by now: consumers do not produce valuable goods.

In some extreme form of efficiency, one could have robots produce things of value that people would consume. But even then there is no escape, because an ever increasing population with an ever increasing desire to obtain exponentially growing profits will ultimately be limited by the finite natural resources available to keep the machinery churning. In other words,

by stressing short term realization of profits, we are sacrificing our long term ability to produce things of value.

What are these things of value? Well, like, food and shelter for example. Or the knowledge and applied know how that will keep our standard of living above that of cavemen.

Really: who cares about the 700 billion, or even what is the number. The real problem is that we're living off the promise that the future will always be bigger and better, and that is what should be called into question.

Saturday, September 20, 2008

VW 7.6's hashing machinery strikes again

I am finishing the Smalltalks 2008 Coding Contest. Something I just did was to move it from my 7.4.1 images to a 7.6 image. In particular, something I was looking forward to was to make use of the revamped hashing machinery. And it did not disappoint.

A section of code that was running particularly slow on 7.4.1 runs ~2.5x faster on 7.6 without any code changes.

Now... where was I?... ah yes. More code to write.

Thursday, September 18, 2008

ReferenceFinder 1.4

Also, I just updated the ReferenceFinder so that it integrates well with Trippy (7.7x builds, but IIRC it should work on 7.6 too). Have fun!

Update: I added a few refactorings I had in my local image... version 1.5 now.

Hash Analysis Tool 3.24

I just fixed a leftover implementor of inspectorActions that was causing problems. Enjoy!

Assessments 1.0 beta 16

I just found a couple small problems in the SUnit execution policy management, and the fixes are now published in the public Store repository. Enjoy!

Computer Language Shootout, k-nucleotide

I just improved Eliot Miranda's earlier k-nucleotide submission to the Computer Language Shootout by changing it so that it uses the default VW 7.6 hashing mechanisms. With this smallest of changes, it runs 2x faster.

Also, I published a new bundle called ComputerLanguageShootout to the public store repository. Want to tackle another of the benchmarks? It seems to me it should not be too hard to improve the current benchmarks significantly.

Latest news on the Smalltalks 2008 conference

We would like to share the latest news about the Smalltalks 2008 conference.

1. We have opened the submission process for talks. The URL is here. The form can be found under the section "Talks". We are looking forward to hearing about different types of presentations, whether they be industry, research or education related. The submission deadline is October 13th.

2. Furthermore, we have also opened the submission process for tutorials. The URL is the same as above, only the form is under the section "Tutorials". The deadline is also October 13th.

3. Finally, we would like to remind you that the coding contest rules and regulations, as well as the problem, will be published on October 1st. For more information check the section "Coding Contest" in the conference's web site.

We look forward to seeing you at the conference!
Smalltalks 2008 Organization Committee

Monday, September 15, 2008

FTP down for a bit...

Scheduled maintenance... it will be back up in a bit.

Update: back up now.

Rationalization of FTP accounts

I have just reorganized the Smalltalk related accounts on the FTP server. Now there is only one:

  • Server:
  • User: smalltalk
  • Password: now
Inside you will find a number of videos, a few papers, and the stuff I did for the 2006 and 2007 STS Coding Contests. This includes the original source code of my submission in 2006 (both qualifiers and finals), as well as the complete source code for the 2007 one I organized. These are direct applications of the pattern of perception so you can see it in action. Note that looking at the coding contest code too soon will invalidate the exercises of the mentoring course book's chapter 6. You have been warned :).

Sunday, September 14, 2008

Smalltalks 2008 Coding Contest getting close

Hello, my friend... so, here's the deal. If all goes well, the Smalltalks 2008 Coding Contest will begin on October 1st. As usual, there will be a qualifier round to get to the finals at the conference. However, this time things will be different.

  • Anybody that completes the qualifier round goes to the finals, and
  • Anybody can participate in the final round.
That's right, you don't even have to be at the conference to play at the finals.

Antsy to tackle a 100% original problem? Stay tuned...

Saturday, September 13, 2008

Train drivers do not "fail to heed a stop signal"

There was a recent train crash in Los Angeles. A report just came out, and it claims that the reason for the accident is human error:

Friday's two-train collision killed 25 people and injured more than 130 others near Los Angeles after an engineer failed to heed a stop signal, a spokeswoman for Metrolink commuter trains said.

Ok, hold on now. Trains, or train conductors, do not "fail to heed a stop signal" and just run past red lights. For example, in subways, signals have something like a ski next to them on the ground by the track. When the signal is red, the ski rises. If the train runs past the red signal, the ski pushes a lever on the train, which forcefully activates the brakes and stops the train. In other words, stop signals are driver-failure safe.

You can see this in action in NYC's and NJ's subways (and in Japan, and in Spain). The subways in Buenos Aires have the same feature. Ever wonder why no driver ever dares to run past a red signal, not even by an inch? Now you know why.

These mechanisms are well known, see here for a book published in 1915 talking about such devices. So now, how is it that human error on the part of an engineer or a driver can be the sole source of fault in the Los Angeles train crash? No my friend, there is more than that at play. For example:
  • Does the signal system have forceful braking measures to deal with trains which, for whatever reason, run past a red signal?
  • If the signal system includes forceful braking measures, why didn't they work?
Don't come tell me a single person committing a single mistake (such as text messaging), or perhaps having a sudden medical problem such as a heart attack, can cause large scale loss of human life in mass transportation systems like trains. The bottom line is that you just don't run those systems with no tolerance for failure, because the result is that a lot of people die.

But... if it is indeed the case that the train's signals are not failure safe, then the signals should be upgraded so they become driver-failure safe, and then we should ask exactly who is responsible for the gross omission and make sure they are held accountable for murder by negligence.

And by the way, this does not make me feel any better:

Tim Smith, state chairman of the Brotherhood of Locomotive Engineers and Trainmen, a union representing engineers and conductors, said issues that could factor into the crash investigation could be faulty signals along the track or engineer fatigue.

He said engineers in California are limited to 12 hours a day running a train, although that can be broken up over a stretch as long as 18 hours.

Wednesday, September 10, 2008

Interview by Club Smalltalk

The folks at Club Smalltalk just published an interview they did with me over the last few days. I'd like to thank Hernán Galante for the large amount of work he invests in the site. Go Club Smalltalk!

Anecdote from Amsterdam

I was in the taxi to Schiphol Airport on the way home, and the driver had the radio turned on. The song was U2's Where The Streets Have No Name. I asked the guy if it was ok for me to whistle. He said yes. So I did. After a while, he asked me

Are you a professional?

Hah --- incredible!!!

Tuesday, September 09, 2008

Now we're beginning to talk

A while ago I wrote that I didn't see SSDs replacing common HDDs for a while. I think this drive, however, shows that the while has elapsed. With a bit more refinement and larger capacities, it seems to me that now there is a more proper replacement for the hard drive.

The next observation I'd make on Intel's design, which uses 10 parallel read/write channels with a recombination buffer, is that with the tiniest of efforts one could make a failure tolerant hardware RAID version of the drive in the same enclosure.

Or, seen from a different point of view, Intel's approach is analogous to using 10 way RAID striping within a single drive. Therefore, perhaps it's just a matter of time before 10 way RAID striping / mirroring is also available... or, maybe by combining something like AMD's Fusion (or an equivalent) with a sufficiently large memory buffer, one could have n-way RAID 5/6 in a (perhaps largish but yet compact) single enclosure.

If one could, in addition, physically separate the board holding the flash memory from the board having the controller, then replacing fried controllers while keeping the data intact would also be possible with ease.

What I have not seen yet is a comparison of reliability under long term heavy load between SSDs and HDDs. In particular, how much data can be written to an HDD before it fails? How does that compare to today's SSDs?

Assume an HDD with 10^6 hours of MTBF. How much data can it write at a conservative average of 50 MB/sec? It's a staggering figure, really. At ~176 GB an hour, the number is ~176 petabytes (or pebibytes, if you prefer). Note that this is independent of the capacity of the drive, as it is only bound by the MTBF.

Then, assume further that Intel has not yet improved the 10k cycles per memory cell reliability of flash memory. Therefore, a 160 GB SSD drive would be able to write no more than about 1.6 petabytes before it fails. Note that in this case this number is bound by the cell reliability and the capacity of the drive, not time per se.
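The back-of-the-envelope figures above can be written out explicitly. This sketch uses decimal units throughout, which is why the HDD number lands at 180 PB rather than the ~176 quoted above with binary-flavored units:

```python
def hdd_lifetime_writes_pb(mtbf_hours, write_rate_mb_s):
    """Total data an HDD could write over its MTBF at a sustained rate,
    in (decimal) petabytes. Bound only by time, not capacity."""
    bytes_total = mtbf_hours * 3600 * write_rate_mb_s * 10**6
    return bytes_total / 10**15

def ssd_lifetime_writes_pb(capacity_gb, cycles_per_cell):
    """Total data an SSD can absorb before wearing out its cells, in
    petabytes. Ignores wear leveling overhead and write amplification."""
    bytes_total = capacity_gb * 10**9 * cycles_per_cell
    return bytes_total / 10**15

print(hdd_lifetime_writes_pb(10**6, 50))   # 10^6 h MTBF at 50 MB/sec
print(ssd_lifetime_writes_pb(160, 10**4))  # 160 GB drive, 10k cycles/cell
```

Note how the two bounds have different shapes: the HDD figure does not depend on capacity at all, while the SSD figure scales linearly with it, so larger SSDs automatically gain endurance.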

So... one or two orders of magnitude improvement in the reliability of flash memory, and the hard drive becomes completely obsolete. Easier said than done, I am sure. Hopefully soon.