Sunday, December 07, 2008

Implicit receivers, v3.0

It seems to me that what I said in my original post has taken on a life of its own and is now evolving into something that I do not think I ever meant to say. So I thought I'd offer the following clarification.

What I do not like about implicit receivers, as implemented in a number of languages, is that their design is guided by a desire for economy of writing. This dislike, personal and subjective as it is, comes from my own experience: pushing too strongly to write less leads to code that becomes unnecessarily hard to read as time goes by. So, over the years I've changed my style and choice of wording with reading speed in mind.

In other words, given that the usual statistic is that programmers spend something like 90% of their time reading code, I'd rather optimize that instead of the fraction of a second that I need to type "self" every so often.

This is the whole point of what I wrote in the first place. A matter of writing code designed with some specific reading goals in mind, and nothing else.

Now, given this, what I have read and heard gives me the impression that I am perceived as going after the whole idea of having a receiver other than self, or something like that. If I understand correctly, Newspeak uses the possibility of implicit receivers other than self to access behavior from an enclosing class. As far as I can see, that would be roughly equivalent to using super, but the idea of super does not seem totally adequate for the purposes of Newspeak (or Self... what does super mean when you don't have a class hierarchy?), so we have implicit receivers and messages can naturally flow to either self or outer.

Again, as far as I can see, this arrangement is used for the sake of modularity in Newspeak. I certainly sympathize with Gilad's take on modularity. Globals, singletons and the whole lot can be an invitation to poorly written code, and I've also seen more than my fair share of that. This was clear to me back at the keynote presentation at Smalltalk Solutions 2008 when I heard Gilad's opinions first hand, so I think any perceived disagreement of goals regarding this matter should be dispelled.

None of this changes my view that the current implicit receiver arrangement favors economy of writing over what I'd describe as a more explicit expression of intention. This may be a matter of taste, and I also hope it is clear that this may just amount to a personal opinion.

So, now the question becomes: is this assessment not reasonable? Is it unfair to state that Newspeak's implicit receiver grammar is designed with code size in mind? I think not, because Vassili's latest post points to a paper by Gilad in which the matter of implicit receiver design is explained. And as it turns out, the way in which implicit receivers are implemented has a lot to do with brevity.

Page 6 of Gilad's paper has a section entitled Unambiguous Interpretations, and in it there is a list of four choices for implicit receiver design. The first item describes the pros and cons of using Smalltalk's approach, which is to require an explicit receiver for every message send. This is said to solve the problem of ambiguity, however,

[...] Newspeak (unlike Smalltalk) is an entirely message-based language. It would be unduly burdensome to have to specify an explicit receiver for every message send in a language where everything is a message send. While mandating an explicit receiver may be a reasonable approach in some languages, it is not appropriate for Newspeak.

It is clear that this decision is at least influenced by a desire for terseness. Not that I dislike this per se, as I also try to write code that is not unnecessarily long. Nevertheless, my observation is that the use of implicit receivers (probably) makes determining the receiver of a message more difficult, because now the first "word" in a "statement" has to be scanned to its end to see whether it ends with $: or '::'. Only then can the reader decide whether it is a receiver or not.

It so happens that I prefer prefixes to suffixes in this case, hence my concern with making the grammar more complex for the sake of writing less.
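To make the scanning argument concrete, here is a minimal Python sketch. It is only an illustration of the point above, not Newspeak's actual grammar: the function name `classify_first_word` and the three labels are made up for this example, and whitespace splitting stands in for a real tokenizer.

```python
def classify_first_word(statement: str) -> str:
    """Classify the first word of a statement by its suffix.

    Hypothetical helper: it only mirrors the observation that the
    first word must be read to its end before the reader knows
    whether it names a receiver or is already part of a selector.
    """
    words = statement.split()
    if not words:
        raise ValueError("empty statement")
    first = words[0]
    # '::' must be tested before ':' because it also ends with ':'.
    if first.endswith("::"):
        return "selector (ends with ::)"
    if first.endswith(":"):
        return "selector (ends with :)"
    return "receiver or unary selector"
```

With an explicit receiver, as in `self at: 1 put: 2`, the first word is known to be the receiver by position alone. With an implicit one, as in `at: 1 put: 2`, the decision depends on the last character of that word.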

Now, of course a prefix/suffix problem also occurs in Smalltalk when one has to determine the meaning of a string of "words". Is it a unary message chain or a keyword message? However, at this point the receiver has already been read because its position is fixed in a sentence, so what is being seen past that can be assumed to be a message of some sort.

Thus, since my experience has been that attempting to aggressively optimize how programs are written so that they are shorter is counterproductive in the long run, I tend not to like this as much, at least at first sight.

But what is the definition of aggressively, and what constitutes too much of it? Good question. I wish I knew how to unequivocally quantify this in general, or at least for this case in particular. Alas, other than a somewhat mild suggestion to consider the consequences of terseness from the point of view of the 7 +/- 2 guideline, I do not have a good answer to offer.

Finally, I would like to make it clear that I do not have experience with Newspeak or Self. Please point out any technical inaccuracies in the discussion above so I can fix them. Thanks in advance.

2 comments:

Vassili said...

I think you are misinterpreting the snippet you quote because you ignore the ultimate design intent of implicit receivers.

The intent is making everything a message send. No variables. Every time an expression begins with "foo", "foo" is a message send. This is Good because it hides the implementation of "foo." We don't care if foo is fetched from a variable or computed by a method. (Additionally, in Newspeak we don't care if that method is in the receiver or the outer scope). We can transparently change it whenever we want.

Now, if we make this change we end up with having to begin every expression with self (with the exception of super and literals). We can't just get anything from a variable, we have to ask self for it first. This pollutes the code with lots of redundant selves, so it's natural to make self optional because nearly everything is sent to it anyway.

So it's not about code size, it's about representation independence. As a necessary side effect, it's also about eliminating redundant code, but even that is not at all the same as a quest for terseness.
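As a rough analogy for the representation independence described here, the following Python sketch uses a property so that the caller cannot tell a stored value from a computed one. This is a Python stand-in, not Newspeak code; the class names `StoredPoint` and `ComputedPoint` and the function `describe` are invented for the illustration.

```python
class StoredPoint:
    """Keeps norm_squared in a plain instance variable."""

    def __init__(self, x, y):
        self.x = x
        self.y = y
        self.norm_squared = x * x + y * y  # fetched from storage


class ComputedPoint:
    """Computes norm_squared on demand; callers are unaffected."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    @property
    def norm_squared(self):
        return self.x * self.x + self.y * self.y  # computed by a method


def describe(point):
    # The caller only asks for norm_squared; it neither knows nor
    # cares whether the value is stored or computed.
    return f"norm squared is {point.norm_squared}"
```

Both `describe(StoredPoint(3, 4))` and `describe(ComputedPoint(3, 4))` go through the same access expression, so the implementation can change without touching the caller.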

Andres said...

Vassili,

To me, one thing does not follow from the other. The particular fashion in which receivers are made implicit (i.e., choice n of the k available in Gilad's paper) does not necessarily relate to whether the use of implicit receivers is useful or not.

I think this has been really hard for me to state in a way that does not lead some people to assume I am saying something else. I do not understand how it happens, but clearly it is there. It is disappointing.

But back to the issue. For example... if the resulting syntax with explicit receivers is too verbose, one reaction could be "let's make the obvious implicit so we can just delete it". Sure, why not. The other side of that is: well, what is it about the design of the language that made a whole bunch of stuff obvious most of the time? Is it a plus for the language design? Or are there other routes to address this problem that have been left unaddressed or unexplored?

Perhaps the goal is to have something that is similar to Smalltalk (grammar/syntax wise). But isn't the fact that you can have potential ambiguity a problem, or a symptom of something else? I am just curious as to what the thoughts were at the time.

Andres.