Sunday, June 26, 2016

Reliable email matters

Many of today's issues with software ultimately cause unreliable service.  Software's popularity does not seem greatly influenced by reliability, so the audience seems to tolerate the situation.  However, when unreliability becomes the norm, the resulting ecosystem is one in which nothing works as advertised.  You have effectively no recourse other than to roll your own, become a system administrator, or put up with it.

This kind of environment directly limits what you can accomplish in life.  Take for instance email.  Although delivery was never guaranteed, at least you had some chance to track down problems and there seemed to be a general willingness to ensure correct transmission.  Today, emails simply vanish with no explanation, and you're not supposed to know what has happened.  After some debugging, the best working hypothesis for the latest occurrence is as follows:

Comcast silently refuses to deliver emails to you when they contain your own email address.

To verify this hypothesis, I sent myself emails with just "" in the message body.  These emails did not bounce, did not show up in a junk email folder, and were not delivered.  But emails reading "", with the last 't' missing, were delivered.
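A test like this can be scripted so the two messages differ only in the body.  What follows is a minimal sketch, not the procedure actually used here: the function names `build_probe_pair` and `send_probes`, the placeholder address, and the SMTP host, port, and STARTTLS login flow are all assumptions for illustration.  Each message carries its own Message-ID and Date header so that it can be identified later.

```python
from email.message import EmailMessage
from email.utils import formatdate, make_msgid
import smtplib


def build_probe_pair(address):
    """Return (full, truncated) test messages addressed to `address`.

    The first body is exactly the address; the second drops its last
    character.  Both names and the layout are hypothetical.
    """
    probes = []
    for label, body in (("full", address), ("truncated", address[:-1])):
        msg = EmailMessage()
        msg["From"] = address
        msg["To"] = address
        msg["Subject"] = f"delivery probe ({label})"
        msg["Date"] = formatdate(localtime=True)
        msg["Message-ID"] = make_msgid()  # unique ID to quote when asking about a message
        msg.set_content(body)
        probes.append(msg)
    return tuple(probes)


def send_probes(probes, host, port=587, user=None, password=None):
    """Send each probe; map its Message-ID to the recipients the server refused.

    Assumes a submission server that supports STARTTLS (an assumption,
    not something known about any particular provider).
    """
    refused = {}
    with smtplib.SMTP(host, port) as smtp:
        smtp.starttls()
        if user is not None:
            smtp.login(user, password)
        for msg in probes:
            refused[msg["Message-ID"]] = smtp.send_message(msg)
    return refused


# Placeholder address; substitute the mailbox under test.
full, truncated = build_probe_pair("someone@example.net")
```

An empty result from `send_probes` only means the server accepted the messages for delivery.  It says nothing about whether they reach the inbox, which is exactly the gap described above.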

The usual excuse, that aggressive spam filtering is a necessary evil, doesn't cut it in this case.  Someone replies to you, and the text says "at some point, wrote:".  Or someone comments on a forwarded email of yours that reads "From:".  Emails containing these ubiquitous, well-established body patterns are being dropped without notice.

This new form of unreliability started at least a few weeks ago.  Comcast's first attempt to resolve the issue was to unilaterally reset my password on a Saturday, while stating that the department taking action does not work on weekends.  When resetting a password predictably didn't fix the delivery problem, Comcast's final position was that I should complain to Thunderbird, Gmail, and several other ISPs and email client makers about their evident, collective, and synchronized fault.

The side effects of unreliable software are allowed to spread unchecked in part because, in an unknowable and incomprehensible software world, naturally there is no responsibility and thus no recourse.  Hence, the above diagnosis is merely a best working hypothesis.  Occam's razor suggests the email problem is Comcast's fault.  But how do you find where the problem actually is without access, much less appropriate support?

I don't think this will get any better as long as software and derived services can be sold without any form of accountability whatsoever.  Consequently, until then, protecting yourself from unreliability is up to you.  In the case of email, that means implementing and/or managing your own email server.  But where does that road end?  Email is hardly a top reliability concern.  The go-it-alone approach does not scale.


Stefan Schmiedl said...

If Comcast were willing to actually look at this problem, you could get somebody with access to their own mail server logs to send you these two test messages and provide the timestamps, the message IDs, and the replies from Comcast's incoming mail server.

Assuming reasonably competent service personnel, they should be able to determine whether these incoming messages were actually delivered to your mailbox.

Of course, this means work and carries the chance of having to admit an error, so it's much easier to lay the blame elsewhere.

Andrés said...

I agree... and of course that's just the tip of the iceberg. I'm sure you can think of examples; here are a couple that come to mind. What happens when your retirement plan evaporates because someone implements a millisecond trading algorithm wrong? Or what happens when you get to your hospital's emergency room, which has just had all records made unavailable by cryptographic ransomware? And what happens to you if the hospital refuses to pay? Or how about Toyota implementing a 1kb embedded processor by putting the stack in the middle of the address space, so a stack overflow means the drive-by-wire system thinks you're pushing the gas pedal to the floor?

It's gotten so bad you can't even have a conversation over email --- but somehow I should feel good I've helped train some random algorithm to recognize palm trees to "prove I am not a robot". How does any of this actually help one's life?