Tuesday, August 28, 2007

Hello, Cadence

Permutation work?... interesting!

Sunday, August 26, 2007

Vista comment overheard in the IRC channel

Check this out.

if you're playing music on vista, your network incoming [throughput] drops by 90%

Aha. So why would I want to spend money on a new, ultra-powerful computer to run Vista?

Saturday, August 25, 2007

Is this college mathematics?

Take a look: multiplication tables linked from a page mentioning courses such as "college" algebra.

You have got to be kidding me. Multiplication tables with suggestions such as "you should memorize the gray area" and "remember 0 x anything = 0", with the official stamp of a college. College, you know? That place where people go after attending high school, which is where they go after attending grade school, which is where they learn how to multiply?

But there it is, it's reality. Shameful as this is, it's the mirror that shows us for who we are.

So... why do we even think we are prepared to discuss pretty much anything now? Seriously: what does it mean when people say "25% of 600,000 bridges carry more weight than they're designed to hold"?

You had better be able to answer that kind of question properly, whether you like mathematics or not. Or do you want to be dead now?

Gross confusion

So here I am, doing research. I go through Knuth's book, and in the particular section I am reading he suggests hash(x) = ax, where the factor a has to be chosen properly: it must lie within certain ranges, and each of its bytes must not be too close to any of the others or their complements.

Then he gives an example of a value of a, for a decimal computer in which the bytes can have values from 0 to 99. In "hundred"-decimal notation, it would be as shown below.

a = 61 25 42 33 71

Now, certainly, if we are working with a decimal computer, then yes... each byte is not too close to any of the others or their complements. But, on a binary computer, this looks quite different. In hex, the value above is 33 bits long.

a = 1 6D 1A 8B 0B

So first, it doesn't even fit in 32 bits, and second, its bytes are not necessarily far away from each other or their complements. That is fine because the value of a was chosen for a decimal computer, not a binary one.
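The arithmetic is easy to check. Here is a quick sketch in Python (the variable names are mine, just for illustration):

```python
# Knuth's example multiplier, given as base-100 "bytes",
# most significant first.
digits_base_100 = [61, 25, 42, 33, 71]

a = 0
for d in digits_base_100:
    a = a * 100 + d

print(a)               # 6125423371
print(hex(a))          # 0x16d1a8b0b
print(a.bit_length())  # 33 -- too long for 32 bits
```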

Want to cry? The value of a can be found in .h files via Google, even though it typically does not fit in 32-bit variables. Also, this choice of a has been roundly criticized for giving poor hash value distribution on binary computers. Well, of course --- it was never meant for a binary computer in the first place.


An update on the book

So... what does the to do list look like these days?

Chapter 6 on AGC is currently half written. Not much is being done on it at this time.

Chapter 7, on hashing, is progressing nicely on the other hand. I had originally estimated that Chapter 7 would be around 50 pages. Then I realized I was wrong and estimated 100 pages. It is currently 92 pages, and I would say it is no more than 50% written, probably less. Therefore, I have to increase the estimate for the chapter once more, to 200 pages. In addition, since I've run into problems such as "there is no written information regarding this topic", I've had to research and write at the same time. Although it is a lot of fun, this has slowed the pace of what would otherwise have been straightforward writing. Oh well --- better get going!

Finally, the book gained an appendix. I hope the contents will be a very pleasant surprise.

The draft is currently 449 pages, and the pdf file no longer fits on a 3.5 inch floppy. There are about 1.1 megabytes of LaTeX source code. There are no illustrations yet.

Time to write some more now.

Friday, August 24, 2007

A Pattern of Perception in action

Poor goalkeeper... there he goes, assuming, as his strategy builds toward its objectives, that nothing will change. We live a fraction of a second in the past.

Wednesday, August 22, 2007

Fun with Intel x87 FSIN

For some arguments, asking an Intel x87 FPU to calculate the sine of a double precision floating point number, which has 53 bits in its mantissa, will produce a double precision result whose mantissa has its lower 40 bits set to zero. In other words, you get no more than 13 significant bits' worth of an answer.

Check the answer to this expression for the very nice looking multiples of pi that cause this to happen:

(-1048576 to: 1048576) select: [:any | (((Double pi * any asDouble) sin mantissa: 53) bitAnd: 16rFFFFFFFFFF) = 0]

And a few fractions as well...

(-1048576 to: 1048576) select: [:any | (((Double pi / 1048576.0d * any asDouble) sin mantissa: 53) bitAnd: 16rFFFFFFFFFF) = 0]

(1 to: 1048576) select: [:any | (((Double pi / any asDouble) sin mantissa: 53) bitAnd: 16rFFFFFFFFFF) = 0]
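For anyone without a Smalltalk image handy, here is roughly the same check sketched in Python. The helper names are mine, and a modern libm does proper argument reduction, so on most systems this search should come back empty; it only illustrates the "lower 40 mantissa bits are zero" test.

```python
import math
import struct

def mantissa_bits(x: float) -> int:
    """53-bit significand of a normal double, implicit leading bit included."""
    bits = struct.unpack('<Q', struct.pack('<d', x))[0]
    return (bits & ((1 << 52) - 1)) | (1 << 52)

def looks_truncated(x: float) -> bool:
    """True when the lower 40 significand bits are zero, i.e. at most
    13 significant bits of the result survive."""
    return mantissa_bits(x) & 0xFFFFFFFFFF == 0

# The original expressions scan multiples of pi in (-2^20, 2^20);
# a smaller range keeps this sketch quick.
suspects = [k for k in range(-4096, 4097)
            if k != 0 and looks_truncated(math.sin(math.pi * k))]
print(suspects)
```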

I am very surprised, and I guess disappointed too, with this. I debugged the VM down to the FSIN instruction, and as far as I can see those are the results given by the Intel x87 FPU. The instruction FCOS seems to have the same behavior.

This does not look like it should be the case. Am I missing something here?

PS: If anybody runs these expressions in their image, would you mind letting me know the result and CPU type? Thanks!

Update: Indeed, it is a "bug" in x87 FPUs. Here is some proper explanation.

FPU questions

The new VM tests uncovered that double precision answers to trigonometric functions on SPARC cpus are not the same as those that x86 cpus give. A few least significant bits of difference wouldn't really matter that much, so I wrote a bit of code to allow checking for "almost equality" of floating point numbers.
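The general shape of such a check (my own sketch in Python, not the actual test code) compares floats by their distance in units in the last place, or ULPs:

```python
import struct

def ulp_distance(a: float, b: float) -> int:
    """Distance between two finite doubles, counted in units in the last place."""
    def ordered(x: float) -> int:
        # Map the IEEE 754 bit pattern onto a scale where consecutive
        # doubles map to consecutive integers.
        bits = struct.unpack('<q', struct.pack('<d', x))[0]
        return bits if bits >= 0 else -(bits & 0x7FFFFFFFFFFFFFFF)
    return abs(ordered(a) - ordered(b))

def almost_equal(a: float, b: float, max_ulps: int = 4) -> bool:
    """Equality up to a small, explicit number of ULPs."""
    return ulp_distance(a, b) <= max_ulps

# A 2^38 ULP difference, as seen for sin(pi) across FPUs, dwarfs any
# reasonable max_ulps, so such a test would rightly keep failing.
```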

When I went to see the failing tests to determine by how much "almost equality" should be adjusted, however, I found that for things such as double precision sin(pi) the difference is as large as 38 out of 53 bits of the mantissa. That is a huge delta! It feels really gross to fudge the precision of the test by a factor of 2^38.

But maybe there's something else at play here, perhaps I've been missing an obvious bit of information so far. I do not know what it is yet, but in the mean time hacking the tests to pass is definitely not right. Has anybody run into things like this before?

Incidentally, in the test above, the x86 FPU gives an answer closer to zero.

Friday, August 17, 2007

When 50% is much more than half

Check out this poll. Isn't it great to look in the mirror, even if what we see is not pretty? Also, check the graph below.

Update: the flier above came from PSE&G. They still owe me money from the last bill because they made an adjustment to my favor after I paid it. What did they do? Send me to collections, of course. Grrr!!!

Ok guys, prove me wrong

Perhaps this is a sign that I was not correct, and that we are indeed a serious country that takes care of its own people.

I doubt it. But I hope I am wrong this time.

Thursday, August 16, 2007

LASTUG next Monday!

Next Monday, I will be giving the presentation A Pattern of Perception at LASTUG in Pasadena. See you there!

Papers updated

There are new versions of two of the three Smalltalk papers I posted about recently. Go to the original post and get them!

Wednesday, August 15, 2007

The issue with investing

Let's say I invest in stock XYZ, and that because of my perception that in the future XYZ will be more valuable, I invest a sizable amount of money.

Then, let's say I become aware that XYZ is engaged in things I disagree with, but that regardless of how bad they are, the realistic observation is that they will still make money.

Do I pull the plug on my investment? Do I question XYZ's behavior in public?

And it's more than that. What about my car and my house, which are investments as well? And note that I do not have much choice there: I must be able to afford those things. So... do I pull my part of the plug on the things that support my ability to have a roof? Do I question the system that is merciful enough to allow me a roof?

Hold on, it should be my right to have a roof! What is this charity I am supporting with my investing now?

That is the problem with having things. By virtue of the fact that you want stuff, it is way too easy to change who you are to support the status quo. The possibility of becoming a slave of your possessions, and selling your soul to stuff, is too great. And these days, we can easily see what happens when reality catches up with us.

So what do we do when we have a conflict of interest with the way in which we live?

Thanks for the shiny beads, but no thanks.

Tuesday, August 14, 2007

New papers available for download

Update: a new version of Instance Specific Behavior in Visual Smalltalk and Simulating Method Wrappers in Visual Smalltalk are now linked below. Make sure you have the latest versions!

So I got three new papers describing serious efforts done in Smalltalk. While the work was done in Visual Smalltalk, they illustrate very well how to attack development process issues using Smalltalk itself.

First, we have Simulating Method Wrappers in Visual Smalltalk and Instance Specific Behavior in Visual Smalltalk. Note that related topics were discussed at length over the IRC channel not long ago. These papers include several techniques to speed up performance. With these facilities in your system, you have a code coverage tool essentially for free, and you can also do things such as checking for stack overflow via SUnit without actually causing it (!).
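The papers cover Visual Smalltalk, but the flavor of the technique (wrap a method so each call gets recorded, and coverage falls out for free) can be sketched in Python; everything below is my toy stand-in, not the papers' code:

```python
import functools
from collections import defaultdict

def install_coverage_wrapper(cls, selector, hits):
    """Replace cls.selector with a wrapper that tallies each call in `hits`.
    A toy stand-in for Smalltalk method wrappers."""
    original = getattr(cls, selector)

    @functools.wraps(original)
    def wrapped(*args, **kwargs):
        hits[(cls.__name__, selector)] += 1
        return original(*args, **kwargs)

    setattr(cls, selector, wrapped)

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def norm(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5

hits = defaultdict(int)
install_coverage_wrapper(Point, 'norm', hits)
Point(3, 4).norm()
print(dict(hits))  # {('Point', 'norm'): 1}
```

Methods that are never called simply never show up in `hits`, which is exactly the coverage information.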

Finally, we have An Efficient Difference Finder, which talks about how to use a code coverage tool made possible by the previous paper to streamline code merging. It is fascinating to see how the patterns I've seen work so well at other places are present here too. I guess there are only so many ways in which you can be productive.

The papers can be downloaded from ftp://sqrmax.homeip.net. Note there is no anonymous user, so you have to use the user name "smalltalkVideo" with the password "now". Also, be sure to use active FTP. Or, if you have trouble with the FTP server, use the HTTP links above.


Update: the files are available at the FTP server now, user "smalltalk" password "now".

Hash Analysis Tool v2.0 is released!

Finally! The Hash Analysis Tool v2.0 is now released. The changes since the last beta are as follows.

  • Published the needed prerequisite, Integer>>hashMultiply, to the public Store repository. But note that in later versions of VisualWorks, this may become an unnecessary prerequisite.
  • Fixed a small problem with chi^2 calculations.
  • Added all the original experiments that were the basis for the Hash Analysis Tool v1.x as hash functions (see the string sparse sample hash functions).

Sunday, August 12, 2007

Hash Analysis Tool v2.0 beta

The Hash Analysis Tool v2.0 has reached beta status. As a noteworthy addition, it now has word list data sets for 6 different languages built in.


Update: I just published beta1 with an added button so you can inspect only large hash buckets. I do not think there's any feature left that is worth adding at this time. If everything goes fine, I will move from beta to release shortly.

Strange relationship

So I was checking some things in the Hash Analysis Tool, and I noticed the following.

If the hash function appears to be of reasonable quality, then the load factor for size p is roughly equal to the normalized chi squared mod p test result, plus the fraction (number of objects minus number of hash values) over (number of objects).

What in the... I tried to get a proof, but got nowhere. Any ideas?

Saturday, August 11, 2007

Notes on the avalanche test for hash functions

Apparently, the avalanche test for hash functions is very popular. If I understand it correctly, you take single-bit deltas, apply them to a number of randomly chosen inputs via subtraction or bitwise xor, hash the results, and make sure that each bit of the output changes 50% of the time on average.

But it seems to me one can build functions with horrific hash qualities that satisfy the avalanche test. In particular, it is possible to devise a hash function that produces exactly four values and yet passes the avalanche test.
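That degenerate construction can be made concrete. Below is a Python sketch under my reading of the test: `mix32` is a finalizer in the style of MurmurHash3's fmix32, and the degenerate hash copies just two of its well-mixed bits across all 32 output positions, so only four outputs are possible, yet every output bit still flips about half the time under single-bit deltas.

```python
import random

def mix32(x: int) -> int:
    """A 32-bit finalizer in the style of MurmurHash3's fmix32."""
    x &= 0xFFFFFFFF
    x ^= x >> 16
    x = (x * 0x85EBCA6B) & 0xFFFFFFFF
    x ^= x >> 13
    x = (x * 0xC2B2AE35) & 0xFFFFFFFF
    return x ^ (x >> 16)

def four_value_hash(x: int) -> int:
    """Degenerate: only four possible outputs, built from two well-mixed bits."""
    m = mix32(x)
    even = 0x55555555 if m & 1 else 0        # bit 0 copied into every even position
    odd = 0xAAAAAAAA if (m >> 1) & 1 else 0  # bit 1 copied into every odd position
    return even | odd

def avalanche_frequencies(h, trials=2000, bits=32, seed=7):
    """For random inputs and random single-bit deltas, the fraction of
    trials in which each output bit flips."""
    rng = random.Random(seed)
    flips = [0] * bits
    for _ in range(trials):
        x = rng.getrandbits(bits)
        changed = h(x) ^ h(x ^ (1 << rng.randrange(bits)))
        for i in range(bits):
            flips[i] += (changed >> i) & 1
    return [f / trials for f in flips]

freqs = avalanche_frequencies(four_value_hash)
print(len({four_value_hash(x) for x in range(10000)}))  # 4 distinct outputs
print(min(freqs), max(freqs))  # both close to 0.5: the test is "passed"
```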

In addition, several hash functions known to be good fail the avalanche test. Perfect hash functions, for example, are especially prone to failing it.

This just raises questions regarding the usefulness of the avalanche test.

So... has anybody worked on this topic? Did I just miss something here?

Friday, August 10, 2007

Number theory in VisualWorks

A few days ago, I started a large integer crunching calculation that needs to go through a huge number of iterations. With the (rather slow) laptop hibernating most of the time, guess what milestone the image just reached...

3 x 10^16 iterations

I just like Smalltalk (and VisualWorks in particular) so much for this kind of thing...

Bits of history, word of advice

Remember the mess in Argentina back in 2001, 5 presidents in two weeks? I am sure you do. But do you know what led to that?

For many years, the worthless peso had been pegged to the dollar at a 1:1 exchange rate. Free market? Yeah right, it was done by law. But in a country that mostly sells raw materials for others to add value to, and that buys all the rest with a considerable trade deficit as a result, somebody has to pay for all those dollars of imported items.

At first it was the sale of everything owned by the public. Because public companies are owned by, you know, us! So what was sold? Well things like an oil company. Now that is smart. And so with the utilities, the phones, etc.

After all that was sold, the external debt had still more than doubled. Without anything else to sell, imports began to stop and the economy began to sag. Eventually, people wanted their dollars out of the banks.

Surprise! There are not enough dollars for everybody! And why would there be, when the annualized overnight rate for money was anywhere between 40% and 120%?

The consequences for Argentina were nasty, to say the least.

Do you recognize this pattern somewhere?

Hash Analysis Tool v2.0 alpha

The first pre-release version of the Hash Analysis Tool v2.0 is now available at Cincom's public repository. To take a look, load bundle Hash Analysis Tool and then evaluate HashAnalysisToolUI open.


Simple problems, no solutions

Can you believe there is no closed formula for a partial sum of a row of Pascal's triangle?... grrr!!!
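For the record, the sum in question is the partial row sum below. It provably has no closed form as a hypergeometric term in m and n (a standard application of Gosper's algorithm, discussed in Concrete Mathematics), even though the full row sums so neatly:

```latex
S_m(n) = \sum_{k=0}^{m} \binom{n}{k}
\qquad \text{versus} \qquad
\sum_{k=0}^{n} \binom{n}{k} = 2^n .
```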

Thursday, August 09, 2007

Hash, book, rehash

So, a status update is in order. First, the book. The .dvi file blew past 2^20 bytes. The draft is currently 374 pages long. The hash chapter has about 80 exercises so far, and it's not even half written. Talking about that, I found some very beautiful exercises today. It looks like a 500 page book to me...

Second, I suspected as much and now I think I have a proof:

The values produced by any good quality hash function will almost certainly result in collisions when they are taken mod M.

With a bit more work, I will even be able to say how likely it is that there are k collisions modulo M. But to obtain closed formulas for that, I will need to resort to Concrete Mathematics.
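I can't reproduce the book's proof here, but the standard birthday-problem estimate already points in the same direction: even if the hash values are perfectly uniform mod M, collisions become nearly certain once the number of values grows past roughly sqrt(M). A sketch in Python (the formula is the textbook one, not necessarily the book's):

```python
def p_no_collision(n: int, M: int) -> float:
    """Probability that n independent, uniformly distributed values
    mod M are all distinct (the classic birthday-problem product)."""
    p = 1.0
    for i in range(n):
        p *= (M - i) / M
    return p

# Even for a perfectly uniform ("good quality") hash taken mod M = 2^20,
# a collision is almost certain well before M values are hashed.
M = 2 ** 20
for n in (100, 1000, 2000, 5000):
    print(n, 1.0 - p_no_collision(n, M))
```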

Oh, the proof? Check out chapter 7 of the book! :)

Thursday, August 02, 2007

We are so clever we impress ourselves

A few days ago I saw an article on CNN that reminded me of how easy it is to become a psychopath without even noticing it.

Basically, what the article says is that since Chernobyl caused only 75 deaths, directly affected the lives of thousands, and sprinkled a good deal of radioactive pollution all over the place, then those risks are acceptable (?!) and we should save ourselves (?!) by building more nuclear reactors. Besides, the risks of the Three Gorges Dam include the death of 1,000,000 people, and we accept that in the first place.

So in other words, Mr. David Whitford, perhaps you and all your family and friends should live inside a reactor's concrete housing, right next to the control rods. Since you call that an acceptable risk, then I am sure you will not mind. That way, you will see the football-sized cavities in the reactor's pressure vessel or the very serious cracks around control rod nozzles --- if the inspection that finds such problems does not interfere with the pursuit of profit and happens before an accident.

How in heck is it that somebody can write that the death of even one person is acceptable? Don't believe somebody could write such a thing? Here is the link.

And yesterday a bridge went down. I immediately thought those disasters do not occur overnight, and now I cannot hold that thought to myself. It turns out the bridge was known to be deficient.

The article linked above, which quotes the deficiency report from a newspaper, also says:

It's a bit ironic, as Minnesota has a good record for bridge maintenance. The US Department of Transportation rated only 3% of Minnesota's bridges as "structurally deficient", which put us as the eleventh-best state in that effort. Rhode Island, in contrast, has 23% of its bridges rated structurally deficient, and 17% of Michigan's bridges have that rating.

Are you kidding? What do you mean, ironic? The point is that anything over 0% is unacceptable!!! That should be the point, and while it may never be achieved, something ridiculous like 23% or even 3% is (at least in my opinion) tantamount to manslaughter by negligence.

I find such absurd and even idiotic assertions to be extremely disturbing. Do you think such a failure rate would be acceptable for the rubber hoses that deliver brake fluid to your car brakes' calipers?

So it's not that we can't do it. Rather, one way or the other we find an excuse not to, and then rationalize the irrational after something goes wrong. Do we have to remember Katrina now? Oh, sure enough, betting that there would be no strong hurricanes there was paying good odds. And certainly nothing should have happened with Challenger when the O-rings were known to be problematic. Did you know that those O-rings were found to have unmixed rubber in them, and that improper mixture of rubber is one of the most fundamental and elementary problems in rubber manufacturing? And what about Columbia, another bet that there would be no problems? And so on, and so on.

The problem is that those decisions, made with the assistance of how some body part feels, kill people. We just haven't understood that yet. Even when our own computer simulations tell us what we already know, we still look the other way.

Ah but it's too expensive to fix, right? Well, we spend much more money somewhere else already, so we do have the money when we want to. Therefore, regardless of the reason, we just do not want these problems solved.

I hope that, at least once, we do not choose a scapegoat or end up performing some sort of public exorcism or semi-religious sacrifice, like we usually do, so we can avoid our responsibilities. The blame is spread evenly between the ones who made the decisions, and the ones that knowing them to be wrong didn't do anything to question them.

Look in the mirror people. It ain't a pretty picture, but it's not going to change by watching TV.

Update #1: More Than 70,000 Bridges Rated Deficient. And an even better link from Chris Petrilli. So now, only because there was a problem, we get to see the magnitude of the issue. If there had been no bridge breakdown, we would still have our heads in the ground like ostriches pretending there is no lion nearby.

Update #2: From CNN, 25% of the 600,000 major bridges in the US are carrying more traffic than they were designed to bear. Rhode Island's numbers are more like 50% of bridges in trouble. And the list of indictments goes on and on. This did not happen all of a sudden. This is the result of a bet that nothing would happen, just like with Katrina.

Update #3: How old was that steam pipe that blew up in New York not long ago? Close to 85 years, right? And Wikipedia states there have been 12 such explosions in the last 24 years? And nothing happens, right? A-ha...

Update #4: So far, four nuclear plants have cooling problems as a result of a massive earthquake in Japan. One of them is said to have developed problems because of the corresponding tsunami. But, of course, nobody knew that Japan experiences regular, brutal earthquakes and tsunamis, doh... here's the link to a relevant CNN article. I propose David Whitford and family relocate to the Japanese beach, right next to the Fukushima Daiichi plant. For more awful context, check out this list of nuclear accidents in the US (the last one, which contaminated the water table, just last month).