Wednesday, April 3, 2013

Architecture Abstraction Leaks

As a computer architect, I like to think that the hardware abstraction is generally opaque.  If a programmer really needs performance or certain effects, then it is possible to reach through the layer, but in general the computer gives you what you want.  Sometimes, a programmer can do simple things that pull the covers back and reveal the architecture in all of its gritty detail.  One example would be following row versus column indexing when looping over large pieces of data.

In this post, I wanted to draw your attention to a question posed on stackoverflow, Why is processing a sorted array faster than an unsorted array?  In this question, we are ignoring the time to sort the array.  The operation being applied to each array element is relatively simple and is conditional on the value of the element.  By sorting the array, the same operations could be applied to contiguous elements.  And as the same operations are being applied, the code in question was having a clear win from branch prediction.

But the branch predictor is abstracted hardware, so once again the programmer just learned about the architecture when he / she didn't intend to.

No comments: