Tuesday, April 16, 2013

When whitespace matters

In the process of grading student programming projects, I discovered that whitespace can matter in how C/C++ code is compiled.  Some programmers may be aware that adjacent string literals are concatenated:

const char* t = "this string" " is one";

The compiler joins the two literals and ignores the whitespace between them.  Furthermore, programmers may define strings via macros and use the macro in place of the literal:

#define A_STRING " is one"

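Now, in the latest version of g++, the new C++11 support can treat the identifier following a string literal as either a macro or a user-defined literal suffix.  The compiler makes the determination based on whether there is whitespace: with no whitespace, the identifier is read as part of the literal, so the macro is never expanded.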

const char* t = "this string"A_STRING; // compiler error
const char* t = "this string" A_STRING; // expected behavior

I like consistent use of whitespace in code for stylistic reasons, but introducing a single dependency on whitespace is odd for C/C++, in contrast to Python, which is highly whitespace dependent.
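To make the two readings concrete, here is a minimal sketch that compiles as C++11 or later.  The macro is the one from this post; the _len suffix and the function names are hypothetical, made up only to show what a user-defined literal looks like when it is intended.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Macro from the post.
#define A_STRING " is one"

// Hypothetical user-defined literal suffix: returns the literal's length.
constexpr std::size_t operator""_len(const char*, std::size_t n) { return n; }

// Whitespace before A_STRING: the preprocessor sees a macro name, expands it,
// and the two adjacent string literals are then concatenated.
std::string joined_with_macro() {
    const char* t = "this string" A_STRING;
    return t;
}

// No whitespace before _len: the suffix is read as part of the literal,
// so this calls operator""_len rather than looking for a macro.
std::size_t suffix_length() {
    return "this string"_len;
}

// Small self-check that exercises both forms.
void check_examples() {
    assert(joined_with_macro() == "this string is one");
    assert(suffix_length() == 11);
}
```

Removing the space before A_STRING in joined_with_macro makes the compiler look for a literal operator named A_STRING, which is the failure described above.  Depending on the compiler and version, that may be reported as an error or as a warning.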

Wednesday, April 3, 2013

Architecture Abstraction Leaks

As a computer architect, I like to think that the hardware abstraction is generally opaque.  If a programmer really needs performance or certain effects, then it is possible to reach through the layer, but in general the computer gives you what you want.  Sometimes, a programmer can do simple things that pull the covers back and reveal the architecture in all of its gritty detail.  One example would be choosing row versus column order when looping over large pieces of data, where the order that matches the layout in memory makes far better use of the caches.
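As a rough sketch of the row versus column example, the two functions below compute the same sum over a matrix stored row by row in a flat vector.  The function names and the flat layout are my own assumptions for illustration; only the loop order differs between them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum a rows x cols matrix stored row by row in one flat vector.
// The inner loop walks adjacent elements, so each cache line is fully used.
long long sum_row_order(const std::vector<int>& m, std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    long long total = 0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            total += m[r * cols + c];
    return total;
}

// Same sum, but the inner loop strides by cols elements on every step,
// so on a large matrix successive accesses tend to land on different cache lines.
long long sum_column_order(const std::vector<int>& m, std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    long long total = 0;
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            total += m[r * cols + c];
    return total;
}
```

Both functions return the same value, and the source code gives no hint that one should be slower.  Timing them on a matrix much larger than the cache is what usually exposes the difference, though the size of the gap depends on the machine and the compiler.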

In this post, I wanted to draw your attention to a question posed on Stack Overflow, "Why is processing a sorted array faster than an unsorted array?"  In this question, we are ignoring the time to sort the array.  The operation being applied to each array element is relatively simple and is conditional on the value of the element.  By sorting the array, the same operation is applied to long runs of contiguous elements.  And because the condition then goes the same way many times in a row, the branch becomes easy to predict, and the code in question gets a clear win from branch prediction.

But the branch predictor is abstracted hardware, so once again the programmer just learned about the architecture without intending to.
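The shape of the loop in the sorted-array question can be sketched as follows.  This is not the code from the question itself; the function names and the threshold parameter are hypothetical, chosen only to show a branch whose outcome depends on the data.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Add up only the elements that are at least `threshold`.
// The `if` is a data-dependent branch: on unsorted input its outcome
// looks random, while on sorted input it is false for a long run and then true.
long long sum_at_least(const std::vector<int>& data, int threshold) {
    long long total = 0;
    for (int v : data) {
        if (v >= threshold)
            total += v;
    }
    return total;
}

// Return a sorted copy so the same loop can be run on both orderings.
std::vector<int> sorted_copy(std::vector<int> data) {
    std::sort(data.begin(), data.end());
    assert(std::is_sorted(data.begin(), data.end()));  // postcondition check
    return data;
}
```

Calling sum_at_least on a large random vector and then on its sorted_copy gives the same total, so any difference is only in running time.  An optimizing compiler may replace the branch with a conditional move or vectorize the loop, in which case the gap can shrink or vanish.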