Thursday, May 19, 2011

Book Review: Write Great Code Volume 1 (part 2 of 4)

+/- 1.F x 2^(Exp)

That's the modern floating point representation (with a few caveats).  It is stored bitwise as Sign|Exp|F.  Because the representation uses a variable exponent, the point floats to different positions; this lets the computer store a much greater range of numbers than integer representations can, at the cost of a small loss of precision.  I first learned of "significant digits" in chemistry, and floating point limits the programmer to roughly 5 - 10 significant decimal digits (which requires roughly 15 - 30 binary digits).  For this post and subsequent posts in this series, I'll be writing a combination of what I knew prior to reading and information from the book's text.
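To make the Sign|Exp|F layout concrete, here is a small C sketch that pulls the three fields out of a 32-bit float.  The field widths (1 sign bit, 8 exponent bits, 23 fraction bits) and the bias of 127 are the standard single-precision values rather than anything quoted from the book, so treat this as an illustration of the format.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    float f = -6.5f;                /* -1.101 binary x 2^2 */
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits); /* reinterpret the float's bit pattern */

    uint32_t sign     = bits >> 31;          /* 1 bit */
    uint32_t exponent = (bits >> 23) & 0xFF; /* 8 bits, biased by 127 */
    uint32_t fraction = bits & 0x7FFFFF;     /* 23 bits: the F in 1.F */

    printf("sign=%u exp=%d fraction=0x%06X\n",
           sign, (int)exponent - 127, fraction); /* sign=1 exp=2 fraction=0x500000 */
    return 0;
}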

There are four official formats for floating point numbers, using 16, 32 (a C float), 64 (a C double), and 128 bits respectively.  The larger formats primarily devote their extra bits to the fraction, which provides increased precision.  Intel also has an 80-bit representation with 64 bits for the fraction, which lets existing integer arithmetic hardware operate on the fractions when the exponents are the same.  Furthermore, ordering the fields as sign, exponent, fraction also enables integer comparisons between floats.
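As a quick sketch of that last point (assuming IEEE 754 single precision and non-negative values; negative floats would need a sign-magnitude fix-up that I'm skipping here), comparing the raw bit patterns as integers gives the same ordering as comparing the floats themselves:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Because the fields are laid out most-significant-first as sign,
   exponent, fraction, a larger bit pattern means a larger value
   (for non-negative floats). */
static int LessThan(float a, float b)
{
    uint32_t ia, ib;
    memcpy(&ia, &a, sizeof ia);
    memcpy(&ib, &b, sizeof ib);
    return ia < ib;
}

int main(void)
{
    printf("%d\n", LessThan(1.5f, 2.25f)); /* prints 1 */
    printf("%d\n", LessThan(8.0f, 0.5f));  /* prints 0 */
    return 0;
}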

Given the precision limitations (even with 128-bit floats), there are some gotchas with arithmetic.  First, equality is hard to define.  Equality tests need to respect the level of error present in the floating point values.  For example, two values can be considered equal if they are within this error, as follows.

#include <math.h>      /* fabsf */
#include <stdbool.h>

bool Equal(float a, float b, float error)
{
    bool ret = (a == b);          // Don't do this: exact comparison ignores rounding error

    ret = (fabsf(a - b) < error); // Test like this: equal if within the expected error

    return ret;
}

Another gotcha is preserving precision.  With addition and subtraction, the operands first need to be converted to have the same exponent, which can result in loss of precision.  Therefore, the text recommends performing multiplications and divisions first.  This seems reasonable, and it wasn't something I had heard before.
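To see the exponent-alignment problem concretely, here is a small example of my own (the values are chosen so that the smaller operand falls entirely outside single precision's 24-bit significand):

#include <stdio.h>

int main(void)
{
    float big   = 1.0e8f; /* needs an exponent near 2^26 */
    float small = 1.0f;

    /* To add, `small` is shifted to match big's exponent; the shift
       pushes it past the 24-bit significand and it vanishes. */
    float sum = big + small;

    printf("%.1f\n", sum - big); /* prints 0.0, not 1.0 */
    return 0;
}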

Friday, May 13, 2011

Book Review: Write Great Code Volume 1 (part 1 of 4)

What is great code?  What characteristics would you use to describe it?  These questions come to mind as I am reading the first of a four-volume series on writing great code.  The first work is subtitled "Understanding the Machine."  Before I delve into what I've "learned" by reading this volume, what should one hope to gain?  What understanding of modern machines is required for writing code, especially great code?

First, I'd emphasize that great code is more about maintainability and understanding than efficiency.  Correctness over performance.  Therefore, the text is really about how understanding the machine can clean up the code, simplify designs, and ensure correctness.

Second, in the context of the previous post, most programs do not need to "understand" the machine.  Most applications will not be constrained by execution time, so programmer effort should be directed elsewhere.  Yet programmers reach their own insights into how the application interacts with the computer and modify their code accordingly, and usually (myself included) these insights are irrelevant to the application.

Third, there are specific aspects of modern computer architecture / machine design that are still worth knowing.  For example, I find that most programmers have a limited understanding of branch prediction, cache usage, NUMA hierarchies, and superscalar processors.  (Based on my reading so far, I would also add floating point to this list.)

What else should a programmer know?  What should I be looking for while I read this book?