Monday, June 20, 2011

Book Review: Write Great Code Volume 1 (part 4 of 4)

As I stated in part 3, I found some erroneous information about NUMA in chapter 11.  Although the book was published in 2004, I cannot excuse the mistakes, given that the research dates to the 90s (cf. The SGI Origin: A ccNUMA Highly Scalable Server, ISCA '97).  Fortunately, the errors are small (only a couple of paragraphs).  However, the topic is valuable to know, and I have spent a fair amount of my time trying to educate programmers on good practices with NUMA (cf. Topology information under Windows), so I don't appreciate books providing bad information.

NUMA, or non-uniform memory access (or architecture), describes a system in which main memory is split into multiple sets, each attached to a particular processor or group of processors.  For programmers, this usually means that access times vary, since some memory is "closer" to the executing processor than other memory.  Fortunately, modern operating systems (Windows and Linux) will by default give applications "close" memory when they allocate it, which works well most of the time.
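
To make "closer" concrete, here is a minimal sketch (mine, not from the book) that reads the node distance table the Linux kernel exposes under /sys/devices/system/node.  The function name read_node_distances is my own, and the sketch assumes a Linux kernel built with NUMA support; anywhere else it simply returns an empty result.  Larger numbers mean memory that is further from that node, and by convention a node's distance to itself is 10.

```python
import os
import re


def read_node_distances(root="/sys/devices/system/node"):
    """Return {node_id: [distance to node 0, node 1, ...]}.

    Assumes the Linux sysfs layout; returns {} when it is not present.
    """
    distances = {}
    if not os.path.isdir(root):
        return distances
    for entry in os.listdir(root):
        # Only directories named node0, node1, ... describe NUMA nodes.
        match = re.fullmatch(r"node(\d+)", entry)
        if not match:
            continue
        path = os.path.join(root, entry, "distance")
        try:
            with open(path) as f:
                row = [int(value) for value in f.read().split()]
        except OSError:
            continue  # Node directory without a readable distance file.
        distances[int(match.group(1))] = row
    return distances


if __name__ == "__main__":
    for node, row in sorted(read_node_distances().items()):
        print(f"node {node}: {row}")
```

On a single-socket machine this typically shows one node with a single entry; on a multi-socket server each row has one entry per node, and the off-diagonal entries are the "far" memory.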

In Write Great Code, the author is apparently unaware that main memory itself can have varying access times and therefore writes, "the term NUMA is used to describe blocks of memory that are electronically similar to main memory but, for one reason or another, operate significantly slower than main memory."  He then gives graphics card memory and flash devices as examples.

The last chapter, 12, discusses I/O.  Of all the chapters, this one is the most tied to "current" hardware and is therefore the most dated.  Here are three topics that I would revise if there were a second edition: AGP versus PCI-E, polling versus interrupts, and synchronous versus asynchronous I/O.  First, Accelerated Graphics Port (AGP) was a late-90s / early-00s connection that has been phased out in favor of the more general PCI Express (PCI-E) interface; however, PCI-E hardware did not reach the market until 2004, so it missed being included.  Second, at the highest performance levels, polling a device can be faster than waiting on interrupts.  For example, a busy 10 Gb/s connection nearly always has more data available, so a poll nearly always returns successfully; interrupts are still better at lower transfer rates (although I don't know the crossover point).  Finally, using asynchronous I/O is at the top of my list of things to know about devices.  In all, this chapter runs to 75 pages and is effectively a rehash of what can now be found on Wikipedia.
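
Since asynchronous I/O tops my list, here is a small sketch of the idea (again mine, not the book's).  It uses Python's asyncio, with asyncio.sleep standing in for a slow device, so the names fake_device_read, read_all, and timed_run and the 0.3 second delays are all made up for illustration.  The point is only that two outstanding requests overlap instead of being served one after the other.

```python
import asyncio
import time


async def fake_device_read(name, delay):
    # asyncio.sleep is a stand-in for waiting on a real device.
    await asyncio.sleep(delay)
    return f"{name}: done"


async def read_all():
    # Issue both requests up front, then wait for both to complete.
    return await asyncio.gather(
        fake_device_read("disk", 0.3),
        fake_device_read("network", 0.3),
    )


def timed_run():
    # Returns the results plus the wall-clock time the whole batch took.
    start = time.perf_counter()
    results = asyncio.run(read_all())
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = timed_run()
    print(results, f"{elapsed:.2f}s")
```

Issued synchronously, the two reads would take at least 0.6 seconds in total; issued asynchronously, the batch should finish in roughly the time of the slower one.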

Overall, most of this work was subsumed by my undergraduate Computer Science coursework (particularly the intro systems course), so it is hard for me to know who the appropriate audience would be.  I therefore have a neutral take on it.  This isn't the last review, though, as I have also checked volume 2 out from the library.
