Showing posts with label book. Show all posts

Monday, August 24, 2020

The Martian, Computer Science, and College

This summer, the school asked professors if they would be interested in leading book discussions with incoming first-year students in Computer Science.  I volunteered, along with many other professors, and each of us selected the title we would discuss.  I proposed reading The Martian by Andy Weir.  What follows is not a review of the book, which I really enjoyed, but rather a summary of the discussion points from the hour we had together.

The following text contains many book spoilers.

We started the discussion with a short summary of my background, and then a student asked about the Martian rover hacking.  It is, in my opinion, plausible.  It depends on several assumptions, such as the rover's driver being so easily modified to log the malformed network data (sent by the probe).  It would then be reasonable to send commands to the probe to broadcast the data needed to construct an executable on the rover.  Then, assuming that Mark can run it with sufficient privileges or that there is a known vulnerability allowing the executable to gain the privilege, the probe's data could be a patch.  Personally, I enjoyed the thought of using ASCII to communicate, and my TAs and I agree: "man ascii".  Besides, I carry an ASCII chart in my wallet.

We discussed how there was significant cooperation in solving the problems.  The crew worked together.  NASA had many teams working on the problems.  Internationally, China also provided assistance.  The people working on these problems were diverse.  And there were continual concerns about the crew's mental state and about Mark's.  Similarly, Computer Science students will need to learn to work with others, to work in groups with diverse backgrounds and skill sets, and to know that there are many, many more people who want to see them have successful lives and who are willing to support them and to take steps whenever circumstances dictate.

Mark Watney survives in part by having a diverse education and training.  Being a fictional character, he has the right skills to survive, but this is based in reality.  Astronauts are trained in a diverse set of skills, particularly to maximize the value gained from their time in space.  They are not experts, but rather are trained well to carry out the guidance of experts on Earth.  Similarly, I reinforced to the students that their studies should give them a broad foundation beyond Computer Science.

The final topic brought up by the students was ethics.  First, should NASA have told the crew of the Hermes that Mark Watney was alive on Mars as soon as it was determined?  Or should NASA instead censor all communication so that the crew would not learn that they had abandoned Mark?  What is the trade-off between the truth and mission results?  Second, the Chinese scientists had to make a decision: is the life of one astronaut worth their probe?  Should they give up their long-prepared mission of great scientific value to instead make a "grocery delivery"?  How much is one life worth?  Third, when the Rich Purnell plan presented an alternative for rescuing Mark, was NASA obligated to consult the crew in evaluating this option?  Relatedly, the crew of the Hermes decided to return to Mars to save Mark Watney, choosing the plan with a low chance of killing everyone and extending their mission duration.  We also briefly discussed that governments have to decide how much a life is worth.  It is noted that the science that Mark can perform makes up for the cost of his rescue, which addresses this concern in the story.

I think that the 20 or so students appreciated the hour we had together.  I hope to some day be able to meet and ultimately teach them in person.

Monday, August 27, 2018

Book Review: Valley of Genius

Valley of Genius is a history of Silicon Valley based on the people who made it.  Since the work was based on interviews, I expected that it would read as actual interviews, where the dialog exists between the author and the "genius".  Instead, the author was removed from the chapters and the entire text consisted of the different participants being quoted.  This writing style took some time to accept.  Initially, I wanted to know exactly who each person was and their role in the narration, which given the numbers involved would significantly detract from the story being told.  Then I stopped bothering with the names, only looking for them when two (or more) speakers refer to each other.  And this aspect is the best of the book, to have the different individuals having a debate in the dialog; otherwise each chapter is just narration.  That said, the concern is that context is lost from the quotes, as the author has explicitly stated the interviews had been spliced together.

All in all I enjoyed the book, but that only merits 3.5/5 stars.

(A free copy of the book was provided through Goodreads.)

Tuesday, July 31, 2018

Book Review: The Art of Application Performance Testing

The Art of Application Performance Testing covers what it says.  The book starts with concepts general to any performance testing, which was interesting to me.  Most of the text, though, focuses on the Application part of the title.  The applications here are primarily web-based, or other client-server based setups, and not just the generic "application" referring to any program.  That said, I do not work on such applications, so the remainder of the text was of less value to me.

In testing applications, a performance analyst needs to establish a representative workload, which includes the actions to perform and the combined load.  For example, most users logging in to their bank will view their account balance, while others might transfer money or pay a bill.  Combined, these actions might represent most of the work from users.  Then, for each unit of server capacity, how many users should be able to perform a mix of those actions?  That number forms the load.
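The action mix in the banking example can be written down as a small weighted table.  The following is only a sketch: the action names and the percentages are made up for illustration and do not come from the book, and `pick_action` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical action mix; the percentages are invented for illustration. */
struct action {
    const char *name;
    int weight;          /* share of user requests, in percent */
};

static const struct action mix[] = {
    { "view_balance",   70 },
    { "transfer_money", 20 },
    { "pay_bill",       10 },
};

/* Map a roll in [0, 100) to an action name according to the weights. */
const char *pick_action(int roll)
{
    size_t i;

    assert(roll >= 0 && roll < 100);
    for (i = 0; i < sizeof mix / sizeof mix[0]; i++) {
        if (roll < mix[i].weight)
            return mix[i].name;
        roll -= mix[i].weight;
    }
    return NULL;  /* unreachable while the weights sum to 100 */
}
```

A load generator could call pick_action with a random roll for each simulated user request, so that over many requests the traffic approaches the intended mix.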

After establishing the workload, the analyst needs to implement the described workload, which requires a tool that generates the load (either by driving the application itself or replaying a synthetic trace of the load).  For those tools, what additional hardware is required to deploy this load?  Does the deployment take into account geographic and other user variations (so that the load generation is representative of the user base)?  Finally, what tooling and methodology exists for profiling and recording the execution of the workload for present and future analysis?

So I appreciated the content of the book and would recommend it to individuals focusing on testing of user-facing applications.

Monday, July 10, 2017

Wrote my own Calling Convention

I have been listening to the Dungeon Hacks audiobook, and it has reminded me both of my past joys of playing Angband and of some interesting little "hacks" I did in high school.

In high school, I wrote many programs on my TI-83, often to help me in other classes, which led to an interesting conversation:
Student: "Teacher, Brian has written programs on his calculator for this class."
Teacher: calls me forward "Brian, is this true?"
Me: "Yes."
Teacher: "Did you write them yourself?"
Me: "Yes."
Teacher: "I do not see what the problem is."

And besides writing useful programs, I also wrote games.  All in the TI-83's BASIC.  However, the TI-83 only had 24kB of space for programs and data.  And eventually I started exceeding this limit.  So I started finding interesting hacks that would reduce the space of programs, such as the trailing " of a string is not required if that string ends the line.  Each would save 1 byte, but it started to add up.

Now, the TI-BASIC on the 83 was very primitive.  Particularly it had no GOSUB instruction, only GOTO.  So you cannot write functions / subroutines and reuse code, which would both improve the design and reduce the space required.  Now I already knew the basics (no pun intended) of C, so I knew that code should be able to call other functions and get values back.  But TI-BASIC would let you call another program and then return to the calling program after the called one finished.  That's like a function call, right?

Therefore, I wrote a library program.  Variables were global, so the calling program could set specific parameters in the global variables and then call the library program.  The library would use one parameter to determine which functionality to execute, update the necessary globals, and then exit, thus returning.  Consequently, I had a library of common functions, which sped up my development and reduced the space I needed.
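The scheme can be sketched in C, with globals standing in for the calculator's shared variables.  This is an analogy rather than the original TI-BASIC, and every name in it (g_func, lib, call_lib, and the two routines) is hypothetical.

```c
#include <assert.h>

/* All names here are hypothetical; the original was TI-BASIC, not C. */

/* Globals stand in for the calculator's shared variables. */
int g_func;    /* which library routine to run */
int g_arg_a;   /* first parameter */
int g_arg_b;   /* second parameter */
int g_result;  /* value left behind for the caller */

enum { LIB_MAX = 1, LIB_ABS_DIFF = 2 };

/* The "library program": takes no arguments and returns nothing.
 * It reads its inputs from globals and writes its output to a global. */
void lib(void)
{
    switch (g_func) {
    case LIB_MAX:
        g_result = (g_arg_a > g_arg_b) ? g_arg_a : g_arg_b;
        break;
    case LIB_ABS_DIFF:
        g_result = (g_arg_a > g_arg_b) ? g_arg_a - g_arg_b
                                       : g_arg_b - g_arg_a;
        break;
    default:
        assert(0 && "unknown library selector");
    }
}

/* Caller side: set the parameters, "call the program", read the result. */
int call_lib(int func, int a, int b)
{
    g_func = func;
    g_arg_a = a;
    g_arg_b = b;
    lib();
    return g_result;
}
```

For instance, call_lib(LIB_MAX, 3, 7) sets the three globals, runs lib, and hands back whatever lib left in g_result.  The agreement on which global means what is the whole convention.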

It was only while listening to the audiobook last week that I realized I had developed a simple calling convention back in high school.  And now I teach, among other things, the x86-64 Linux ABI (i.e., its calling convention) to college students.

A calling convention dictates how each register is used when calling a function: which registers hold the arguments, which holds the return value, and which must be preserved across the call.  It also dictates the management of the stack.
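As a concrete illustration on x86-64 Linux (the System V AMD64 ABI), here is a minimal sketch.  The function weighted_sum is hypothetical, and the comments describe where the convention places things when the function is actually called; a compiler remains free to inline the call away.

```c
/* weighted_sum is a hypothetical example function, not from any library.
 * Under the System V AMD64 ABI used by x86-64 Linux:
 *   - a arrives in rdi, b in rsi, c in rdx (integer arguments 1-3;
 *     arguments 4-6 would use rcx, r8, r9, and further ones the stack)
 *   - the result is returned in rax
 *   - rbx, rbp, and r12-r15 must be preserved by the callee
 *   - rsp is 16-byte aligned at the point of the call instruction
 */
long weighted_sum(long a, long b, long c)
{
    return a + 2 * b + 3 * c;
}
```

Compiling this to assembly and reading the output is a quick way to see the register assignments for a particular compiler.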

Monday, May 1, 2017

Book Review: Multicore and GPU Programming: An Integrated Approach

I wanted to like this book, Multicore and GPU Programming: An Integrated Approach, and be able to use it in the classroom.  I teach a class where we cover OpenMP, MPI, Cilk, and CUDA.  Yet, I would not use this book.  I have told the publishers this as well, and they acknowledged my reasoning.

First, the author elects to use QtThreads for the base parallel programming approach.  He notes that he had considered using pthreads or C++11 thread support, and comments that he rejected C++11 threads for the book as the support was incomplete at the time.  Pthreads is left without comment.  That may be, but I have heard of and used those options, while QtThreads is something that I have never considered.

Second, the book serves as an admirable API reference.  For one or a set of function calls, significant space is dedicated to how that call works and to examples illustrating it.  Structured Parallel Programming also covers many APIs, yet it maintains a feel of being about parallel programming in general rather than the calls specifically.  However, that work covers different API sets so the two are not directly comparable.

Third, and this issue is really more for the editors rather than the author, the typesetting on each page is poor.  There is significant white space left bordering the text on each page.  Furthermore, the code wraps, and often not because the code is long, but because of long comments and comments on the same line as the code.  I understand that in programming these are stylistic choices; however, the impact of finding line wraps from long comments leaves the text looking unprofessional.  I must assume that the author wrote the code separately and then provided it for inclusion in the book, but the editor failed to make allowances for typesetting.

In conclusion, I wanted to like and use the book.  Whenever I speak with the publisher, they always direct me to it, so I just have to hope for something else to come along.  Use it as a reference perhaps, but I am cautious in my recommendation.

(This book was provided free by the publisher to review for possible use in the classroom.)

Monday, March 13, 2017

Book Review: Optimized C++: Proven Techniques for Heightened Performance

I have spent significant time on performance issues and have been in search of a book that can summarize the diversity of issues and techniques well.  I hoped that Optimized C++: Proven Techniques for Heightened Performance would provide some of the guidance I want.  This book is not quite it.  There is good material here, yet I found myself repeatedly thinking that the author was not aware of the past 10(?) years of changes to the field.  That would not be an issue if the book were from the early 2000s, but it was published last year.

A key step in improving the performance of programs is measuring it.  There are a variety of techniques for doing so, split between tools based on instrumentation and tools based on sampling.  I find greater value in the sampling profilers due to their lower overhead and their ability to pinpoint where in a function the cost lies.  Yet the book's focus is limited to gprof-esque approaches.  I tell students that the gprof-style approach is best with deep call trees, which may be a greater issue for C++ programming specifically.

The author is somewhat dismissive of compiler optimizations and emphasizes that his observed benefit has been largely limited to function inlining.  There are many more optimizations, and you should care about them.  But again, I wonder if his experience of C++ has been deep call trees that could particularly benefit from inlining.

In a take-it-or-leave-it recommendation, this work also discourages the use of dynamic libraries.  Yes, they impose a performance penalty, but they also provide valuable functionality.  Whether you should statically or dynamically link your code depends on your use case.  Code that is reused by separate executables should be in a dynamic library, as it reduces the memory requirements when running and reduces the effort to patch and update those executables.  Components that are only used by a single executable should be statically linked, unless the components are of significant size such that decoupling can still benefit memory usage and the updating process.

The author relates that replacing printf with puts to just print a string has performance advantages, due to printf being a complicated "God function".  The basic point is valid that printf has significant functionality; however, the anecdote should be taken with a grain of salt.  Current compilers will do this optimization (replace printf with puts) automatically.
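The case in question looks like the sketch below.  The function names are hypothetical, and whether the rewrite happens depends on the compiler and its optimization settings, so treat this as something to check rather than a guarantee.

```c
#include <stdio.h>

/* The format string is a constant with no conversion specifiers, it ends
 * in a newline, and the return value is ignored.  This is the case that an
 * optimizing compiler may rewrite into a call to puts. */
void say_with_printf(void)
{
    printf("hello, world\n");
}

/* The hand-written equivalent.  puts appends the newline itself, so it is
 * left out of the string. */
void say_with_puts(void)
{
    puts("hello, world");
}
```

One way to check on a given toolchain is to compile to assembly (for example with the compiler's -S option) at an optimizing level and see which function say_with_printf actually calls.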

While most of the work provides small examples, the final chapters on concurrency (?) and memory management do not.  The concurrency chapter reads as a reference book, as it lists the various APIs available and what each does.  It would be better for the book to assume that the readers are familiar with these calls (as the author does with many other topics) and discuss possible optimizations within this scope.

To conclude, the book is not bad, but I also cannot say it is accurate on every point.  Especially with performance, programmers are apt to make prompt design decisions based on "their experience" or "recent publications".  Measure your code's performance.  Only then can you discern which techniques will provide value.

Thursday, December 29, 2016

Book Review: The Art of Readable Code

I really enjoyed reading The Art of Readable Code.  I always enjoy reading books on style, both English writing as well as source code.  The earlier chapters were a greater highlight to me, as they focused on the basics of style and particularly were demonstrated through languages with which I am familiar.  Some of the later chapters instead were examples particular to languages that I do not use, such as illustrating style pitfalls with JavaScript.  The book also had value in showing several approaches and continuing to refactor the code to better meet style and readability.

While the topic is style, this is really more about fundamentally good practices, which may be implemented in one of several ways (e.g., where to put braces, camel case, etc) that is termed style.  Between this and the examples within the text, I want to start requiring it of students.  I want them to read and see why style actually matters.  Or maybe we will just have to wait until they experience why it matters and suffer for it.

Monday, April 13, 2015

Book Review: Structured Parallel Programming

At SIGCSE this year, I spoke with several publishers about possible textbooks, primarily ones that would work in the different classes that I might teach, as well as those that relate to my present research.  In this case, I received a copy of Structured Parallel Programming: Patterns for Efficient Computation from the publisher to review and consider for possible future use in a course.

This work is in three parts: the basics of parallelism and performance, the common patterns in which parallelism is expressed, and example implementations of several algorithms.  The second part is the core of the work, showing maps, reduce, scatter and gather, stencil, fork-join, and pipeline.  But before we learn those details, we come to key quotes for all that I do:
"You cannot neglect the performance of serial code, hoping to make up the difference with parallelism."
And:
"[The] performance of scalar processing is important; if it is slow it can end up dominating performance."
Therefore, parallelism is a method for improving the performance of already efficient code.

With both the common patterns and the example implementations, the authors generally provide the source code for each pattern and implementation using Cilk, TBB, and OpenMP.  This source is not for casual readers.  More involved implementations can stretch for several pages, as the initial implementation and then subsequent refinements are explored.  While it serves well as a reference, it may have worked better to focus on one parallelism approach for each section and therefore give further explanation to the code, especially the language features used.  That would keep the focus on the pattern itself rather than turning the text into a practitioners' reference.

The example implementations (the third part) are perhaps the least interesting for the classroom and potentially the most interesting for practitioners.  Clearly, if I were trying to write code similar to one of these problems, I would have an excellent reference and starting point.  However, that is quite rarely the case for me and, I suspect, for most people as well.

If I were to teach a parallel programming course, I might consider using this work (although I still have other, similar textbooks to review); however, were I to do so I would confine my teaching to the first two parts and maybe even to just one parallel programming paradigm.  Yes, I will admit that the last parallel programming course I took covered a diversity of paradigms (Cilk, vectorization, GPUs, OpenMP, MPI), yet I would have preferred to focus more on what one or two paradigms are capable of rather than just a taste of many.  Parallel programming takes a lot of work to learn and this book is one piece in that effort.

Tuesday, March 10, 2015

Book Review: Geek Sublime: The Beauty of Code, the Code of Beauty

A book of such promise is Geek Sublime: The Beauty of Code, the Code of Beauty, yet it fails.  The first third of this work followed the hopeful theme, intertwining the story of programming, its style, and the author's complex relationship with his Indian heritage, desire to be an artist (writer, etc), and his ability to make money programming.  An interesting twining that continued to encourage me to read, yet some worrisome signs developed.

The first worry was skipping a long section on how computers work.  This *is* my field of Computer Science, so I wasn't interested in reading the basics.  Yet worse was noticing some inaccuracies.  They established a certain level of understanding in Computer Science that was concerning, particularly in light of the author's non-technical background.  Livable, er readable, sure.

It was then the author's intent to show the great intellectual contributions made by Indians.  I have no dispute with this point, except that it wasn't actually fitting with the established theme of the work.  Finding that Sanskrit has a codified language of great antiquity enabled the author in this quest.  Alas, it was long pages that grew increasingly divorced from the first part of the title, "The Beauty of Code", and focused further on the latter, "The Code of Beauty", with this code deriving from Indian literary traditions.

In the end, the book concluded.  A crescendo of final claims that each read like the next page would be the last, until such time as there was really no text left.  I learned something about Indian culture by reading this book, except that was not why I read it.  I did not gain what I sought, so I cannot recommend reading it, nor do I care to provide a link to it.

Monday, July 14, 2014

Book Review: The Practice of Programming

(contains Amazon affiliate link)
I recently found The Practice of Programming (Addison-Wesley Professional Computing Series) sitting on the shelf at my local library.  I am generally skeptical when it comes to programming books, and particularly those from different decades, but I trusted the name "Brian Kernighan" so I checked the book out.

And I am so glad that I did.  From the first chapter that discussed style, I wanted to read more.  And the only reason to ever stop reading was to pull out a computer and put these things into practice.  I didn't even mind that it wasn't until chapter 7 that performance was discussed.  Still, I will readily acknowledge that I disagree with some of the statements in the book.  Furthermore, there are some parts of the text that are clearly dated, like discussing current C / C++ standards.

I'd like to conclude with a brief code snippet from the work.  This code is part of a serializer / deserializer.  Such routines are always a pain to write, particularly if you have many different classes / structs that need them.  Thus the authors suggest using varargs and writing a single routine that can handle this for you.  Here is the unpack (i.e., deserialize) routine:

/* unpack: unpack packed items from buf, return length */
int unpack(uchar *buf, char *fmt, ...)
{
    va_list args;
    char *p;
    uchar *bp, *pc;
    ushort *ps;
    ulong *pl;

    bp = buf;
    va_start(args, fmt);
    for (p = fmt; *p != '\0'; p++) {
        switch (*p) {
        case 'c': /* char */
            pc = va_arg(args, uchar*);
            *pc = *bp++;
            break;
        case 's': /* short */
            ps = va_arg(args, ushort*);
            *ps = *bp++ << 8;
            *ps |= *bp++;
            break;
        case 'l': /* long */
            pl = va_arg(args, ulong*);
            *pl = *bp++ << 24;
            *pl |= *bp++ << 16;
            *pl |= *bp++ << 8;
            *pl |= *bp++;
            break;
        default: /* illegal type character */
            va_end(args);
            return -1;
        }
    }
    va_end(args);
    return bp - buf;
}

So now we have a little language for describing the format of the data in the buffer.  We invoke unpack with a string like "cscl" and pointers to store the char, short, char and long.  Hah!  That's it.  Anytime we add new types, we just need to call pack / unpack.
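For the other direction, a pack routine can walk the same format string.  What follows is my own minimal sketch and not the book's listing: pack_sketch is a hypothetical name, and the types are spelled out in full rather than using the uchar / ushort / ulong shorthand.  It writes big-endian bytes, matching the order in which unpack reads them.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* pack_sketch: hypothetical counterpart to unpack.  Writes the items named
 * by fmt into buf in big-endian order and returns the number of bytes
 * written, or -1 on an unknown format character. */
int pack_sketch(unsigned char *buf, const char *fmt, ...)
{
    va_list args;
    const char *p;
    unsigned char *bp = buf;
    unsigned int s;
    unsigned long l;

    assert(buf != NULL && fmt != NULL);
    va_start(args, fmt);
    for (p = fmt; *p != '\0'; p++) {
        switch (*p) {
        case 'c': /* char, promoted to int by the variadic call */
            *bp++ = (unsigned char)va_arg(args, int);
            break;
        case 's': /* short, also promoted to int */
            s = (unsigned int)va_arg(args, int);
            *bp++ = (unsigned char)((s >> 8) & 0xff);
            *bp++ = (unsigned char)(s & 0xff);
            break;
        case 'l': /* long: the caller must pass an unsigned long */
            l = va_arg(args, unsigned long);
            *bp++ = (unsigned char)((l >> 24) & 0xff);
            *bp++ = (unsigned char)((l >> 16) & 0xff);
            *bp++ = (unsigned char)((l >> 8) & 0xff);
            *bp++ = (unsigned char)(l & 0xff);
            break;
        default: /* illegal type character */
            va_end(args);
            return -1;
        }
    }
    va_end(args);
    return (int)(bp - buf);
}
```

A call such as pack_sketch(buf, "cscl", 0x01, 0x0203, 0x04, 0x05060708UL) fills eight bytes of buf.  Note the UL suffix: because the arguments are variadic, the compiler cannot convert them for you, so the 'l' item has to be passed as an unsigned long.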

Does it matter that the variables are only sequences like "pl" or "bp"?  No.  Variable names should be meaningful and consistent.  "i" can be fine for a loop iterator.

We have given up some performance (*gasp*), but gained in the other parts that matter like readability and maintainability.  I plan on using this in my current research (but only the unpack, as my serializers are already highly optimized).  All in all, I approve of this book and may even someday require it as a textbook for students.

Tuesday, February 4, 2014

Book Review: ARM Assembly Language: Fundamentals and Techniques

Besides reading an average of 5 research papers every week, I also read an average of one book each week.  Occasionally those books relate to computers and usually then I'll write about them here.  I had realized a couple months ago that I didn't really know anything about ARM processors, besides that they are low power.  It seemed remiss to be studying computer architecture and not know one of the modern architectures.  Thus I visited my school library and checked out a book.

This is the story of that book - ARM Assembly Language: Fundamentals and Techniques.  An interesting book that covered the basics of what I wanted to learn, but the shortcoming was that it had an expected environment that was different from mine.  ARM processors can be found in a greater diversity of devices than, say, x86.  Yet, I am still thinking about the ARM processor as a drop-in replacement.  I look more to devices like Microsoft's Surface or a smartphone, and think about the presence of an OS, etc.

I learned particularly that the ARM instructions have bits to make them predicated.  And I realized then that conditional branches are really just predicated instructions.  If the predicate(s) are true, then take the branch.  Just another perspective on instruction sets.  Anyway, I look forward to getting a Raspberry Pi, so I can try out some of what I've learned and get a chance to also work through the assembly generated by compilers.

Monday, September 23, 2013

Book Review: Turing's Cathedral

Recently, I read the work of history, Turing's Cathedral: The Origins of the Digital Universe (Vintage), which is an interesting book that tells of the development of some of the first computers in the United States.  Its particular focus is on the founding of the Institute for Advanced Study (IAS), and then John von Neumann's time there.  Now, the von Neumann architecture is something I regularly conceptualize and use in teaching.  And it was interesting to read of how this architectural model was developed, and why.

In contrast, a significant portion of the book was instead written as a history of the Institute.  Given that the Institute provided access to the records used as a significant part of the source material, it is understandable that the author's focus would be so directed.  However, it adds to the misleading focus that this work follows.

Of perhaps greater slight is that a work titled "Turing's Cathedral" only features Alan Turing for a small part of the writing.  Instead we find greater focus placed on his work and how it fit into the research of that time, eventually leading von Neumann to explore the usage of electronic digital computers to solve the US military's problems.  He, like many European scientists, had left his homeland ahead of Hitler, and these scientists supported work leading to Germany's defeat.

The grand development that von Neumann introduced was making a computer programmable. Beyond just reconfigurable, the project he led at the IAS was programmable: the electronic device could store both data and the codes that were instructions for what the device was to do.  Consider that for the next 50 years, programs would be constrained by having to store the instructions in memory, which was often a very limited resource.

So John von Neumann stars in a book titled for Turing, and a book that devotes a third of its pages to references.  A good, interesting work that could probably have been improved by an editor's scissors.  To trim the writing down to the core bits about computers, and set aside so much of the well researched chapters to attain a focus that is lacking.

Thursday, July 11, 2013

Book Review: Exceptional C++ Style 40 New Engineering Puzzles, ...

Generally, I do not write code in C++; however, on occasion (like when writing LLVM compiler passes), I am forced into using this language.  I also more regularly find myself grading student assignments that have used C++.  Particularly reading these assignments, I will be thankful to have read this book and better be able to express how students have violated the standards, as they have done in the past.

Were I forced to read a book directly on the C++ standards, let's just say I can think of lots of things I'd rather be doing.  But while Exceptional C++ Style: 40 New Engineering Puzzles, Programming Problems, and Solutions exposed me to more of the standards, I never felt like I was reading quotes from the standard.  Instead, I was listening to an interesting conversation about some real programming questions that just may require invoking the standards to answer.

I enjoyed chapters 20 and 21, as I appreciate the effort toward explaining how memory is allocated and structures laid out.  This helps dispel the misconception that new / malloc are how the OS provides memory.  And I then learned that new will throw an exception instead of returning NULL.  Perhaps time to rewrite some code.  Furthermore, I understand now why most C++ code uses preincrement on iterators.

It is not strictly a book on style, but instead this tome covers the style I care most about: good programming practices.  I don't care which style convention you use, so long as you use one consistently.  But for whatever style your code has, it had better be good code.

I recommend reading this book even if you do not regularly use C++.  I will note that it is dated; however, unless you are now using C++11, the text is still timely and even if you are using C++11 I doubt that everything has changed (though I did notice the discussion of the auto keyword was out of date).

Wednesday, September 26, 2012

Book Review: Coders at Work

I thought Coders at Work: Reflections on the Craft of Programming was a much more enjoyable read than the previously reviewed, Masterminds of Programming. Primarily as there wasn't a focus on finding conflict, I could instead enjoy hearing about how each person came to learn programming / computer science and their experiences with working on projects both small and large. By the end of the book, I wanted to go write code on a project, which suggests it was far more inspirational than most books I read.

Notable quotes follow:

"As they say, it's easier to optimize correct code than to correct optimized code." - Joshua Bloch

Several interviewees quote Tony Hoare who said "There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult."

"Types are essentially assertions about a program." - Dan Ingalls

"In those days at Bell Labs the amenities were great - you could call a telephone number and get a transcription. You know, leave a message and say you want it written up and it'll appear the next day in your inbox as sheets of paper. And [my coworker], after we talked about the file system on the blackboard for a little while, picked up the phone, called a number, and read the blackboard into the phone. It came back and it was about as close to a design document as we got except that it had homonyms that you wouldn't believe." - Ken Thompson, on the project that became Unix

"[Code is beautiful that has] a simple straightforward solution to a problem; that has some intrinsic structure and obviousness about it that isn't obvious from the problem itself." - Fran Allen

"When I say, 'You don't get credit because the program works. We're going to the next level. Working programs are a given,' they say, 'Oh.'" - Bernie Cosell

Monday, July 25, 2011

Book Review: Masterminds of Programming

Several years ago I read a couple of interesting interviews with language designers, so my curiosity was piqued by Masterminds of Programming: Conversations with the Creators of Major Programming Languages (Theory in Practice (O'Reilly)). While the title is pretentious, the premise was interesting in that here are 17 well-used languages, so what was the designer thinking? Why did he make the choices that he did?

Alas, it was not quite to be. Interesting yes, but the discussions were sometimes more general involving all of computer science. Yet nearly every interview had memorable lines and claims; a few of which I shall reproduce here:

"C is a reasonably good language for compilers to generate, but the idea that human beings should program in it is completely absurd." - Bertrand Meyer, Eiffel

"One of our programming languages guys proposed a competition in which people would program the same program in all three of those [languages] and we'd see .... It turned out the only thing we figured out is it all depended on where the brightest programmer went...." - Charles Geschke, PostScript

"A good programmer writes good code quickly. Good code is correct, compact and readable. 'Quickly' means hours to days."
and
"An operating system does absolutely nothing for you. As long as you had something - a subroutine called a disk driver, a subroutine called some kind of communication support, in the modern world, it doesn't do anything else." - Chuck Moore, FORTH

"I do recommend [C++] and not everybody is reluctant. In fact, I don't see much reluctance in [system software or embedded systems] beyond the natural reluctance to try something new in established organizations. Rather, I see steady and significant growth in C++ use."
and
"I have never seen a program that could be written better in C than in C++. I don't think such a program could exist." - Bjarne Stroustrup, C++

So now I go to Amazon to remove the book from the list to read, and record a score of 3 out of 5.

Monday, June 20, 2011

Book Review: Write Great Code Volume 1 (part 4 of 4)

As I stated in part 3, I found some erroneous information in chapter 11 about NUMA.  While this book was published in 2004, I cannot excuse the mistakes given that the research dates to the 90s (cf. The SGI Origin: A ccNUMA Highly Scalable Server - ISCA'97).  Fortunately, the mistakes are relatively small (only a couple of paragraphs); however, the topic is valuable to know and I've spent a fair amount of my time trying to educate programmers on good practices with NUMA (cf. Topology information under Windows), and so I don't appreciate books providing bad information.

NUMA or non-uniform memory access (or architecture) describes a system that has multiple sets of main memory.  For programmers, this will usually lead to varying access time as some memory is "closer" than other memory addresses.  Fortunately, modern operating systems (Windows and Linux) will give applications "close" memory when they allocate it, which works well most of the time.

In Write Great Code, the author is apparently unaware that main memory can have varying access times and therefore writes, "the term NUMA is used to describe blocks of memory that are electronically similar to main memory but, for one reason or another, operate significantly slower than main memory."  The text furthermore provides graphics card memory and flash devices as examples.

The last chapter, 12, discusses I/O.  Of all the chapters, this one is most tied to "current" hardware and therefore is the most dated.  Here are three topics that I would revise, if there were a second edition: AGP versus PCI-E, polling versus interrupts, and synchronous versus asynchronous I/O.  First, Accelerated Graphics Port (AGP) was a late-90s / early-00s connection that has been phased out for the more general PCI Express (PCI-E) interface; however, PCI-E wasn't standardized until 2004 and therefore missed being included.  Second, at the highest performance levels, polling of devices can perform faster than interrupts.  For example, there is always more data available with a 10Gb/s connection, so polling always returns successfully; however, interrupts are still better at lower transfer rates (although I don't know the crossover point).  Finally, using asynchronous I/O is at the top of my list for things to know about devices.  In all, this chapter stood at 75 pages and is effectively a rehash of what can now be found on Wikipedia.

Overall, most of this work was subsumed by my undergraduate Computer Science coursework (particularly the intro systems course).  It is therefore hard for me to know who the appropriate audience would be, and so I have a neutral take on it.  But this isn't the last, as volume 2 has also been checked out from the library.

Thursday, June 16, 2011

Book Review: Write Great Code Volume 1 (part 3 of 4)

With Parts 1 and 2 behind us, the text can race forward through character sets, present memory usage / layout (like my Pointers on Pointers series), discuss Boolean logic (which I only learned 6? times in college), and settle into CPU architecture, which is chapter 9.  This chapter would roughly serve as a syllabus for my undergraduate and graduate computer architecture classes, whereby the reader is told "Hey, processors are capable of doing this... somehow" and can note the class where one can learn "this is how."

In learning about CPU architectures, one of the main points was made 30 pages into the chapter; to quote, "The techniques up to this point in this chapter can be treated as if they were transparent to the programmer."  The paragraph continues by stating that programmers can achieve further improvements by knowing the architecture; however, the huge caveat is whether the changes required are "worth it" (cf. Performance Anti-Patterns).  Nonetheless, there are two things to note.  First, branches can significantly affect processor performance.  Changing their prevalence and particular direction can have benefits.  Second, shorter instructions are usually faster, as they save cache space and therefore memory accesses.
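As a minimal sketch of what changing the prevalence of branches can look like, the two functions below count array elements above a threshold.  The function names and the counting task are my own illustration, not from the book.  The first version has an `if` whose outcome depends on the data; the second adds the result of the comparison directly.  Whether the second is actually faster depends on the compiler, the processor, and how predictable the data is, so measure before deciding it is "worth it."

```c
#include <stddef.h>

/* Counts values above the threshold using a data-dependent branch. */
static size_t count_above_branchy(const int *values, size_t n, int threshold)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (values[i] > threshold) {
            count++;
        }
    }
    return count;
}

/* Same result, but the comparison (which evaluates to 0 or 1 in C)
   is added directly, so the source has no if that depends on the data. */
static size_t count_above_branchless(const int *values, size_t n, int threshold)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        count += (size_t)(values[i] > threshold);
    }
    return count;
}
```

Both functions return the same count for any input; only the shape of the loop body differs.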

Chapter 10 presents the other half of architecture, the instruction set (ISA).  Let's just say that I've been working off and on with the Intel instruction set for years and I had difficulty with this chapter.  At one level ISAs are simple to explain, and so this chapter did in its introduction.  But modern ISAs (e.g., x86) have many complexities that make understanding difficult, as this work took whole pages to print the charts for particular opcode bits.  So I reached the end of the chapter knowing what the author intended to achieve, but without actually having gained that knowledge.

I had hoped to conclude volume 1 in three parts, but chapter 11 (memory architecture) warrants its own post as it has wrong information.

Thursday, May 19, 2011

Book Review: Write Great Code Volume 1 (part 2 of 4)

+/- 1.F x 2^(Exp)

That's the modern floating point representation (with a few caveats).  It is stored bitwise as SignExpF.  As this representation uses a variable exponent, the point floats to different positions.  By doing so, the computer can store a much greater range of numbers than integer representations, but with a small loss of precision.  In chemistry, I first learned of "significant digits" and floating point limits the user to roughly 7 significant decimal digits for a 32-bit float (24 binary digits) or 15 to 16 for a 64-bit double (53 binary digits).  For this post and subsequent posts in this series, I'll be writing a combination of what I knew prior to reading, as well as information from the book's text.

There are four official formats for floating point numbers using 16, 32 (a C float), 64 (a C double), and 128 bits respectively.  The larger formats devote most of their additional bits to the fraction, which provides increased precision.  Intel also has an 80-bit representation, which has 64 bits for the fraction thereby enabling existing integer arithmetic support when the exponents are the same.  Furthermore, ordering the bits as sign, exponent, fraction also enables integer comparisons between floats.
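To make the sign, exponent, fraction layout concrete, here is a small sketch that pulls the three fields out of a 32-bit float.  It assumes a C float is an IEEE 754 single (1 sign bit, 8 exponent bits with a bias of 127, 23 fraction bits), which holds on most current platforms but is not guaranteed by C itself.  The struct and function names are mine.  The last function shows the integer comparison trick, with the caveat that it works directly only when both values are non-negative and neither is NaN.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Fields of a 32-bit IEEE 754 float (assumed layout, see above). */
struct float_fields {
    uint32_t sign;      /* 1 bit */
    uint32_t exponent;  /* 8 bits, stored with a bias of 127 */
    uint32_t fraction;  /* 23 bits, the F in 1.F */
};

/* Copy the float's bytes into an integer; memcpy avoids aliasing problems. */
static uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

static struct float_fields decode_float(float f)
{
    uint32_t bits = float_bits(f);
    struct float_fields out;
    out.sign = bits >> 31;
    out.exponent = (bits >> 23) & 0xFFu;
    out.fraction = bits & 0x7FFFFFu;
    return out;
}

/* For two non-negative, non-NaN floats, the integer order of the bit
   patterns matches the floating point order. */
static bool less_than_nonnegative(float a, float b)
{
    return float_bits(a) < float_bits(b);
}
```

For instance, decode_float(1.0f) gives sign 0, exponent 127 (a true exponent of 0 after removing the bias), and fraction 0.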

Given the precision limitations (even with 128-bit floats), there are some gotchas with arithmetic.  First, equality is hard to define.  Equality tests need to respect the level of error present in the floating point values.  For example, equality is finding that two numbers are within this error of each other, as follows.

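#include <math.h>     // fabsf
#include <stdbool.h>  // bool

// error is the caller's tolerance (an assumption; it was left undefined here)
bool Equal(float a, float b, float error)
{
    // Don't do this: exact comparison ignores rounding error
    // bool ret = (a == b);

    // Test like this; fabsf keeps the difference as a float,
    // whereas abs() would convert it to an int in C
    bool ret = (fabsf(a - b) < error);

    return ret;
}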

Another gotcha is preserving precision.  With addition and subtraction, the floats need to be converted to have the same exponent, which can result in loss of precision.  Therefore, the text recommends multiplication and division first.  This seems reasonable, and wasn't something I had heard before.
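The loss from aligning exponents is easy to see with a small sketch.  This is my own illustration of the underlying effect, not an example from the book, and it assumes a C float is an IEEE 754 single.  Near 1e8 the spacing between adjacent floats is 8, so adding 1.0 changes nothing.  The helper names are hypothetical, and the volatile qualifiers are only there to force each intermediate result to be rounded to float.

```c
#include <stdbool.h>

/* Returns true when adding `small` to `big` leaves `big` unchanged,
   i.e. the smaller operand was lost when the exponents were aligned. */
static bool addition_absorbs(float big, float small)
{
    volatile float sum = big + small;
    return sum == big;
}

/* Add `small` to `big` one step at a time; each step may be absorbed. */
static float add_one_at_a_time(float big, float small, int count)
{
    volatile float total = big;
    for (int i = 0; i < count; i++) {
        total = total + small;
    }
    return total;
}

/* Accumulate the small values first, then add the result to `big`. */
static float add_small_values_first(float big, float small, int count)
{
    volatile float partial = 0.0f;
    for (int i = 0; i < count; i++) {
        partial = partial + small;
    }
    volatile float total = big + partial;
    return total;
}
```

With big = 1e8f, small = 1.0f, and count = 8, the first loop returns 1e8f because every addition is absorbed, while the second returns 100000008.0f.  The order of operations alone changes the answer.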

Friday, May 13, 2011

Book Review: Write Great Code Volume 1 (part 1 of 4)

What is great code?  What characteristics would you describe it having?  These questions come to mind as I am reading the first of a four volume series on writing great code.  The first work is subtitled, "Understanding the Machine."  Before I delve into what I've "learned" by reading this volume, what should one hope to gain?  What understanding of modern machines is required for writing code, especially great code?

First, I'd emphasize that great code is more about maintainability and understanding than efficiency.  Correctness over performance.  And therefore, the text is more about how understanding the machine can clean up the code, simplify any designs, and ensure correctness.

Second, in the context of the previous post, most programs do not need to "understand" the machine.  Most applications will not be constrained by execution time, and therefore the programmer effort should be directed elsewhere.  Yet programmers reach their own insights into the application / computer interaction and modify accordingly, and usually, myself included, these insights are irrelevant for the application.

Third, there are specific aspects of modern computer architecture / machine design that are still worth knowing.  For example, I find that most programmers have limited understanding of branch prediction, cache usage, NUMA hierarchies, and superscalar processors.  (Based on my reading so far, I would also add floating point to this list.)

What else should a programmer know?  What should I be looking for while I read this book?

Thursday, October 14, 2010

Book Review: Beautiful Code

I came across the book, Beautiful Code: Leading Programmers Explain How They Think (Theory in Practice (O'Reilly)), several months ago and was immediately intrigued.  Could this be a book to do so much of what I want to achieve here?  In short, no.  The editors want us to read thirty or so examples and learn something about what makes code beautiful.  In principle, I can agree with this thesis, as I have learned a lot about writing better code from reading what others have programmed.

The book provides 33 chapters; the shortest is 6 pages and the longest is over 30.  And the quality is inversely proportional to the length.  The shorter contributions are far more likely to achieve their goal of demonstrating beauty in programs, as their programs are more self-evidently beautiful.  Short contributions just need fewer pages for their beauty.  Thus in reading 550 pages, one will find long sections of slow development to reach an uncertain conclusion as to the quality of a contributor's code, punctuated by shorter gems of programming.

To compound the difficulties in reading, I have yet to discern the method to the organization of the chapters.  Programmers use a wide variety of languages and the book is no different.  For example, I skipped the chapters using Fortran, due in part to my lack of knowledge of the language.  Yet the final chapter used Lisp and was rather interesting, even though I have virtually no experience with it either.  But that chapter succeeds because the point is not based in the language, and changing its examples to pseudo-code would be just as workable.

To conclude, the premise is still valid.  And so I will remain hopeful for a future rewrite that discards about 20 chapters and provides some degree of transition between each.  Perhaps it could even confine the languages to a handful, though just deciding which ones might be intractable.