Monday, July 14, 2014

Book Review: The Practice of Programming

(contains Amazon affiliate link)
I recently found The Practice of Programming (Addison-Wesley Professional Computing Series) sitting on the shelf at my local library.  I am generally skeptical when it comes to programming books, particularly those from earlier decades, but I trusted the name "Brian Kernighan", so I checked the book out.

And I am so glad that I did.  From the first chapter, which discusses style, I wanted to read more.  And the only reason to ever stop reading was to pull out a computer and put these things into practice.  I didn't even mind that it wasn't until chapter 7 that performance was discussed.  Still, I will readily acknowledge that I disagree with some of the statements in the book.  Furthermore, there are some parts of the text that are clearly dated, like the discussion of the then-current C / C++ standards.

I'd like to conclude with a brief code snippet from the work.  This code is part of a serializer / deserializer.  Such routines are always a pain to write, particularly if you have many different classes / structs that need them.  Thus the authors suggest using varargs and writing a single routine that can handle this for you.  Here is the unpack (i.e., deserialize) routine:

/* unpack: unpack packed items from buf, return length */
int unpack(uchar *buf, char *fmt, ...)
{
    va_list args;
    char *p;
    uchar *bp, *pc;
    ushort *ps;
    ulong *pl;

    bp = buf;
    va_start(args, fmt);
    for (p = fmt; *p != '\0'; p++) {
        switch (*p) {
        case 'c': /* char */
            pc = va_arg(args, uchar*);
            *pc = *bp++;
            break;
        case 's': /* short */
            ps = va_arg(args, ushort*);
            *ps = *bp++ << 8;
            *ps |= *bp++;
            break;
        case 'l': /* long */
            pl = va_arg(args, ulong*);
            *pl = *bp++ << 24;
            *pl |= *bp++ << 16;
            *pl |= *bp++ << 8;
            *pl |= *bp++;
            break;
        default: /* illegal type character */
            va_end(args);
            return -1;
        }
    }
    va_end(args);
    return bp - buf;
}

So now we have a little language for describing the format of the data in the buffer.  We invoke unpack with a string like "cscl" and pointers in which to store the char, short, char and long.  Hah!  That's it.  Anytime we add new types, we just need to call pack / unpack with the matching format string.
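To try the "cscl" call end to end, here is a minimal round-trip sketch.  It is not the book's text: pack_sketch, unpack_sketch and roundtrip_ok are hypothetical names, the pack side is my own guess at a matching serializer, and I have added casts to unsigned long so that a high byte is never shifted into the sign bit of an int.  It assumes the same big-endian layout as the unpack routine above.

```c
#include <assert.h>
#include <stdarg.h>

typedef unsigned char  uchar;
typedef unsigned short ushort;
typedef unsigned long  ulong;

/* pack_sketch (hypothetical): write items big-endian, return length or -1 */
int pack_sketch(uchar *buf, const char *fmt, ...)
{
    va_list args;
    const char *p;
    uchar *bp = buf;
    ulong v;

    va_start(args, fmt);
    for (p = fmt; *p != '\0'; p++) {
        switch (*p) {
        case 'c': /* char arrives promoted to int */
            *bp++ = (uchar)va_arg(args, int);
            break;
        case 's': /* short arrives promoted to int */
            v = (ulong)va_arg(args, int);
            *bp++ = (uchar)(v >> 8);
            *bp++ = (uchar)v;
            break;
        case 'l': /* caller must pass an unsigned long */
            v = va_arg(args, ulong);
            *bp++ = (uchar)(v >> 24);
            *bp++ = (uchar)(v >> 16);
            *bp++ = (uchar)(v >> 8);
            *bp++ = (uchar)v;
            break;
        default: /* illegal type character */
            va_end(args);
            return -1;
        }
    }
    va_end(args);
    return (int)(bp - buf);
}

/* unpack_sketch (hypothetical): inverse of pack_sketch */
int unpack_sketch(const uchar *buf, const char *fmt, ...)
{
    va_list args;
    const char *p;
    const uchar *bp = buf;
    ushort *ps;
    ulong *pl;

    va_start(args, fmt);
    for (p = fmt; *p != '\0'; p++) {
        switch (*p) {
        case 'c':
            *va_arg(args, uchar *) = *bp++;
            break;
        case 's':
            ps = va_arg(args, ushort *);
            *ps = (ushort)((bp[0] << 8) | bp[1]);
            bp += 2;
            break;
        case 'l':
            pl = va_arg(args, ulong *);
            *pl = ((ulong)bp[0] << 24) | ((ulong)bp[1] << 16)
                | ((ulong)bp[2] << 8)  |  (ulong)bp[3];
            bp += 4;
            break;
        default:
            va_end(args);
            return -1;
        }
    }
    va_end(args);
    return (int)(bp - buf);
}

/* roundtrip_ok (hypothetical): 1 if "cscl" data survives pack + unpack */
int roundtrip_ok(void)
{
    uchar buf[8], c1 = 0, c2 = 0;
    ushort s = 0;
    ulong l = 0;

    if (pack_sketch(buf, "cscl", 0x01, 0x0203, 0x04, 0xDEADBEEFUL) != 8)
        return 0;
    if (unpack_sketch(buf, "cscl", &c1, &s, &c2, &l) != 8)
        return 0;
    return c1 == 0x01 && s == 0x0203 && c2 == 0x04 && l == 0xDEADBEEFUL;
}
```

One caveat with this style: the compiler cannot check that the arguments match the format string, so a wrong letter or a missing pointer only shows up at run time.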

Does it matter that the variable names are only short sequences like "pl" or "bp"?  No.  Variable names should be meaningful and consistent, and within a routine this short they are both.  "i" can be fine for a loop iterator.

We have given up some performance (*gasp*), but gained in the other parts that matter like readability and maintainability.  I plan on using this in my current research (but only the unpack, as my serializers are already highly optimized).  All in all, I approve of this book and may even someday require it as a textbook for students.

Thursday, April 3, 2014

The Information Technology Implications of the President's Intelligence Review Panel

Peter Swire gave a Thomas E. Noonan Distinguished Lecture, titled "The Information Technology Implications of the President's Intelligence Review Panel". It was an interesting talk based on his time last fall on the President's 5-person committee charged with reviewing the practices of the intelligence community, partially in response to Snowden's leaks. Many recommendations were made in their 300-page report, including the often cited statement "Section 215 is 'not essential'."

A major theme of the talk was the claim that the "half life of secrets is declining". At one time, something classified would stay that way for 25 or more years. There is now an increasing probability that a secret will be publicly disclosed, either directly (through leaks) or indirectly (by inference from non-classified sources). Decisions must now be made by the intelligence community in light of the fact that their actions will likely be revealed in the near future.

Furthermore, there is an offense / defense tension to the gathering of intelligence. In the past, the discovery of a vulnerability in codes (e.g., encryption), etc. would result in orders to change, orders that themselves would likely go undetected by potential foes. But how do you ensure that current systems remain secure when most (90+%) are in the private sector? And there is a further tension in which e-commerce and dissent are weighed against intelligence gathering and military support (e.g., drones), with all of this traffic dominated by cat videos.

How does the United States resolve the tension between promoting a freedom agenda (use of Twitter, etc. in undemocratic countries) and the need for surveillance against foreign and domestic foes? In the past, secrets and intelligence were the actions of nation-states, often gathered on physically separate networks against a background of predominantly local communication. Now, the predominant threat is from individuals (i.e., terrorists) operating against a backdrop of global communication.
Three final points:
  • Increased privacy protections for non-citizens regardless of locale (see PPD-29)
  • ACM/IETF Code of Ethics as it relates to confidentiality and security
  • MLAT and the time scales of the treaty versus the internet
I take no stance beyond saying that I recognize that legitimate needs result in a tension and that I found the talk very interesting.


Tuesday, March 18, 2014

Turing Complete x86 Mov

Here is a small work that shows the power (or perhaps horror) of modern assembly languages. I've told people before that an instruction to "subtract and branch if negative" is sufficient for all programming.  But never would I have imagined that x86's mov is also sufficient.  Until you think about what Intel lets you do:
  • Set the instruction pointer
  • Compute arithmetic expressions via address calculation
  • Compare two values by using their corresponding locations
 I am skeptical about whether this could work, given aliasing.  Regardless, it is an interesting exercise and I applaud the author.  Now, I can only imagine one of these oddities legitimately arising in programming.  You have all been warned. :-)
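The comparison trick in the list above is the least obvious of the three, so here is a small sketch of the idea.  It is written in C as a stand-in for the assembly, with each statement corresponding to a single mov; mov_equal and mem are hypothetical names of mine, not taken from the original work.  Store 0 at the location named by the first value, store 1 at the location named by the second, then load from the first location again.  The second store overwrites the first only when the two locations alias, which is exactly when the values are equal.

```c
#include <assert.h>

/* mov_equal (hypothetical): returns 1 if a == b, using only stores and a load */
int mov_equal(unsigned char a, unsigned char b)
{
    unsigned char mem[256];  /* scratch memory, indexed by the values themselves */

    mem[a] = 0;      /* mov [mem + a], 0 */
    mem[b] = 1;      /* mov [mem + b], 1  (clobbers the 0 only when a == b) */
    return mem[a];   /* mov result, [mem + a] */
}
```

Here the aliasing is the mechanism rather than a hazard: the result depends on the two stores landing on the same cell.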

Saturday, March 8, 2014

Conference Attendance SIGCSE 2014 - Day 3

Today the day will be in reverse. We'll start with papers and end with the invited speaker. I have met many attendees and even talked to some of them. Let's start with operating systems and programming languages. With the bonus theme of avoiding using Linux for presentations.

Teaching OS through code review. A unified grading workflow with git: the student submissions are viewed as diffs and the grading is done via online code review. Most students preferred this system over past solutions and tools. The system also supported incremental reviews / checkpoints. The GradeBoard tool is built on Review Board and git.

Virtual graphics card in qemu for teaching device driver design. Graphics was selected so that students would clearly see the results. Providing a device through a virtual machine significantly reduced the difficulties for instructors as well as for students. Minimal time is required to restore student "machines" when they break. Most students completed the project, in contrast to earlier versions based on kernel intercepts.

A programming language compiler compiler. Earlier versions of the class required teaching Scheme before students could implement their interpreter / compiler. Now based on Java, the tool plcc processes provided lexical and grammar files, so that students can then interface with the Java classes. Plcc only supports LL(1) languages. Students implement simple interpreted languages.

And then it was time to network again, i.e. the hallway session. This continues to be an interesting expense for an introvert, yet it is also the exponential networking exercise. Once I know more people, it is more likely that I find a group in which I know someone and can meet others. I've made progress with knowing the participants in my "field". And I have more inspiration for teaching this summer.

Friday, March 7, 2014

Conference Attendance SIGCSE 2014 - Day 2

Well rested, it is time for the conference again! Keynote today by code.org. When teaching someone programming, you don't tell them this is overloading or this is event handling, but instead teach the base concept. CS is starting to count for high school graduation requirements at the state level, but districts and universities are slower to change. CSEd Week is Dec 8 - 14 this year, for another Hour of Code. The first one had impressive results (including an almost 50/50 male / female ratio) and the key thing is that this hour is providing a foot in the door. So the hour is meant as just a start, and 97% of teachers rated the hour positively.

Adding parallel programming in CS2. Students are taught OpenMP pragmas as applied to for loops. Projects assigned around matrix operations and image processing. Part of the teaching is done through live coding, which is based on demoing patternlets. Students see this component as exciting and fresh. All problems are restricted to those not requiring synchronization. (see Csinparallel.org).
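For a sense of what such an assignment might look like, here is a minimal sketch of an OpenMP pragma applied to a for loop over image pixels.  This is my own illustration and not material from the talk; the function name brighten and its parameters are hypothetical.  Each iteration touches only its own pixel, so no synchronization is needed.

```c
#include <assert.h>

/* brighten (hypothetical): add delta to every pixel, clamping to 0..255 */
void brighten(unsigned char *pixels, long count, int delta)
{
    long i;  /* signed index, which older OpenMP versions require */

    /* Iterations are independent, so the loop can be split across threads. */
    #pragma omp parallel for
    for (i = 0; i < count; i++) {
        int v = pixels[i] + delta;
        if (v > 255)
            v = 255;
        if (v < 0)
            v = 0;
        pixels[i] = (unsigned char)v;
    }
}
```

With gcc this would be built with -fopenmp; without that flag the pragma is ignored and the same loop simply runs serially, which makes it easy to compare the two.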

Board game strategy development in CS2. Instructors provide the engine, which provides the graphics and true game state. Students write a player that maintains its own representation of the state and decides on a move. In my CS3? we had a similar project with Reversi / Othello as the game. Then for research, half of the students were assigned to develop components in the game engine and the other students developed the players. Students developing players had higher enjoyment and felt they learned more, although there was little difference in grades.

I also visited several posters that were interesting.  In one, they studied why students dropped out of CS1 courses.  Only two measures were statistically significant: first, how much computer science experience a student had before taking the class, and second, how busy (total work, not just credits) the student was that semester.  Switching to active-learning had no real effect.  Gender made no difference.  Intention of majoring in computer science was not a factor.

The other poster looked at measuring the style of the code in CS1 assignments automatically.  They found that their tool was able to cluster the student submissions based on stylistic similarity and that grades for each cluster had a 90% confidence.  I'm intrigued!  Style is important and being able to emphasize style further is great.

And then I talked with other attendees for many hours, which is one of the reasons that I'm there.

Thursday, March 6, 2014

Conference Attendance SIGCSE 2014 - Afternoon Day 1

Concept inventory for Operating Systems! Develop an open concept inventory for a wider collection of classes, starting with an OS course. Work toward scenario questions to avoid issues with terminology. For example, page replacement is instead phrased in terms of textbooks on a desk. Identify the concepts that are not intuitive by which answers have a higher percentage correct.  They have a public share.

Process oriented guided inquiry learning in CS1. How does POGIL differ from active learning? Self-managed teams with roles work through inquiry-based activities, with the instructor as the facilitator. Maintain group composition over several weeks, while rotating roles. Split into two pairs for programming exercises. Both information retention (between CS1 and CS2) and female pass rates have improved. They noted a website carrying many developed resources.

Learning how to teach big data (at the middle school). A narrative game based environment to solve problems via pair programming. Worked first with middle school teachers to learn the CS concepts and then working with them to understand how to teach the students about big data. I'll need to read the paper to better follow this work, yet I am favorable toward pushing more CS content into earlier classes.

Assessment model for large project courses. How do you assess students on a large project when the students have different roles and focuses? Assessment variations: Formative vs summative. Teacher vs student. Group vs individual. Each project group is around 30 students. Grading criteria, oral feedback from instructor at student meetings, coaching from fellow students (code reviews, hackathons, etc.), on demand artifacts, student reports (including contribution and time spent), individual teacher assessments (using a rubric), then the final feedback report for the group, and the option of interviews with individual students. And a final retrospective led by the group. Most students are happy to have their grade based on the group performance. Few saw value in the reflective report.

A repository of novice programmer activity. Two million unique users using BlueJ every year. Blackbox collects anonymous data on the users. Data on each programming session, like compilation including result, line by line diffs of any edits. Then an interesting small analysis on the most common errors and how common the errors are over the duration of a course.

ACM exemplar course integrating fundamentals, programming languages, and software engineering. The problem is covering the increasing diversity of computing, yet reducing the credits required for a degree. This necessitates an integrated approach. The presentation followed with a description of the course, which showed the transitions between the integrated concepts. Several closed lab exercises exist to ready students for subsequent lectures.

Now it is time for the birds of a feather sessions. I am told that in the past there had been a session for students looking for a job, but there wasn't one this year. Still, I am intrigued by active learning in systems courses!

Conference Attendance SIGCSE 2014 - Morning Day 1

Here I am attending a Computer Science education conference. Started off with an interesting keynote, a break and meet a couple of faculty, and now the first paper session.

How to integrate software engineering into upper-level undergraduate courses? A project-centered course, which included readings of selected research, as well as visiting local software development companies. Students were surveyed before and after the class, and student confidence went down after the course, because students had learned how difficult the problems are. Yet students were more engaged in learning more about the subject and the resulting projects were of a higher quality.

Using real projects in software testing. Students, in teams, select a real world project. They develop a test plan, provide a progress report (requested by the students), and a final presentation. The instructor is both a customer and a coach. Able to work with the project developers. Target is generally low hundreds of classes in the project. Most students enjoyed the project and enrollment has increased. Students fill out a 360 survey on their teammates, and the instructor intervenes for outliers both positive and negative.

Student code is not throwaway code. Best paper award. Prior work on software maintenance has usually relied on artificially prepared code, including lecturer-added bugs. For this work, the code developed by prior seniors (11 kLOC, Java, multithreaded) was provided to juniors in an intermediate version, who then added a feature and fixed bugs. Do students then follow proper practices? Most did, but a minority concluded, for example, that the code could not be tested. Many students observed the importance of quality design and had the experience of working on someone else's code base.