Wednesday, September 17, 2014

Preparing for Academic Jobs

I recently attended a seminar on preparing for and navigating the academic job search.  The following summarizes the answers given by the panelists, each of whom was offering his or her own opinion.  The short version is that your letters of recommendation are key: they are the summary of your skills and qualifications by your (future) peers.  The panelists are all research-oriented faculty, which may skew some of the opinions provided.  One quality resource on teaching jobs can be found here.

Most important things in a candidate:
- Publications (some in the right places)
- Letters (don't really lie)
- Fulfilling the needs of the department
- Put "top" school in middle of interview schedule, chance to work out mistakes but not be burned out
- Energized / excited about place
- In 1:1 with faculty, only discuss own research for half of time (~15min)
- Be formal (jacket, etc)
- Prep work with faculty letter writers (explain research, plans, etc)
- Ability to connect across areas (your own area will get you the interview, the other areas will get you the offer)
- Talent, passion, impact in research
- Have a set of questions for 1:1 time of "do you have any questions?"

Things to avoid:
- Wrong / bad job talk (did you target the right audience while still conveying depth in your subfield?)
- Attitude (arrogance that job is yours, or desperation about finding a job)
- Two interviews in one week

Letters of Recommendation:
- Especially letters from externals
- Prepare a statement of contributions (what have you really done / achieved?)

Things to focus on:
- Take risks in your research
- Network and get your name out there / known

Postdoc versus Second Tier:
- Find collaboration and mentoring in a postdoctoral position
- It depends

Non-Research:
- Still expect some quality research
- Your research talk is a demonstration of teaching

Deciding on schools to apply to:
- Location
- Areas of Focus

Packages:
- Find packages from previous applicants

Tuesday, September 16, 2014

Atomic Weapons in Programming

In parallel programming, the use of locks is good enough for most applications.  And when it is not, you may need to resort to atomic weapons.  While I can and have happily written my own lock implementations, it's like the story of the lawyer who redoes his own kitchen: it is not a good use of the lawyer's time unless he's enjoying it.

That said, I have had to use atomic weapons against a compiler.  The compiler happily reordered several memory operations in an unsafe way.  Using fences, I was able to prevent this reordering without any fence instructions appearing in the resulting assembly.  I still wonder whether there was some information I was failing to provide.
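
For illustration (a minimal sketch of mine, assuming C11's <stdatomic.h>, not my actual code): a compiler-only fence constrains the compiler's reordering while emitting no hardware fence instruction.

#include <stdatomic.h>

int data;
_Atomic int ready;

void publish(int value)
{
    data = value;
    /* Compiler-only fence: forbids the compiler from moving the store
       to 'data' below the store to 'ready', yet emits no hardware
       fence instruction of its own. */
    atomic_signal_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

On x86, this should compile to plain stores with no fence instructions, which matches the behavior I described above.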

Regardless, the weapons are useful!  And I can thank the following presentation, Atomic Weapons, for pointing me to the particular weapon that was needed.  I have reviewed earlier work by Herb Sutter and he continues to earn my respect (not that he is aware of it), but nonetheless I suggest any low-level programmer be aware of the tools that are available, as well as the gremlins that lurk in these depths and might necessitate appropriate weaponry.

Monday, July 14, 2014

Book Review: The Practice of Programming

(contains Amazon affiliate link)
I recently found The Practice of Programming (Addison-Wesley Professional Computing Series) sitting on the shelf at my local library.  I am generally skeptical when it comes to programming books, particularly those from past decades, but I trusted the name "Brian Kernighan", so I checked the book out.

And I am so glad that I did.  From the first chapter, which discussed style, I wanted to read more.  And the only reason to ever stop reading was to pull out a computer and put these things into practice.  I didn't even mind that performance wasn't discussed until chapter 7.  Still, I will readily acknowledge that I disagree with some of the statements in the book.  Furthermore, some parts of the text are clearly dated, such as the discussion of the then-current C / C++ standards.

I'd like to conclude with a brief code snippet from the work.  This code is part of a serializer / deserializer.  Such routines are always a pain to write, particularly if you have many different classes / structs that need them.  Thus the authors suggest using varargs and writing a single routine that can handle this for you.  Here is the unpack (i.e., deserialize) routine:

#include <stdarg.h>

/* shorthand typedefs assumed by the routine */
typedef unsigned char uchar;
typedef unsigned short ushort;
typedef unsigned long ulong;

/* unpack: unpack packed items from buf, return length */
int unpack(uchar *buf, char *fmt, ...)
{
    va_list args;
    char *p;
    uchar *bp, *pc;
    ushort *ps;
    ulong *pl;

    bp = buf;
    va_start(args, fmt);
    for (p = fmt; *p != '\0'; p++) {
        switch (*p) {
        case 'c': /* char */
            pc = va_arg(args, uchar*);
            *pc = *bp++;
            break;
        case 's': /* short */
            ps = va_arg(args, ushort*);
            *ps = *bp++ << 8;
            *ps |= *bp++;
            break;
        case 'l': /* long */
            pl = va_arg(args, ulong*);
            *pl = *bp++ << 24;
            *pl |= *bp++ << 16;
            *pl |= *bp++ << 8;
            *pl |= *bp++;
            break;
        default: /* illegal type character */
            va_end(args);
            return -1;
        }
    }
    va_end(args);
    return bp - buf;
}

So now we have a little language for describing the format of the data in the buffer.  We invoke unpack with a string like "cscl" and pointers to store the char, short, char, and long.  Hah!  That's it.  Anytime we add a new message, we just call pack / unpack with the appropriate format string.
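
For illustration, a hypothetical call might look like this (the buffer contents and variable names are mine, not the book's):

void example(void)
{
    /* A buffer laid out as char, short, char, long (big-endian) */
    uchar buf[] = { 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08 };
    uchar c1, c2;
    ushort s;
    ulong l;

    int n = unpack(buf, "cscl", &c1, &s, &c2, &l);
    /* n == 8; c1 == 0x01, s == 0x0203, c2 == 0x04, l == 0x05060708 */
}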

Does it matter that the variable names are only short sequences like "pl" or "bp"?  No.  Variable names should be meaningful and consistent, not necessarily long; "i" can be perfectly fine for a loop iterator.

We have given up some performance (*gasp*), but gained in the other parts that matter, like readability and maintainability.  I plan on using this in my current research (but only the unpack, as my serializers are already highly optimized).  All in all, I approve of this book and may even someday require it as a textbook for students.

Thursday, April 3, 2014

The Information Technology Implications of the President's Intelligence Review Panel

Peter Swire gave a Thomas E. Noonan Distinguished Lecture, titled “The Information Technology Implications of the President's Intelligence Review Panel". It was an interesting talk, based on his time last fall on the President's five-person review group charged with examining the practices of the intelligence community, partially in response to Snowden's leaks. Many recommendations were made in their 300-page report, including the often-cited statement that Section 215 is "not essential."

A major theme of the talk was the claim that the "half life of secrets is declining". At one time, something classified would stay that way for 25 or more years. There is now an increasing probability that a secret will be publicly disclosed, either directly (through leaks) or indirectly (by inference from non-classified sources). Decisions must now be made by the intelligence community in light of the fact that their actions will likely be revealed in the near future.

Furthermore, there is an offense / defense tension in the gathering of intelligence. In the past, the discovery of a vulnerability in codes (e.g., encryption) would result in orders to change, orders that would themselves likely go undetected by potential foes. But how do you ensure that current systems remain secure when most (90+%) are in the private sector? And there is a further tension whereby e-commerce and dissent are weighed against intelligence gathering and military support (e.g., drones), all of it dominated by cat videos.

How does the United States resolve the tension between promoting a freedom agenda (use of Twitter, etc in undemocratic countries) and the need for surveillance against foreign and domestic foes? In the past, secrets and intelligence were the business of nation-states, often gathered on physically separate networks against a background of predominantly local communication. Now, the predominant threat is from individuals (i.e., terrorists) operating against a backdrop of global communication.
Three final points:
  • Increased privacy protections for non-citizens regardless of locale (see PPD-28)
  • ACM/IETF Code of Ethics as relates to confidentiality and security
  • MLATs (Mutual Legal Assistance Treaties) and the mismatch between treaty time scales and internet time scales
I take no stance beyond saying that I recognize that legitimate needs result in a tension and that I found the talk very interesting.


Tuesday, March 18, 2014

Turing Complete x86 Mov

Here is a small work that shows the power (or perhaps horror) of modern assembly languages. I've told people before that a single "subtract and branch if negative" instruction is sufficient for all programming.  But never would I have imagined that x86's mov is also sufficient, until you think about what Intel lets you do (see the sketch after this list):
  • Set the instruction pointer
  • Compute arithmetic expressions via address calculation
  • Compare two values by using their corresponding locations
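
The comparison trick, as I understand it, can be sketched in C (a toy of mine, not code from the paper; each statement compiles down to little more than mov instructions):

/* Equality test using only loads and stores: write 0 at index x and 1
   at index y; reading back index x then yields 1 exactly when x == y,
   because the second store overwrote the first. */
unsigned char scratch[256];

int mov_equal(unsigned char x, unsigned char y)
{
    scratch[x] = 0;
    scratch[y] = 1;
    return scratch[x]; /* 1 iff x == y */
}
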
I am skeptical about whether this could work in general, given aliasing.  Regardless, it is an interesting exercise and I applaud the author.  Now, I can only imagine one of these oddities legitimately arising in programming.  You have all been warned. :-)

Saturday, March 8, 2014

Conference Attendance SIGCSE 2014 - Day 3

Today the schedule is in reverse: we'll start with papers and end with the invited speaker. I have met many attendees and even talked to some of them. Let's start with operating systems and programming languages, with the bonus theme of avoiding Linux for presentations.

Teaching OS through code review. A unified grading workflow with git: student submissions are viewed as diffs and graded via online code review. Most students preferred this system over past solutions and tools. The system also supports incremental reviews / checkpoints. The GradeBoard tool is built on Review Board and git.

A virtual graphics card in qemu for teaching device driver design. Graphics was selected so that students can clearly see the results. Providing the device through a virtual machine significantly reduced the difficulties for instructors as well as for students, with minimal time required to restore student "machines" when they break. Most students completed the project, versus earlier versions based on kernel intercepts.

A programming language compiler compiler. Earlier versions of the class required teaching Scheme before students could implement their interpreter / compiler. Now based on Java, the tool plcc processes provided lexical and grammar files so that students can then interface with the generated Java classes. plcc only supports LL(1) grammars. Students implement simple interpreted languages.

And then it was time to network again, i.e., the hallway session. This continues to be a real expense of energy for an introvert, yet it is also an exponential networking exercise: the more people I know, the more likely I am to find a group in which I know someone and can meet the others. I've made progress in knowing the participants in my "field", and I have more inspiration for teaching this summer.

Friday, March 7, 2014

Conference Attendance SIGCSE 2014 - Day 2

Well rested, it is time for conference again! Keynote today by code.org. When teaching someone programming, you don't tell them "this is overloading" or "this is event handling"; you instead teach the base concept. CS is starting to count toward high school graduation requirements at the state level, but districts and universities are slower to change. CSEd Week is Dec 8 - 14 this year, for another Hour of Code. The first one had impressive results (including an almost 50/50 male / female ratio), and the key thing is that this hour provides a foot in the door. So the hour is meant as just a start, and 97% of teachers rated it positively.

Adding parallel programming in CS2. Students are taught OpenMP pragmas as applied to for loops, with projects assigned around matrix operations and image processing. Part of the teaching is done through live coding based on demoing patternlets. Students see this component as exciting and fresh. All problems are restricted to those not requiring synchronization (see Csinparallel.org).
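
As a minimal sketch of the kind of loop pragma involved (my own illustration, not code from the paper; no synchronization is needed because each iteration writes a distinct element):

#include <stdio.h>

#define N 1000

int main(void)   /* compile with, e.g., gcc -fopenmp */
{
    static double a[N], b[N], c[N];

    for (int i = 0; i < N; i++) {
        a[i] = i;
        b[i] = 2.0 * i;
    }

    /* One pragma parallelizes the loop; iterations are independent. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        c[i] = a[i] + b[i];

    printf("c[%d] = %g\n", N - 1, c[N - 1]);
    return 0;
}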

Board game strategy development in CS2. Instructors provide the engine, which supplies the graphics and the true game state. Students write a player that maintains its own representation of the state and decides on a move. In my CS3 (I believe), we had a similar project with Reversi / Othello as the game. Then, for the research, half of the students were assigned to develop components of the game engine while the other students developed players. Students developing players reported higher enjoyment and felt they learned more, although there was little difference in grades.
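
The architecture, as I understood it, separates the engine's authoritative state from each student's player; a hypothetical interface (mine, not the paper's) might look like:

/* The engine owns the true game state; a student player sees only the
   moves it is told about and keeps its own view of the board. */
typedef struct {
    int board[8][8];               /* the player's own representation */
} PlayerState;

typedef struct {
    /* engine reports a move that was played */
    void (*observe)(PlayerState *state, int row, int col, int piece);
    /* engine asks for this player's next move */
    void (*choose_move)(PlayerState *state, int *row, int *col);
} Player;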

I also visited several posters that were interesting.  In one, they studied why students drop out of CS1 courses.  Only two measures were statistically significant: first, how much computer science experience a student had before taking the class, and second, how busy the student was that semester (total workload, not just credits).  Switching to active learning had no real effect.  Gender made no difference.  Intention of majoring in computer science was not a factor.

The other poster looked at automatically measuring the style of the code in CS1 assignments.  They found that their tool was able to cluster the student submissions based on stylistic similarity, and that the grades within each cluster were consistent at a 90% confidence level.  I'm intrigued!  Style is important, and being able to emphasize style further is great.

And then I talked with other attendees for many hours, which is one of the reasons that I'm there.