Elegant C: 2014

Monday, December 8, 2014

Improving Computer Science Education

Recently in the news are two articles that relate to improving Computer Science Education. It is valuable to broaden the base of students who understand the basics of computing, just as students are expected to know Chemistry or Calculus. In fact in my biased opinion, knowing the basics of computing and programming will have greater practical benefit to students; however, it should never be at the expense of a diverse education.

The White House announced that the 7 largest school districts will be including Computer Science in their curriculum. This will quickly lead to another problem of who will teach the Computer Science classes. Not me, but I am interested in teaching the teachers. I do want to see Computer Science as an actual specialty (endorsement) for Education majors.

Another aspect of broadening the base is retaining students enrolled in the major. Being part of the majority, it is difficult for me to know the challenges faced by other groups. Similarly, I know why I entered Computer Science, so I would like to understand why others have too. Why are they passionate or interested in this field that I am a part of? Here are some things minority students have to say about STEM.

Wednesday, November 26, 2014

Computer Science Diversity

I attended a top program in Computer Science, where the gender split was 60 / 40. Then I worked for five years at a major company. Therefore, my expectation is always that anyone regardless of their gender, race, etc can succeed in Computer Science. Now, recently there was a short lived Barbie book about being a computer engineer. Ignoring any semantics about Computer Science, Software Engineering, and Computer Engineering being different disciplines, the work still did not portray women in the more technical efforts. I'd rather read a collegue's remix of the work.

In a different vein of diversity, as a white male, I have been regularly been excluded from tech events because I dislike the taste of alcohol. Thus at the (especially) frequent events in industry settings where alcohol is served, I was not socializing with my colleagues, and instead would inevitably find myself back at my desk working. As a consequence, I was effectively excluded from the event. And now in academia, I find myself attending conferences, where the social events are also touted for serving alcohol. I have no issue with serving alcohol, rather it is the near exclusivity of which the drink options trend that way. Thus a recent article struck a chord in the continuing desirability of extending the options and respecting the decision (for whatever reason) to not drink alcohol.

Monday, October 27, 2014

Liberal Arts College Positions Mini-Seminar

In continuing my search and preparation for a faculty position, today I attended a mini-seminar by LADO faculty. (LADO - liberal arts colleges with diversity officers) Here are some points that were raised during the breakout and panel discussions:

Teaching:
- You will teach both upper-level courses, as well as "service" courses. Service is a good term to describe the low-level / introductory courses, as the faculty are rotating through this service to the department.
- Try to setup the teaching schedule so that 1 day is free solely from research.
- Continue to revise courses so they are fresh and current, but also avoid constantly creating all new courses.
- Valuable to set aside "non-office hours" times, during which the door can be shut.
- Faculty will sit in on courses and additionally interview the students, as part of composing an evaluation of your teaching.

Research:
- Recruiting undergraduates earlier for research to have time to train them, so that they will later be able to contribute.
- You can still collaborate and have broad impact through research with faculty at other more research-focused institutions.
- Grant proposals can also be keyed "RUI" (research at undergraduate institutions)
- Regular funding for sabbatical leaves, first often after the renewal of the 3-year contract. This leave is focused on research and may be held at R1 or other institutions.
- Startup package is present to cover the transition to grant-based funding.
- Research lab costs are significantly lower at these institutions, as funds are not required for grad students, post docs, etc.
- Schools are looking for faculty hires that add diversity to the research available.

Service:
- Service is a component, but is much smaller than teaching and scholarship. So be cautious about accepting invitations to committees, in terms of time commitment. The service time can provide valuable insight to the functioning of the institution, as well as possible collaboration with collegues in other departments.
- You will be chair of your department someday.

Other:
- Many liberal arts institutions are located in small towns.
- Take the time to customize the cover letter. Do you really want and care about this job?

Wednesday, October 15, 2014

Richard Lipton - Knuth Prize (Practice) Talk

Richard J. Lipton is the 2014 winner of the Knuth Prize. His talk today was a summary of his work, which lead to received the prize. The talk proved to be a series of short anecdotes, which are difficult to capture, but I've copied down the highlights, as best as I can.

"Do problems have labels?" For example, simulate a queue as two stacks, is this a theory problem or system problem? At the time, the qualifying exams were split by problem types, so labeling it mattered for which exam contained it. Faculty at Yale were 50/50 split on whether to mix the problem types and instead students would sit for several days of CS questions rather than a theory day, then a systems day, etc.

Finding a division, separator to a planar graph, in root time. T(n) <= C*T(n/2)
In explaining the result to Knuth, while visiting Tarjan, who responded "You've ruined my lunch." As the result destroyed the best known algorithms that were being written, at the time, in Vol. 4.

"Throw away comments are wrong" Many introductions make inaccurate statements like "non-uniform cannot imply uniform". There is the work of Karp-Lipton dealing with non-uniform circuits and the uniform nature of algorithms. The proof was later handed out on tote-bags at CCC 2010.

"Think in High Dimensions" Binary search in high dimensional space, still logarithmic in the number of elements. For example, take a planar graph and split it by the intersections, each slab is linear and can be quickly searched.

"Learn New Tools" Now, one tool is "Probabilistic method" published on June 28, 1974, which shortly thereafter was a Yale seminar. "By an Elementary Calculation" means to Erdos to use Sterling's approximation, which in one case required taking the approximation to 7 places. Before learning this method, had been asked about the problem of Extendible Hashing, and had no idea and put it out of mind. Later asked about it again, and the problem solved easily (or perhaps two days of proofs).

"Guess Right" One problem in solving problems in the community is that we are guessing wrong. "It is really hard to prove false statements." Take the problem of detecting whether a sequence of a_nb_m has n = m? Possible using a multi-pass scan with a probablistic FSM. Can do with one-way (i.e., single pass)?

"Need a Trick" Solving a problem of vector addition, with fixed counters, with adding and subtracting (where cannot subtract from 0). 1 counter is decidable, 2 counters is not. But if there is no test for whether the counter is 0. Proved it takes EXPSPACE-hard. Pair counters, so add is subtract and vice versa.

"My Favorite Two Results" - Proving that a a^-1 = 1, in long sequence (abaaaba^-1...) can be done in LOGSPACE. Do so by replacing the a, b with matrices, then modulo prime. Given the distributed law and applying in any order, prove that it always stops on any expression.

"Future" Old problems, yes. But dream of finding proofs to math problems that use CS theory tricks.

Wednesday, September 17, 2014

Preparing for Academic Jobs

I went to a recent seminar about the preparation and practice of finding an academic job. The following summarizes the answers given by the panelists, each of whom was giving his or her opinion. The short version is that your letters of recommendation are key. They are the summary of your skills and qualifications by your (future) peers. The panelists are all research-oriented faculty, which may skew some of the opinions provided. One quality resource on teaching jobs can be found here.

Most important things in a candidate:
- Publications (some in the right places)
- Letters (don't really lie)
- Fulfilling the needs of the department
- Put "top" school in middle of interview schedule, chance to work out mistakes but not be burned out
- Energized / excited about place
- In 1:1 with faculty, only discuss own research for half of time (~15min)
- Be formal (jacket, etc)
- Prep work with faculty letter writers (explain research, plans, etc)
- Ability to connect across areas (your own area will get you the interview, the other areas will get you the offer)
- Talent, passion, impact in research
- Have a set of questions for 1:1 time of "do you have any questions?"

Things to avoid:
- Wrong / bad job talk (did you target the right audience, and yet convey knowledge in subfield)
- Attitude (arrogance that job is yours, or desperation about finding a job)
- Two interviews in one week

Letters of Recommendations:
- Especially letters from externals
- Prepare a statement of contributions (what have you really done / achieved?)

Things to focus on:
- Take risks in your research
- Network and get your name out / aware of

Postdoc versus Second Tier:
- Find collaboration and mentoring in a postdoctoral position
- It depends

Non-Research:
- Still except some quality research
- Your research talk is a demonstration of teaching

Deciding on schools to apply:
- Location
- Areas of Focus

Packages:
- Find packages from previous applicants

Tuesday, September 16, 2014

Atomic Weapons in Programming

In parallel programming, most of the time the use of locks is good enough for the application. And when it is not, then you may need to resort to atomic weapons. While I can and have happily written my own lock implementations, its like the story of a lawyer redoing his kitchen himself. It is not a good use of the lawyer's time unless he's enjoying it.

That said, I have had to use atomic weapons against a compiler. The compiler happily reordered several memory operations in an unsafe way. Using fence instructions, I was able to prevent this reordering, while not seeing fences in the resulting assembly. I still wonder if there was some information I was not providing.

Regardless, the weapons are useful! And I can thank the following presentation for illuminating me to the particular weapon that was needed, Atomic Weapons. I have reviewed earlier work by Herb Sutter and he continues to garner my respect (not that he is aware), but nonetheless I suggest any low-level programmer be aware of the tools that are available, as well as the gremlins that lurk in these depths and might necessitate appropriate weaponry.

Monday, July 14, 2014

Book Review: The Practice of Programming

(contains Amazon affiliate link)
I recently found, The Practice of Programming (Addison-Wesley Professional Computing Series), sitting on the shelf at my local library. I am generally skeptical when it comes to programming books, and particularly those from different decades, but I trusted the name "Brian Kernighan" so I checked the book out.

And I am so glad that I did. From the first chapter that discussed style, I wanted to read more. And the only reason to ever stop reading was to pull out a computer and put these things into practice. I didn't even mind that it wasn't until chapter 7 that performance was discussed. Still, I will readily acknowledge that I disagree with some of statements in the book. Furthermore, there are some parts of the text that are clearly dated, like discussing current C / C++ standards.

I'd like to conclude with a brief code snippet from the work. This code is part of a serializer / deserializer. Such routines are always a pain to write and particularly if you have many different classes / structs that need them. Thus the authors suggest using vargs and writing a single routine that can handle this for you. Here is the unpack (i.e., deserialize) routine:

/* unpack: unpack packed items from buf, return length */
int unpack(uchar *buf, char *fmt, ...)
{
    va_list args;
    char *p;
    uchar *bp, *pc;
    ushort *ps;
    ulong *pl;

    bp = buf;
    va_start(args, fmt);
    for (p = fmt; *p != '\0'; p++) {
        switch (*p) {
        case 'c': /* char */
            pc = va_arg(args, uchar*);
            *pc = *bp++;
            break;
         case 's': /* short */
             ps = va_arg(args, ushort*);
             *ps = *bp++ << 8;
             *ps |= *bp++;
             break;
         case 'l': /* long */
             pl = va_arg(args, ulong*);
             *pl = *bp++ << 24;
             *pl |= *bp++ << 16;
             *pl |= *bp++ << 8;
             *pl |= *bp++;
         default: /* illegal type character */
             va_end(args);
             return -1;
         }
     }
     va_end(args);
     return bp - buf;
}

So now we have a little language for describing the format of the data in the buffer. We invoke unpack with a string like "cscl" and pointers to store the char, short, char and long. Hah! That's it. Anytime we add new types, we just to call the pack / unpack.

Does it matter that the variables are only sequences like "pl" or "bp"? No. Variable names should be meaningful and consistent. "i" can be fine for a loop iterator.

We have given up some performance (*gasp*), but gained in the other parts that matter like readability and maintainability. I plan on using this in my current research (but only the unpack, as my serializers are already highly optimized). All in all, I approve of this book and may even someday require it as a textbook for students.

Thursday, April 3, 2014

The Information Technology Implications of the President's Intelligence Review Panel

Peter Swire gave a Thomas E. Noonan Distinguished Lecture, titled “The Information Technology Implications of the President's Intelligence Review Panel". An interesting talk based on his time last fall on the President's 5-person committee charged with reviewing the practices of the intelligence community, partially in response to Snowden's leaks. Many recommendations were made in their 300 page report, including the often cited statement "Section 215 is 'not essential'."

A major theme of the talk was the claim that the "half life of secrets is declining". At one time, something classified would stay that way for 25 or more years. There is now increasing probability that directly (through leaks) or indirectly (by inference in non-classified sources) a secret will be publicly disclosed. Decisions must now be made by the intelligence community in light of the fact that their actions will likely be revealed in this near future.

Furthermore, there is a offense / defense tension to the gathering of intelligence. In the past, the discovery of a vulnerability in codes (e.g., encryption), etc would result in orders to change, orders that themselves would likely be undetected by potential foes. But how do you ensure that current systems remain secure, when most (90+%) are in the private sector. And clarify the tension where by e-commerce and dissent are weighed against intelligence gathering and military support (e.g., drones), and all dominated by cat videos.

How does the United States resolve the tension of promoting a freedom agenda (use of Twitter, etc in undemocratic countries) and the need of surveillance against foreign and domestic foes? In the past, secrets and intelligence were the actions of nation-states. Often gathered on physically separate networks against the background of predominantly local communication. Now, the predominant threat is from individuals (i.e., terrorists) and operating in a backdrop of global communication.
Three final points:

Increased privacy protections for non-citizens regardless of locale (see PPD-29)
ACM/IETF Code of Ethics as relates to confidentiality and security
MLAT and the time scales of the treaty versus the internet

I take no stance beyond saying that I recognize that legitimate needs result in a tension and that I found the talk very interesting.

Tuesday, March 18, 2014

Turing Complete x86 Mov

Here is a small work that showed the power (or perhaps horror) of modern assembly languages. I've told people before that an instruction to "subtract and branch if negative" is sufficient for all programming. But never would have I imagined that x86's mov is also sufficient. Until you think about what Intel lets you do:

Set the instruction pointer
Compute arithmetic expressions via address calculation
Compare two values by using their corresponding locations

I am skeptical about whether this could work, given aliasing. Regardless, it is an interesting exercise and I applaud the author. Now, I can only imagine one of these oddities legitimately arising in programming. You have all been warned. :-)

Saturday, March 8, 2014

Conference Attendance SIGCSE 2014 - Day 3

Today the day will be in reverse. We'll start with papers and end with the invited speaker. I have met many attendees and even talked to some of them. Let's start with operating systems and programming languages. With the bonus theme of avoiding using Linux for presentations.

Teaching OS through code review. Unified grading workflow with git, the student submissions are viewed as diffs and the grading is via online code review. Most students preferred this system over past solutions and tools. The system also supported incremental reviews / checkpoints. The GradeBoard tool is built on review board and git.

Virtual graphics card in qemu for teaching device driver design. Graphics is selected such that students would clearly see the results. Providing a device through a virtual machine significantly reduced the difficulties for instructors as well as for students. Minimal time required to restore student "machines" when they break. Most students completed the project versus earlier versions based on kernel intercepts.

A programming language compiler compiler. Earlier versions of the class require teaching scheme before students could implement their interpreter / compiler. Now based on java, the tool plcc processes provided lexical and grammar files, so that students can then interface with the java classes. Plcc only supports LL1 languages. Students implement simple interpreted languages.

And then it was time to network again, i.e. the hallway session. This continues to be an interesting expense for an introvert, yet it is also the exponential networking exercise. After I know more people, then it is more likely that I find a group in which that I know someone and can meet others. I've made progress with knowing the participants in my "field". And having more inspiration for teaching is summer.

Friday, March 7, 2014

Conference Attendance SIGCSE 2014 - Day 2

Well rested, it is time for conference again! Keynote today by code.org. When teaching someone programming, you don't tell them this is overloading or this is event handling, but instead the base concept. CS is starting to count for high school graduation requirements at the state level, but districts and universities are slower to change. CSEd Week is Dec 8 - 14 this year, for another Hour of Code. The first one had impressive results (including almost 50/50 male / female ratio) and the key thing is that this hour is providing a foot in the door. So the hour is meant as a just a start and 97% of teachers rated the hour positively.

Adding parallel programming in CS2. Students are taught OpenMP pragmas as applied to for loops. Projects assigned around matrix operations and image processing. Part of the teaching is done through live coding, which is based on demoing patternlets. Students see this component as exciting and fresh. All problems are restricted to those not requiring synchronization. (see Csinparallel.org).

Board game strategy development in CS2. Instructors provide the engine, which provides the graphics and true game state. Students write a player that maintains its representation of the state and decides on a move. In my CS3? we had a similar project with reversei / othello as the game. Then for research, half of the students were assigned to develop components in the game engine and other students developed the players. Students developing players had higher enjoyment and felt they learned more, although there was little difference in grades.

I also visited several posters that were interesting. In one, they studied why students dropped out of CS1 courses. Only two measures were statistically significant: first, how much computer science experience a student had before taking the class, and second, how busy (total work, not just credits) the student was that semester. Switching to active-learning had no real effect. Gender made no difference. Intention of majoring in computer science was not a factor.

The other poster looked at measuring the style of the code in CS1 assignments automatically. They found that their tool was able to cluster the student submissions based on stylistic similarity and that grades for each cluster had a 90% confidence. I'm intrigued! Style is important and being able to emphasize style further is great.

And then I talked with other attendees for many hours, which is one of the reasons that I'm there.

Thursday, March 6, 2014

Conference Attendance SIGCSE 2014 - Afternoon Day 1

Concept inventory for Operating Systems! Develop an open concept inventory for a wider collection of classes, starting with an OS course. Work toward scenario questions to avoid issues with terminology. Consider page replacement, instead phrased about textbooks on a desk. Identify the concepts that are not intuitive by which answers have a higher percentage correct. They have a public share.

Process oriented guided inquiry learning in CS1. How pogil differs from active learning? Self managed teams with roles and they work through inquiry based activities, with the instructor as the facilitator. Maintain group composition over several weeks, while rotating roles. Split into two pairs for programming exercises. Both information retention (between CS1 and CS2) and female pass rates have improved. They noted a website carrying many developed resources.

Learning how to teach big data (at the middle school). A narrative game based environment to solve problems via pair programming. Worked first with middle school teachers to learn the CS concepts and then working with them to understand how to teach the students about big data. I'll need to read the paper to better follow this work, yet I am favorable toward pushing more CS content into earlier classes.

Assessment model for large project courses. How do you assess students on a large project when the students have different roles and focuses? Assessment variations: Formative vs summative. Teacher vs student. Group vs individual. Each project group is around 30 students. Grading criteria, oral feedback from instructor at student meetings, coaching from fellow students (code reviews, hackathons, etc), on demand artifacts, student reports (including contribution and time spent), individual teacher assessments (using a rubric), then the final feedback report for the group, and the option of interviews with individual students. And a final retrospective lead by the group. Most students are happy to have their grade based on the group performance. Few saw value of the reflective report.

A repository of novice programmer activity. Two million unique users using BlueJ every year. Blackbox collects anonymous data on the users. Data on each programming session, like compilation including result, line by line diffs of any edits. Then an interesting small analysis on the most common errors and how common the errors are over the duration of a course.

ACM exemplar course integrating fundamentals, programming languages, and software engineering. The problem is covering the increasing diversity of computing, yet reducing the credits required for a degree. This necessitates an integrated approach. The presentation followed with a description of the course, which showed the transitions between the integrated concepts. Several closed lab exercises exist to ready students for subsequent lectures.

Now it is time for the birds of a feather sessions. I am told that in the past there had been a session for students looking for a job, but there wasn't one this year. Still, I am intrigued by active learning in systems courses!

Conference Attendance SIGCSE 2014 - Morning Day 1

Here I am attending a Computer Science education conference. Started off with an interesting keynote, a break and meet a couple of faculty, and now the first paper session.

How to integrate software engineering into upper-level undergraduate courses? A project centered course, which included readings of selected research, as well as visiting local software development companies. Surveying the students before and after the class, and student confidence went down after the course, because students had learned how difficult the problems are. Yet students were more engaged into learning more about the subject and the resulting projects were of a higher quality.

Using real projects in software testing. Students, in teams, select a real world project. They develop a test plan, provide a progress report (requested by the students), and a final presentation. The instructor is both a customer and a coach. Able to work with the project developers. Target is generally low hundreds of classes in the project. Most students enjoyed the project and enrollment has increased. Students fill out a 360 survey on their teammates, and instructor intervention for outliers both positive and negative.

Student code is not throw aways. Best paper award. Prior work on software maintenance has usually relied on artificially prepared code, including lecturer added bugs. For this work, the code developed by prior seniors (11kloc, java, multithreaded) was provided to juniors in an intermediate version, who then added a feature and fixed bugs. Do students then follow proper practices? Most did, but a minority concluded, for example, that the code could not be tested. Many students observed the importance of quality design and had the experience of working on someone else's code base.

Monday, March 3, 2014

Ongoing Goto Wars

Recently, Apple announced a security fix for their SSL implementation. I would have taken little note here, except WIRED's summary indicated the cause being the use of goto. Ah, goto, how useful you are. How abused you are.

In practice, I would also have used goto, if I was writing a series of conditionals where I only have a single error block. The code could be restructured to link all of the conditionals into a single clause, yet it also tries to retain the specific error code for the check that failed. I would really dislike to see the following:

if ((e = foo) != 0 && (e = bar) != 0 && ...) goto fail;

So what then? Do we call upon the software engineering folk to better analyze programs and find these mistakes? Or write style guidelines that require braces for every if, and ban every goto? And then the employee's manager is required to sign off on every exception. These are some of the many proposals floated for addressing this, and often made by programmers touting the superiority of their proposal and how this would never happen to them.

Leave me to shrug. I'll use my goto, sparingly. And I'll continue to try to impress on students that they and all of us programmers will make mistakes so write elegant code to help find these bugs. Until then, if anyone has a good construction for the error code that doesn't require goto, let me know.

Friday, February 21, 2014

Being Nice on the Internet

I am sitting in Catherine Grevet's thesis proposal on supporting diverse opinions online (e.g., NBC News). The intent is not to achieve some undescribed utopia, but rather establish mechanisms that encourage civil discourse. The goal is not consensus, but rather engagement.

Looking at links and replies, links are more commonly made to similar opinions as the linker. Replies do not have this distribution, but the replies are often incivil and not constructive.

Under what conditions do different opinions coexist in social media? If these links are being maintained, then it is possible for the opinions to coexist and suggest possibility that conditions could be identified. Through twitter, politically active (based on sharing white house petitions) social users were contacted about how they manage this issue. Four techniques were identified:
tuning out, long fruitless conversations, weak ties were brittle (don't want to unfriend, but...), and changing perceptions of friends. These behaviors reinforce homophily.

Research questions:

How does discouraging incivility impact the discourse between friends of different opinions in social media?
How does encouraging civility impact the discourse between friends of different opinions in social media?

Using a politeness classifier, facebook plugins will be developed to adjust the design based on the classifications received. Perhaps hiding impolite comments, or even hiding the entire post, are techniques for addressing the above question about reducing / discouraging incivility. Toward the final question, two probes could be applied: feedback on impoliteness to encourage more civil posts and highlighting polite friends.

So keep an eye out for a new plugin!

Tuesday, February 4, 2014

Book Review: ARM Assembly Language: Fundamentals and Techniques

Besides reading an average of 5 research papers every week, I also read an average of one book each week. Occasionally those books relate to computers and usually then I'll write about them here. I had realized a couple months ago that I didn't really know anything about ARM processors, besides that they are low power. It seemed remiss to be studying computer architecture and not know one of the modern architectures. Thus I visited my school library and checked out a book.

This is the story of that book - ARM Assembly Language: Fundamentals and Techniques. An interesting book that covered the basic of what I wanted to learn, but the short coming was that it had an expected environment that was different from mine. ARM processors can be found in a greater diversity of devices than say, x86. Yet, I am still thinking about the ARM processor as a drop-in replacement. I look more to devices like Microsoft's Surface or a smartphone, and think about the presence of an OS, etc.

I learned particularly that the ARM instructions have bits to make them predicated. And I realized then that conditional branches are really just predicated instructions. If the predicate(s) are true, then take the branch. Just another perspective on instruction sets. Anyway, I look forward to getting a Raspberry Pi, so I can try out some of what I've learned and get a chance to also work through the assembly generated by compilers.