A discussion of how to do Computer Science well, particularly writing code and architecting program solutions.
Wednesday, April 3, 2013
Architecture Abstraction Leaks
In this post, I wanted to draw your attention to a question posed on stackoverflow, Why is processing a sorted array faster than an unsorted array? In this question, we are ignoring the time to sort the array. The operation being applied to each array element is relatively simple and is conditional on the value of the element. By sorting the array, the same operations could be applied to contiguous elements. And as the same operations are being applied, the code in question was having a clear win from branch prediction.
But the branch predictor is abstracted hardware, so once again the programmer just learned about the architecture when he / she didn't intend to.
Wednesday, March 27, 2013
Transactional Memory and Intel's TSX
A programmer can delineate the transactional section with XBEGIN and XEND instructions. Within the transactional section, all reads and writes are added to a read- or a write-set accordingly. The granularity for tracking is a cache line. If another processor makes a read request to a line in the write-set or either request to a read-set, then the transaction aborts.
Transactions can be semi-nested. A transaction can only commit if the outer transaction is complete. Internally nested transactions do not commit on XEND. If any transaction in the nest aborts, then the entire transaction aborts. If |XBEGIN| equals |XEND|, then the entire transaction commits and becomes globally visible. Transactions can be explicitly aborted by the XABORT instruction, which enables the code to abort early when it can determine that the transaction will or should fail.
As I understand it, TSX is being built on top of the existing cache coherence mechanisms. Each cache line gains an additional bit to indicate if it is part of a transaction. Each memory operation is treated normally between the processor and the coherence hierarchy with several caveats. If a dirty, transactional block is evicted, then the transaction fails. If a dirty, transactional block is demand downgraded from modified to shared or invalid, then the transaction fails. In this case, a new message would indicate that the request to forward the data fails and the request should be satisfied by memory.
If the transaction commits, then the transactional bits are cleared on each cache line. And the lines operate normally according to existing cache coherence mechanisms.
Wrapping this up, TSX is an ISA extension that almost every program can take advantage of and therefore has an appeal toward conducting personal testing, just like building my own x64 machine back in 2005.
Monday, March 4, 2013
Maps and Red-Black Trees
All this is good until your type hides its implementation and you reason about it in abstract. A map is commonly misinterpreted by its implementation. A map is a key-value store, where a "key" has an associated "value". An address book can be understood as a map of name to address / phone number / etc.
Now, the map is assumed to be implemented as a hash table, to give O(1) look-up cost. However in the C++ standard library, map uses a red-black tree for O(log n). Before you go and request that the libraries be changed, understand that experimental measurements suggest that the red-black implementation wins out when n > 500, as the hash collisions dominate the idealized O(1) cost. This is the problem that Brainy attempts to solve: the programmer selects the functionality, e.g., map and Brainy selects the best implementation given your program and architecture.
So go ahead and use a map, but recognize that you may not have O(1) cost and definitely won't as n grows increasingly large.
Monday, January 28, 2013
Repost: When Beauty is Not Truth
First, beautiful code is not always correct.
Second, beautiful code, by virtue of its simplicity, is less likely to have bugs.
Third, beautiful code is more readable, which facilitates comprehension.
(Now back to preparing the lecture for today's class)
Tuesday, December 11, 2012
Repost: A New View of Exceptions
Friday, November 2, 2012
Computer Science Education: Student Conceptions
Who would be considered a Computer Scientist?
How does a student choose his or her particular focus in Computer Science?
Do misconceptions about Computer Science affect the student's education?
Students have three conceptions about what Computer Science is, which were first derived from interviews and reaffirmed through a 100 student survey. The Theory-view (8%) is that Computer Science is mostly concerned with a theoretical view of computers, where the mathematical basis and understanding is dominant (although there may then exist a related field of Software Engineering). The Programming-view (41%) is that all Computer Science is about programs, where one is either analyzing the basis of or the direct work on programming. The Broad-view (27%) is that Computer Science is a giant umbrella of disciplines where computers are involved. A final 23% of the survey responses were without clear category, which may be due to the limitation of the original interviews. All conceptions view algorithms and data structures as a vital component to Computer Science.
Students use enjoyment of classes as the dominant metric of whether they have an affinity for the area. Ironically, one of the two dominant courses in my undergraduate education (15-213) was offered at 9am, which would commonly be viewed as an unpleasant time. Since students use enjoyment as their metric, students are not particularly affected by their misconceptions about what the course(s) contains.
Clearly then, the enjoyability of a course can then affect the taking of subsequent courses. Which in future work, there may be an exploration of whether course enjoyment effects (like scheduled time) has an effect on enrollment in follow-on courses in subsequent semesters. Furthermore, many students trust the curriculum as providing an adequate preparation for practicing Computer Science, which is to say that they are prepared as long as they satisfy the requirements regardless of any attempt to have a focus in their course selection. Should the curriculum then have unpleasant courses to force students into specializing?
For myself, I am a holder of the Programming-view (perhaps based on being a paid programmer), as I view Computer Science to be centered on programs and the act of programming. The field is directed toward understanding programs and how to program well. Computer Science is informed in part through Mathematics by providing a basis for understanding of algorithms and data structures, of which programs are fundamentally composed. Many fields, like Bio-Informatics, are related and rely on Computer Science, but are not Computer Science.
Wednesday, September 26, 2012
Book Review: Coders at Work
Notable quotes follow:
"As they say, it's easier to optimize correct code than to correct optimized code." - Joshua Bloch
Several interviewees quote Tony Hoare who said "There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult."
"Types are essentially assertions about a program." - Dan Ingalls
"In those days at Bell Labs the amenities were great - you could call a telephone number and get a transcription. You know, leave a message and say you want it written up and it'll appear the next day in your inbox as sheets of paper. And [my coworker], after we talked about the file system on the blackboard for a little while, picked up the phone, called a number, and read the blackboard into the phone. It came back and it was about as close to a design document as we got except that it had homonyms that you wouldn't believe." - Ken Thompson, on the project that became Unix
"[Code is beautiful that has] a simple straightforward solution to a problem; that has some intrinsic structure and obviousness about it that isn't obvious from the problem itself." - Fran Allen
"When I say, 'You don't get credit because the program works. We're going to the next level. Working programs are a given,' they say, 'Oh.'" - Bernie Cosell