Elegant C: research

Showing posts with label research. Show all posts

Monday, October 27, 2014

Liberal Arts College Positions Mini-Seminar

In continuing my search and preparation for a faculty position, today I attended a mini-seminar by LADO faculty. (LADO - liberal arts colleges with diversity officers) Here are some points that were raised during the breakout and panel discussions:

Teaching:
- You will teach both upper-level courses, as well as "service" courses. Service is a good term to describe the low-level / introductory courses, as the faculty are rotating through this service to the department.
- Try to setup the teaching schedule so that 1 day is free solely from research.
- Continue to revise courses so they are fresh and current, but also avoid constantly creating all new courses.
- Valuable to set aside "non-office hours" times, during which the door can be shut.
- Faculty will sit in on courses and additionally interview the students, as part of composing an evaluation of your teaching.

Research:
- Recruiting undergraduates earlier for research to have time to train them, so that they will later be able to contribute.
- You can still collaborate and have broad impact through research with faculty at other more research-focused institutions.
- Grant proposals can also be keyed "RUI" (research at undergraduate institutions)
- Regular funding for sabbatical leaves, first often after the renewal of the 3-year contract. This leave is focused on research and may be held at R1 or other institutions.
- Startup package is present to cover the transition to grant-based funding.
- Research lab costs are significantly lower at these institutions, as funds are not required for grad students, post docs, etc.
- Schools are looking for faculty hires that add diversity to the research available.

Service:
- Service is a component, but is much smaller than teaching and scholarship. So be cautious about accepting invitations to committees, in terms of time commitment. The service time can provide valuable insight to the functioning of the institution, as well as possible collaboration with collegues in other departments.
- You will be chair of your department someday.

Other:
- Many liberal arts institutions are located in small towns.
- Take the time to customize the cover letter. Do you really want and care about this job?

Friday, February 21, 2014

Being Nice on the Internet

I am sitting in Catherine Grevet's thesis proposal on supporting diverse opinions online (e.g., NBC News). The intent is not to achieve some undescribed utopia, but rather establish mechanisms that encourage civil discourse. The goal is not consensus, but rather engagement.

Looking at links and replies, links are more commonly made to similar opinions as the linker. Replies do not have this distribution, but the replies are often incivil and not constructive.

Under what conditions do different opinions coexist in social media? If these links are being maintained, then it is possible for the opinions to coexist and suggest possibility that conditions could be identified. Through twitter, politically active (based on sharing white house petitions) social users were contacted about how they manage this issue. Four techniques were identified:
tuning out, long fruitless conversations, weak ties were brittle (don't want to unfriend, but...), and changing perceptions of friends. These behaviors reinforce homophily.

Research questions:

How does discouraging incivility impact the discourse between friends of different opinions in social media?
How does encouraging civility impact the discourse between friends of different opinions in social media?

Using a politeness classifier, facebook plugins will be developed to adjust the design based on the classifications received. Perhaps hiding impolite comments, or even hiding the entire post, are techniques for addressing the above question about reducing / discouraging incivility. Toward the final question, two probes could be applied: feedback on impoliteness to encourage more civil posts and highlighting polite friends.

So keep an eye out for a new plugin!

Monday, July 22, 2013

Review: Patterns for Cache Optimizations on Multi-processor Machines (ParaPLoP '10)

In order to optimize any code, know that the optimizations "depend on the exact architecture of the machine (processor, memory, etc), the exact version of the compiler, the exact version of the operating system and the particular configuration of the program that [you] are trying to optimize." At minimum, you should establish what your workload is and how its performance is measured, and then have some gauge of what its current performance is (including the distribution of measurements). All of which is covered in Patterns for cache optimizations on multi-processor machines before they are touch on any optimization. And I applaud them for doing so.

In this work, they explore three patterns of cache (mis-)use on modern processors. The first pattern is termed "loop interchange", named for its solution. In this pattern, the program does not access data with spatial locality. Instead of accessing every element in a cache line, the program has a different ordering and only touches a subset of the cache line, while later (after the line has been evicted) it accesses other subsets. In the example below, assume that N and M are both quite large (say 1+ million), so this code will likely have significant cache misses (at minimum L1 misses), while switching the "i" and "j" for loops (i.e. interchange) will considerably reduce the number of cache misses.

int X[N][M];
for (j = 0; j < M; j++)
for (i = 0; i < N; i ++)
f(X[i][j]); // Any set of operations applied to this element.

The next pattern is false sharing. Threads in a program intentionally and unintentionally share data. Data structures are written by programmers to logically group data; however, the grouping and structuring of the data is often made by the programmer and not for algorithmic need. The hardware is expecting locality from arrays and data structures. When multithreaded, the cache line is the unit by which the hardware tracks sharing of data. So if different threads write to different data in the same cache line, then hardware treats the writes as being made to the same thing, which precludes it from caching. The usual recommendation for solving this problem is to pad the data, so that the software notion (int) and hardware notion (cache line) are the same size.

int X[N];
void* thread_work(int tid)
{
for (int i = 0; i < N; i++)
    if (i % num_threads == tid)
    X[i] = do_work(X[i]);
}

This second example goes beyond the paper's scope for false sharing. Common data structures may also have different sharing patterns for each element. For example in this data structure, the following fields are written to: encoding, sum, weight_left, and weight_right. The rest are read-only. Currently the data structure uses two cache lines (as all fields are 8-bytes in size).   If the structure was rearranged so that the written fields were in one cache line and the read-only fields in the second line, then updates by any thread would only result in one cache miss rather than two. Padding may be required, but the key insight here is arranging data by sharing pattern, which is a generalization of the previous paragraph.

typedef struct _node {
    graph_value_t value, encoding;
    unsigned long long sum;
    struct _edge* fwd;
    struct _edge* back;

    // tree sorted by value
    struct _node* left;
    struct _node* right;

    // tree sorted by encoding
    struct _node* weight_left;
    struct _node* weight_right;
} node, *pnode;

The final pattern explored in the paper is data alignment. Ignoring the issue of misaligned accesses, let's look at misaligned allocations. Suppose we allocate an array of 48-byte data structures in a multithreaded program. Sometimes accessing an element is one cache miss, but sometimes it is two. The runtime system has packed the data structures together, with 4 fitting in 3 cache lines. In general, when you allocate data structures, they come with the same alignment as in the array, made to a 16-byte boundary, but this boundary is not guaranteed to be the start of a cache line. The primary solution is to use support calls that change the allocation alignment. This may waste space, but now the allocation comes using our expected number of cache lines. And by using the lines we expect, we can tailor the program to the architecture and observe the expected performance characteristics.

The patterns are three simple ones that architects and performance minded programmers have known for years. I am pleased to see them being reiterated, but the response may be like that from the developer after I changed his code per these patterns years ago, "Why can't the compiler just do that for me?!"

Monday, March 4, 2013

Maps and Red-Black Trees

When cooperating on a prior work of research, Brainy: effective selection of data structures, I learned about the cost of selecting particular data structures. I expect that every Computer Scientist knows the general cost of using standard data structures like trees and linked lists. Now, the extension that Brainy provided was to recognize that a "tree" could have different implementations and these implementations can have different costs for a given workload. I, and I expect others, learned about AVL, splay, red-black and other tree types.

All this is good until your type hides its implementation and you reason about it in abstract. A map is commonly misinterpreted by its implementation. A map is a key-value store, where a "key" has an associated "value". An address book can be understood as a map of name to address / phone number / etc.

Now, the map is assumed to be implemented as a hash table, to give O(1) look-up cost. However in the C++ standard library, map uses a red-black tree for O(log n). Before you go and request that the libraries be changed, understand that experimental measurements suggest that the red-black implementation wins out when n > 500, as the hash collisions dominate the idealized O(1) cost. This is the problem that Brainy attempts to solve: the programmer selects the functionality, e.g., map and Brainy selects the best implementation given your program and architecture.

So go ahead and use a map, but recognize that you may not have O(1) cost and definitely won't as n grows increasingly large.

Friday, November 2, 2012

Computer Science Education: Student Conceptions

I'm sitting in Michael Hewner's thesis defense on "Student Conceptions About the Field of Computer Science". And I think there are some interesting points raised that are worth reflecting on. Let's start with summary questions.

Who would be considered a Computer Scientist?

How does a student choose his or her particular focus in Computer Science?

Do misconceptions about Computer Science affect the student's education?

Students have three conceptions about what Computer Science is, which were first derived from interviews and reaffirmed through a 100 student survey. The Theory-view (8%) is that Computer Science is mostly concerned with a theoretical view of computers, where the mathematical basis and understanding is dominant (although there may then exist a related field of Software Engineering). The Programming-view (41%) is that all Computer Science is about programs, where one is either analyzing the basis of or the direct work on programming. The Broad-view (27%) is that Computer Science is a giant umbrella of disciplines where computers are involved. A final 23% of the survey responses were without clear category, which may be due to the limitation of the original interviews. All conceptions view algorithms and data structures as a vital component to Computer Science.

Students use enjoyment of classes as the dominant metric of whether they have an affinity for the area. Ironically, one of the two dominant courses in my undergraduate education (15-213) was offered at 9am, which would commonly be viewed as an unpleasant time. Since students use enjoyment as their metric, students are not particularly affected by their misconceptions about what the course(s) contains.

Clearly then, the enjoyability of a course can then affect the taking of subsequent courses. Which in future work, there may be an exploration of whether course enjoyment effects (like scheduled time) has an effect on enrollment in follow-on courses in subsequent semesters. Furthermore, many students trust the curriculum as providing an adequate preparation for practicing Computer Science, which is to say that they are prepared as long as they satisfy the requirements regardless of any attempt to have a focus in their course selection. Should the curriculum then have unpleasant courses to force students into specializing?

For myself, I am a holder of the Programming-view (perhaps based on being a paid programmer), as I view Computer Science to be centered on programs and the act of programming. The field is directed toward understanding programs and how to program well. Computer Science is informed in part through Mathematics by providing a basis for understanding of algorithms and data structures, of which programs are fundamentally composed. Many fields, like Bio-Informatics, are related and rely on Computer Science, but are not Computer Science.

Tuesday, September 27, 2011

Repost Heartbeat

I still do all the daily stuff. Read emails, commute to campus, research new ideas for conferences like POPL or PLDI, and imagine I'll someday have something that will squeak out to be posted. So when my usual routine leads me to another programming blog I read posting about POPL research etc, it seems I should copy them - Heartbeat. Blog is still alive.

Thursday, December 16, 2010

Ray Tracing Performance and Practice Part 1

As the MICRO conference came to a close, my thoughts were stimulated toward new projects and undertakings. One task in particular was developing a ray tracer as an additional workload for my research. The first cause for specifically a ray tracer is that I developed one in C many years ago that used now deprecated libraries. Secondly, prior to my last paper deadline, another professor suggested ray tracers as having many useful characteristics. So I've been porting my old code from C to C#. This series of posts will discuss some design decisions relating to the porting, additional optimizations that have been introduced, and other interesting headaches / pitfalls.

On the first order, porting the implementations was fairly straight forward. The original code had some level of object-oriented (OO) design with triangles and spheres both encapsulated within "ray objects" (see Object Oriented C Part1). But much of the implementation was a mangled mess. For example, both colors and positions were represented by float triplets. And so, some glue logic was required to convert from the ARGB color mode into the float space used by the tracer. But being object-oriented, I could setup type conversions and operator overloads that significantly reduced the inelegance of the original implementation.

Now with a reasonable working port of the original code, the question arose: how to make it perform well? The original used all sorts of tricks some useful like bounding spheres around the triangles and some not, like preallocating the intermediate space. As always, there are a variety of levels of abstraction to target for performance work. My original implementation only focused on micro optimizations like avoiding memory allocations, etc. (And performance profiles showed that the largest single contributor was sqrt, but I didn't have a "fix" for that.) However, especially within a higher-level language, many of these choices are denied and so my improvements have taken first to the algorithmic level.

Two (or more) posts will yet follow. The first on finding and handling a sufficiently interesting scene to render. And the second will be focused on reducing the computations required to render this scene.

Monday, November 1, 2010

To research the unknown

As part of my duties as a graduate student, I presented some of my research on Friday. Beyond the presentation itself, which went well, an intriguing question arose: "If something is so difficult / expensive to implement that no one has written the code, then how do you know what it is?" Indeed. This is an honest criticism, how am I going to explore performance improvements to particular problems when the problems are so computationally expensive that they aren't implemented in existing applications.

Fortunately for my research, the scope of computationally expensive is still well within the feasible realm. Furthermore, many of these expensive problems are already known, and are not so far removed from the current applications as to be independent. So in the end, perhaps it isn't a horrible road block, but rather just one more piece to *research.*

For bonus points, the discussion portion of my presentation ran about the same length as my presentation itself. And spawned further emails in the subsequent days. So I'm delighted by how thought provoking (or perhaps contentious) my research is proving to be.