Wednesday, August 29, 2018

Thesis Defense: Responsive Parallel Computation

Threading systems are commonly classified as either competitive (such as pthreads) or cooperative (such as Cilk).  Cooperative systems assume that the parallel computation is divided evenly among the worker threads and scheduled purely for throughput, so there is no way to prioritize some tasks over others.

Against this backdrop, Stefan Muller presented a Responsive Parallel Computation approach in which the two models are unified.  Or, in his words, "We can extend existing cooperative language and cost models to account for competitive threading constructs, and to design and implement efficient and responsive scheduling algorithms."

Cooperative systems, such as Cilk, rely on task queues and work stealing to implement the parallelism, but when I/O is involved, the common implementations use blocking system calls.  In a cooperative system, such a call blocks an entire worker thread, and with it a processor, rather than just the task that issued the call.

For operating system scheduling, there are many threads across the myriad processes in the system that may need to run, and the OS typically uses a priority associated with each thread or process to make its scheduling decisions.  When programmers assign priorities (if they do at all), they are usually trying to express a relative ordering between tasks rather than meaningful absolute levels.  And because systems offer anywhere from a handful of priority levels to a hundred or more, the mapping of a program's priority classes onto numeric levels is unclear.

The thesis proposes PriML, an extension of ML with two new types:
t cmd[p] - a command running at priority p and returning a value of type t
t thread[p] - a handle to a thread at priority p returning a value of type t

priority and order declarations let the programmer define named priorities and specify the partial order between them.  spawn[priority] {expression} creates a thread at the given priority; sync blocks to retrieve the thread's result, while poll instead returns an option that can be pattern-matched, so that we can select among multiple threads.
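
As a minimal sketch of the declarations (the priority names here are hypothetical, and the exact surface syntax may differ slightly from the talk's):

priority bg
priority fg
order bg < fg   (* fg takes precedence over bg; the order is partial, not total *)

For example, the following PriML sequence spawns two threads to sort emails with different methods and selects whichever finishes first.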

t1 <- spawn[p] {do (qsort date emails)};      (* sort by date, two ways in parallel *)
t2 <- spawn[p] {do (mergesort date emails)};
do (let fun choose () =                       (* poll both threads until one finishes *)
      cmd[p]
      {
        v1 <- poll t1;
        v2 <- poll t2;
        do (case (v1, v2) of
              (SOME l, _) =>
                cmd[p] {cancel t2; ret l}     (* t1 finished first; cancel the loser *)
            | (_, SOME l) =>
                cmd[p] {cancel t1; ret l}
            | (NONE, NONE) => choose ())      (* neither done yet; try again *)
      }
    in
      choose ()
    end)

These features allow the language to enforce the priority constraints and rule out priority inversion: sync must always target a thread of equal or higher priority, while poll, which never blocks, may target a thread at any priority.
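
To make the constraint concrete, here is a hedged sketch reusing the hypothetical bg and fg priorities declared above (render_frame and reindex_emails are placeholder functions):

cmd[fg]
{
  t <- spawn[fg] {do (render_frame ())};    (* equal priority *)
  r <- spawn[bg] {do (reindex_emails ())};  (* lower priority *)
  v <- sync t;    (* well-typed: the target's priority is not lower than fg *)
  (* w <- sync r;    ill-typed: fg would block waiting on bg *)
  x <- poll r;    (* fine: poll never blocks, so any priority is allowed *)
  ret v
}

In effect, the type system turns priority inversion from a runtime hazard into a compile-time error.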

Returning to parallel program analysis, such programs are conventionally represented as DAGs; here, the DAG must be extended with priority annotations alongside the usual work / span calculations.  This analysis can then explore different schedules; for example, adding fairness across all thread priorities extends the bound on execution time.
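
For the flavor of such a bound (hedged; the precise theorem in the thesis may differ in detail): a greedy scheduler completes a DAG with work W and span S on P processors within W/P + S steps, and the responsive analogue bounds the response time of a thread in terms of only the work and span of the computation at equal or higher priority, so that low-priority background work does not show up in the bound for a high-priority interactive thread.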
