Today is the last day for the conference. I attended several more talks today and my student took 2nd in the student research competition. So it has been a good attendance and I have received some interesting and valuable feedback on my own work, as well as finding some possible collaborations for the future.
Optimistic Loop Optimization
The compiler wants to optimize loops in the program. However, C and LLVM IR have many complex characteristics that the Polyhedral model cannot represent, such as aliasing, wrapping, or out of bound accesses. Rather than just assuming that these characteristics are not present, instead, the code can be analyzed to determine which violating characteristics may be present. These assumptions are placed in the code, and can then be reduced to the set of preconditions for which the optimized loop can be executed. Should the assumptions fail, the code instead can branch to the original version. These optimizations can also be optimized (for example N < 127 implies N < 128). For SPEC200x, the assumptions fail about 2% of the time and impose 4% runtime overhead.
Software Prefetching for Indirect
Memory Accesses
What should we prefetch in software? A[x + N] is easy for hardware, A->next is hard for everyone, while A[B[x + N]] is easy for software and hard for hardware to predict. So given a loop (such as exists in NAS is) that has this indirect structure, then prefetches can be inserted that will speedup the loop. Yet, there are three key points for inserting these prefetch instructions.
- You have to prefetch both the A[B[]] and B[]. Otherwise, the prefetch will block on the B[].
- Having both prefetches requires that they are both offset from the actual access as well as each other. Too close and they are not parallel. Too far and the data is evicted before use.
- The first point raised that there is an actual load (not prefetch) of B[] and therefore needs to be bounds checked.
No comments:
Post a Comment