Monday, March 21, 2016

PhD Defense - Simple DRAM and Virtual Memory Abstractions to Enable Highly Efficient Memory Subsystems

Vivek Seshadri gave his PhD defense today, covering how different memory resources have different granularities of access and therefore need different management techniques.  These techniques come from understanding what the hardware could do, without necessarily identifying common features in existing applications that would require or benefit from them.

Page Overlays / Overlay-on-Write: provide the ability to assign physical addresses at sub-page granularity (call these units overlays).  This reduces the cost of sparse copy-on-writes: rather than copying the whole page, assign a fresh sub-page unit and copy only the affected data to that location.  On every access, check the overlay table in parallel to determine whether to use the normal translation or go to the overlay location.
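As a rough software model of the overlay-on-write idea, the sketch below keeps shared base pages plus a per-owner overlay table keyed by sub-page unit. The 64-byte unit size, the class name OverlayMemory, and the notion of an "owner" are assumptions made for illustration and are not details from the talk.

```python
PAGE_SIZE = 4096
LINE_SIZE = 64  # assumed sub-page (overlay) granularity


class OverlayMemory:
    """Toy model: shared base pages plus per-owner sub-page overlays."""

    def __init__(self):
        self.pages = {}     # page number -> bytearray (shared base page)
        self.overlays = {}  # (owner, page, line) -> bytearray (private copy)

    def map_page(self, page, data):
        self.pages[page] = bytearray(data)

    def _split(self, addr):
        page, offset = divmod(addr, PAGE_SIZE)
        line, byte = divmod(offset, LINE_SIZE)
        return page, line, byte

    def read(self, owner, addr):
        page, line, byte = self._split(addr)
        overlay = self.overlays.get((owner, page, line))
        if overlay is not None:
            return overlay[byte]  # overlay hit: use the remapped unit
        return self.pages[page][line * LINE_SIZE + byte]  # normal translation

    def write(self, owner, addr, value):
        page, line, byte = self._split(addr)
        key = (owner, page, line)
        if key not in self.overlays:
            # Copy only the touched sub-page unit, not the whole page.
            start = line * LINE_SIZE
            self.overlays[key] = bytearray(self.pages[page][start:start + LINE_SIZE])
        self.overlays[key][byte] = value


mem = OverlayMemory()
mem.map_page(0, bytes(PAGE_SIZE))
mem.write("child", 130, 7)
print(mem.read("child", 130), mem.read("parent", 130), len(mem.overlays))
```

A single write allocates one 64-byte overlay instead of duplicating the 4 KB page, which is the saving for sparse copy-on-write.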

Gather-Scatter DRAM: provide support for requesting only a subset of cache lines.  First, shuffle the data in a cache line so that the same subset of multiple cache lines will map to different chips in DRAM.  Second, introduce additional logic (just a few gates) in DRAM that will compute a modified address, where the default pattern (stride 1) is the normal, unmodified access.
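To see why shuffling helps, consider a minimal sketch with 8 chips and 8 words per cache line. The XOR mapping used here is an illustrative assumption and may not match the shuffle presented in the defense; it only demonstrates the property that the same word position in consecutive cache lines lands on different chips.

```python
CHIPS = 8  # assumed: 8 DRAM chips, 8 words per cache line


def chip_unshuffled(line, word):
    # Without shuffling, word w of every cache line lives on chip w.
    return word


def chip_shuffled(line, word):
    # Hypothetical shuffle: XOR the word index with the low bits of the line index.
    return word ^ (line % CHIPS)


def chips_touched(mapping, word, lines):
    """Chips accessed when gathering the same word from several cache lines."""
    return [mapping(line, word) for line in lines]


lines = range(CHIPS)
print(chips_touched(chip_unshuffled, 3, lines))  # every access hits the same chip
print(chips_touched(chip_shuffled, 3, lines))    # accesses spread over all chips
```

With the shuffled mapping, the eight gathered words sit on eight distinct chips, so they could in principle be read in one access, while each individual cache line still uses every chip exactly once.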

RowClone + BuddyDRAM: can DRAM speed up memcpy (and other bulk memory operations)?  First, by opening one row after another, the bitlines take on the first row's values and then write them into the second row.  More complex is opening three rows simultaneously, which results in a bit-wise operation across the three rows: final = C (A | B) | ~C (A & B), i.e. the bitwise majority of the three.  By controlling C, bulk bitwise operations are possible: C = 0 yields A & B, and C = 1 yields A | B.  Using this technique, the system can exceed what the memory bandwidth would otherwise allow for these operations, since the data never leaves DRAM.
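The expression final = C (A | B) | ~C (A & B) can be checked in software. The sketch below models each row as an integer bit vector; the function names and the 8-bit row width are assumptions for illustration only.

```python
def triple_row_activate(a, b, c, width=8):
    """Model of the result: final = C (A | B) | ~C (A & B), per bit."""
    mask = (1 << width) - 1
    return ((c & (a | b)) | (~c & (a & b))) & mask


def bulk_and(a, b, width=8):
    # Control row C set to all zeros selects A & B.
    return triple_row_activate(a, b, 0, width)


def bulk_or(a, b, width=8):
    # Control row C set to all ones selects A | B.
    return triple_row_activate(a, b, (1 << width) - 1, width)


a, b = 0b11001010, 0b10100110
print(format(bulk_and(a, b), "08b"))
print(format(bulk_or(a, b), "08b"))
```

The same expression equals the bitwise majority (A & B) | (B & C) | (A & C), which is why fixing C to all zeros or all ones selects AND or OR.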

Dirty-Block Index: the problem is that if the source data is dirty in the cache, it needs to be written back before the previous techniques can be used.  DBI provides a faster lookup mechanism to determine whether, and where, there are any dirty cache blocks.
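One way to picture such a lookup structure is a table with one bit vector of dirty blocks per DRAM row, so that "is anything in this row dirty?" is a single lookup. This organization, the class name, and the value of BLOCKS_PER_ROW are assumptions for illustration, not details taken from the defense.

```python
BLOCKS_PER_ROW = 128  # assumed number of cache blocks per DRAM row


class DirtyBlockIndex:
    """Toy index: DRAM row -> bit vector marking which blocks are dirty."""

    def __init__(self):
        self.rows = {}

    def mark_dirty(self, block):
        row, offset = divmod(block, BLOCKS_PER_ROW)
        self.rows[row] = self.rows.get(row, 0) | (1 << offset)

    def mark_clean(self, block):
        row, offset = divmod(block, BLOCKS_PER_ROW)
        bits = self.rows.get(row, 0) & ~(1 << offset)
        if bits:
            self.rows[row] = bits
        else:
            self.rows.pop(row, None)  # drop rows with no dirty blocks left

    def row_is_dirty(self, row):
        return row in self.rows

    def dirty_blocks(self, row):
        bits = self.rows.get(row, 0)
        return [row * BLOCKS_PER_ROW + i
                for i in range(BLOCKS_PER_ROW) if (bits >> i) & 1]


dbi = DirtyBlockIndex()
dbi.mark_dirty(130)
dbi.mark_dirty(200)
print(dbi.row_is_dirty(1), dbi.dirty_blocks(1), dbi.row_is_dirty(0))
```

Before an in-DRAM copy of a row, a controller modeled this way would consult row_is_dirty and write back only the blocks that dirty_blocks reports.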

These techniques are interesting, but as the candidate noted, they are in effect solutions in search of a problem.  And with DRAM being commodity hardware, it is difficult to envision these techniques being adopted without further work.
