Thursday, December 16, 2010

Ray Tracing Performance and Practice Part 1

As the MICRO conference came to a close, my thoughts were stimulated toward new projects and undertakings.  One task in particular was developing a ray tracer as an additional workload for my research.  The first cause for specifically a ray tracer is that I developed one in C many years ago that used now deprecated libraries.  Secondly, prior to my last paper deadline, another professor suggested ray tracers as having many useful characteristics.  So I've been porting my old code from C to C#.  This series of posts will discuss some design decisions relating to the porting, additional optimizations that have been introduced, and other interesting headaches / pitfalls.

On the first order, porting the implementations was fairly straight forward.  The original code had some level of object-oriented (OO) design with triangles and spheres both encapsulated within "ray objects" (see Object Oriented C Part1).  But much of the implementation was a mangled mess.  For example, both colors and positions were represented by float triplets.  And so, some glue logic was required to convert from the ARGB color mode into the float space used by the tracer.  But being object-oriented, I could setup type conversions and operator overloads that significantly reduced the inelegance of the original implementation.

Now with a reasonable working port of the original code, the question arose: how to make it perform well?  The original used all sorts of tricks some useful like bounding spheres around the triangles and some not, like preallocating the intermediate space.  As always, there are a variety of levels of abstraction to target for performance work.  My original implementation only focused on micro optimizations like avoiding memory allocations, etc.  (And performance profiles showed that the largest single contributor was sqrt, but I didn't have a "fix" for that.)  However, especially within a higher-level language, many of these choices are denied and so my improvements have taken first to the algorithmic level.

Two (or more) posts will yet follow.  The first on finding and handling a sufficiently interesting scene to render.  And the second will be focused on reducing the computations required to render this scene.

No comments: