(contains Amazon affiliate link)
I recently found, The Practice of Programming (Addison-Wesley Professional Computing Series), sitting on the shelf at my local library. I am generally skeptical when it comes to programming books, and particularly those from different decades, but I trusted the name "Brian Kernighan" so I checked the book out.
And I am so glad that I did. From the first chapter that discussed style, I wanted to read more. And the only reason to ever stop reading was to pull out a computer and put these things into practice. I didn't even mind that it wasn't until chapter 7 that performance was discussed. Still, I will readily acknowledge that I disagree with some of statements in the book. Furthermore, there are some parts of the text that are clearly dated, like discussing current C / C++ standards.
I'd like to conclude with a brief code snippet from the work. This code is part of a serializer / deserializer. Such routines are always a pain to write and particularly if you have many different classes / structs that need them. Thus the authors suggest using vargs and writing a single routine that can handle this for you. Here is the unpack (i.e., deserialize) routine:
/* unpack: unpack packed items from buf, return length */
int unpack(uchar *buf, char *fmt, ...)
{
va_list args;
char *p;
uchar *bp, *pc;
ushort *ps;
ulong *pl;
bp = buf;
va_start(args, fmt);
for (p = fmt; *p != '\0'; p++) {
switch (*p) {
case 'c': /* char */
pc = va_arg(args, uchar*);
*pc = *bp++;
break;
case 's': /* short */
ps = va_arg(args, ushort*);
*ps = *bp++ << 8;
*ps |= *bp++;
break;
case 'l': /* long */
pl = va_arg(args, ulong*);
*pl = *bp++ << 24;
*pl |= *bp++ << 16;
*pl |= *bp++ << 8;
*pl |= *bp++;
default: /* illegal type character */
va_end(args);
return -1;
}
}
va_end(args);
return bp - buf;
}
So now we have a little language for describing the format of the data in the buffer. We invoke unpack with a string like "cscl" and pointers to store the char, short, char and long. Hah! That's it. Anytime we add new types, we just to call the pack / unpack.
Does it matter that the variables are only sequences like "pl" or "bp"? No. Variable names should be meaningful and consistent. "i" can be fine for a loop iterator.
We have given up some performance (*gasp*), but gained in the other parts that matter like readability and maintainability. I plan on using this in my current research (but only the unpack, as my serializers are already highly optimized). All in all, I approve of this book and may even someday require it as a textbook for students.
A discussion of how to do Computer Science well, particularly writing code and architecting program solutions.
Showing posts with label design. Show all posts
Showing posts with label design. Show all posts
Monday, July 14, 2014
Monday, March 3, 2014
Ongoing Goto Wars
Recently, Apple announced a security fix for their SSL implementation. I would have taken little note here, except WIRED's summary indicated the cause being the use of goto. Ah, goto, how useful you are. How abused you are.
In practice, I would also have used goto, if I was writing a series of conditionals where I only have a single error block. The code could be restructured to link all of the conditionals into a single clause, yet it also tries to retain the specific error code for the check that failed. I would really dislike to see the following:
Leave me to shrug. I'll use my goto, sparingly. And I'll continue to try to impress on students that they and all of us programmers will make mistakes so write elegant code to help find these bugs. Until then, if anyone has a good construction for the error code that doesn't require goto, let me know.
In practice, I would also have used goto, if I was writing a series of conditionals where I only have a single error block. The code could be restructured to link all of the conditionals into a single clause, yet it also tries to retain the specific error code for the check that failed. I would really dislike to see the following:
if ((e = foo) != 0 && (e = bar) != 0 && ...) goto fail;So what then? Do we call upon the software engineering folk to better analyze programs and find these mistakes? Or write style guidelines that require braces for every if, and ban every goto? And then the employee's manager is required to sign off on every exception. These are some of the many proposals floated for addressing this, and often made by programmers touting the superiority of their proposal and how this would never happen to them.
Leave me to shrug. I'll use my goto, sparingly. And I'll continue to try to impress on students that they and all of us programmers will make mistakes so write elegant code to help find these bugs. Until then, if anyone has a good construction for the error code that doesn't require goto, let me know.
Labels:
C,
C++,
design,
discussion,
exceptions,
goto,
link,
style
Sunday, December 19, 2010
Ray Tracing Performance and Practice Part 2
In this post, we shall explore creating a scene worth rendering. In my prior work, my ray tracers were creating simple assemblages of spheres and triangles, with some having textures and others mirrored surfaces, etc. This time, I wanted to have a scene that others might enjoy. So for my first step, I turned to the Stanford 3D Scanning Repository and selected the angel model, Lucy.
The first problem in rendering is that the model is about 116 million triangles, or over 500MB in size. Perhaps slightly more detail than is required for all but the largest of images. Some time was spent searching for a tool that would provide a simplified model, and I settled on MeshLab. However, the GUI version would crash trying to load the initial model, so I had to create a script of the required transformations and have the command-line application create the reduced model. This reduced down to a 250k model, and subsequent transformations in the GUI gave a 50k triangle model and oriented appropriately for the ray tracer.
The second problem in rendering is that the model is stored in the ply file format. I extended my scene parser to handle ply files without really understanding them. While I would like to have a generalized parser, I confined my work to the specific format that MeshLab was generating and did not worry about whether there were triangles or rectangles, normals present, colors specified, etc. Yet, even this single format took some work.
Initially, the parser took in a text version of the ply file; however, the time to parse 100k lines of text and convert to the appropriate representations was approaching the time to render the scene. Switching to a binary representation seemed a reasonable step; however, how does one take a stream of bytes and convert to a specific type? In C, you could do the following:
But in C#? I eventually settled on reading each value and converting via the "BitConverter" class.
Support.Assert(4 == pFile.BaseStream.Read(data, 0, 4),
"INVALID Ply - Fail to parse vertex");
newV.x = BitConverter.ToSingle(data, 0);
Great! The statue is now in the tracer. However, 50k triangles is too much for my naive ray tracer. So there was performance work to be done. At the time, I would render a 5k triangle version as a test model, as seen below (pending while in flight).
The first problem in rendering is that the model is about 116 million triangles, or over 500MB in size. Perhaps slightly more detail than is required for all but the largest of images. Some time was spent searching for a tool that would provide a simplified model, and I settled on MeshLab. However, the GUI version would crash trying to load the initial model, so I had to create a script of the required transformations and have the command-line application create the reduced model. This reduced down to a 250k model, and subsequent transformations in the GUI gave a 50k triangle model and oriented appropriately for the ray tracer.
The second problem in rendering is that the model is stored in the ply file format. I extended my scene parser to handle ply files without really understanding them. While I would like to have a generalized parser, I confined my work to the specific format that MeshLab was generating and did not worry about whether there were triangles or rectangles, normals present, colors specified, etc. Yet, even this single format took some work.
Initially, the parser took in a text version of the ply file; however, the time to parse 100k lines of text and convert to the appropriate representations was approaching the time to render the scene. Switching to a binary representation seemed a reasonable step; however, how does one take a stream of bytes and convert to a specific type? In C, you could do the following:
Vec3f v; // Struct of three floats
fread(&v, sizeof(float), 3, sceneFile);
But in C#? I eventually settled on reading each value and converting via the "BitConverter" class.
Support.Assert(4 == pFile.BaseStream.Read(data, 0, 4),
"INVALID Ply - Fail to parse vertex");
newV.x = BitConverter.ToSingle(data, 0);
Great! The statue is now in the tracer. However, 50k triangles is too much for my naive ray tracer. So there was performance work to be done. At the time, I would render a 5k triangle version as a test model, as seen below (pending while in flight).
Thursday, December 16, 2010
Ray Tracing Performance and Practice Part 1
As the MICRO conference came to a close, my thoughts were stimulated toward new projects and undertakings. One task in particular was developing a ray tracer as an additional workload for my research. The first cause for specifically a ray tracer is that I developed one in C many years ago that used now deprecated libraries. Secondly, prior to my last paper deadline, another professor suggested ray tracers as having many useful characteristics. So I've been porting my old code from C to C#. This series of posts will discuss some design decisions relating to the porting, additional optimizations that have been introduced, and other interesting headaches / pitfalls.
On the first order, porting the implementations was fairly straight forward. The original code had some level of object-oriented (OO) design with triangles and spheres both encapsulated within "ray objects" (see Object Oriented C Part1). But much of the implementation was a mangled mess. For example, both colors and positions were represented by float triplets. And so, some glue logic was required to convert from the ARGB color mode into the float space used by the tracer. But being object-oriented, I could setup type conversions and operator overloads that significantly reduced the inelegance of the original implementation.
Now with a reasonable working port of the original code, the question arose: how to make it perform well? The original used all sorts of tricks some useful like bounding spheres around the triangles and some not, like preallocating the intermediate space. As always, there are a variety of levels of abstraction to target for performance work. My original implementation only focused on micro optimizations like avoiding memory allocations, etc. (And performance profiles showed that the largest single contributor was sqrt, but I didn't have a "fix" for that.) However, especially within a higher-level language, many of these choices are denied and so my improvements have taken first to the algorithmic level.
Two (or more) posts will yet follow. The first on finding and handling a sufficiently interesting scene to render. And the second will be focused on reducing the computations required to render this scene.
On the first order, porting the implementations was fairly straight forward. The original code had some level of object-oriented (OO) design with triangles and spheres both encapsulated within "ray objects" (see Object Oriented C Part1). But much of the implementation was a mangled mess. For example, both colors and positions were represented by float triplets. And so, some glue logic was required to convert from the ARGB color mode into the float space used by the tracer. But being object-oriented, I could setup type conversions and operator overloads that significantly reduced the inelegance of the original implementation.
Now with a reasonable working port of the original code, the question arose: how to make it perform well? The original used all sorts of tricks some useful like bounding spheres around the triangles and some not, like preallocating the intermediate space. As always, there are a variety of levels of abstraction to target for performance work. My original implementation only focused on micro optimizations like avoiding memory allocations, etc. (And performance profiles showed that the largest single contributor was sqrt, but I didn't have a "fix" for that.) However, especially within a higher-level language, many of these choices are denied and so my improvements have taken first to the algorithmic level.
Two (or more) posts will yet follow. The first on finding and handling a sufficiently interesting scene to render. And the second will be focused on reducing the computations required to render this scene.
Wednesday, December 1, 2010
Elegance in Structure
Recently, I came across a request for an elegant implementation of linked lists (again on stack overflow). And now that I'm finally through conference deadlines, etc, let's consider this request in further detail.
Specifically, the issue is a singly linked, circular list. I'd love to hear the benefit of making a singly linked list circular, but I'll assume that there is a reason. Therefore, we'll explore three design questions.
The second issue is preparing an elegant insertion of elements. Two design questions are present: whether to order, and if the list is unordered then is it a queue or a stack. Based on the poster's design, we have an unordered queue. Unfortunately, this gives an unpleasant O(n) insertion cost, unless we can include a tail pointer. Since elegance encompasses more than simplicity, elegant tail insertion should have a pointer whenever possible. The potential inelegance of the implementations (for insertion and removal) is dealing with the three different insertion cases: zero elements, one elements, many elements. Can we do it in fewer?
The final elegance piece rests on element removal. Again, we're trying to reduce the inelegance by combining the different cases into fewer tests and code. The original implementation has three cases: remove only element, remove head, and remove element. The distinguishing step about removing the head element is that the head pointer needs to be updated. Additionally, when any element is removed, the previous element needs to be updated. However, we already have the previous element for the head, the tail! Thus, element removal has a common case of finding the element, and then a couple of quick checks based on that element.
Initially, I wanted to treat the list implicitly, but I find myself liking the inclusion of a tail pointer. Yet on small lists, the trade-offs may be different. I wouldn't give this as the be all and end all of circular lists, but I would present this as my generic implementation.
Specifically, the issue is a singly linked, circular list. I'd love to hear the benefit of making a singly linked list circular, but I'll assume that there is a reason. Therefore, we'll explore three design questions.
- Layout of the structure
- Inserting elements
- Deleting elements
The second issue is preparing an elegant insertion of elements. Two design questions are present: whether to order, and if the list is unordered then is it a queue or a stack. Based on the poster's design, we have an unordered queue. Unfortunately, this gives an unpleasant O(n) insertion cost, unless we can include a tail pointer. Since elegance encompasses more than simplicity, elegant tail insertion should have a pointer whenever possible. The potential inelegance of the implementations (for insertion and removal) is dealing with the three different insertion cases: zero elements, one elements, many elements. Can we do it in fewer?
typedef List struct list;
typedef Node struct node;
bool Insert(List* L, int value)
{
Node* N = (Node*) malloc(sizeof(Node));
if (N == NULL) return false;
N->value = value;
if (L->head == NULL) // case zero elements
{
L->head = L->tail = N->next = N;
}
else // cases 1+ elements
{
L->tail->next = N;
L->tail = N;
N->next = L->head; // keep circular
}
return true;
}
The final elegance piece rests on element removal. Again, we're trying to reduce the inelegance by combining the different cases into fewer tests and code. The original implementation has three cases: remove only element, remove head, and remove element. The distinguishing step about removing the head element is that the head pointer needs to be updated. Additionally, when any element is removed, the previous element needs to be updated. However, we already have the previous element for the head, the tail! Thus, element removal has a common case of finding the element, and then a couple of quick checks based on that element.
void Remove(List* L, int value)
{
Node* T, E, P;
if (L == NULL) return;
T = L->head;
P = E = L->tail;
if (T == NULL) return;
do {
if (T->value == value) break;
P = T;
T = T->next;
} while (T != E);
// Ideally, the break would skip into this block
if (T->value == value)
{
P->next = T->next; // Remove elem
if (T == E) L->tail = P; // Removed tail
if (P == E) L->head = P->next; // Removed head
if (T == P) {L->head = L->tail = NULL;} // Removed only elem
free(T); // Cleanup
}
}Initially, I wanted to treat the list implicitly, but I find myself liking the inclusion of a tail pointer. Yet on small lists, the trade-offs may be different. I wouldn't give this as the be all and end all of circular lists, but I would present this as my generic implementation.
Thursday, September 16, 2010
Object-Oriented C (part 3)
In this exciting conclusion to the object-oriented C programming series, I will discuss a couple of drawbacks to my prior solution, as well as compare my implementation to those provided by OO compilers.
Drawback #1: Branch Prediction
It is important to remember that a branch is any instruction that changes the instruction pointer, which includes function calls. Most functional calls are straightforward; however, some branches use a register (i.e., runtime value) for their destination, most commonly with function pointers and switch statements. Ray tracers may already use oct-trees to reduce the set of intersection objects. Further sorting a set of objects into their different types would present an easier to predict sequence of indirect function calls.
Drawback #2: Inheritance is Explicit
All of the OO inheritance is up to the programmer. There is no assistance from the compiler to ensure that classes inherit correctly, nor for "constructors" or protection of fields. This forces the programmer(s) to be quite disciplined in their usage. Under particular circumstances, I'd not worry about this, but in general I would want the compiler to keep an eye on the usage of this model.
To reprise, for someone who has limited experience with C++ and its notation (like I did at the time), using the limited OO capabilities provided by this solution proved ideal.
All of this work hinged on an important "law of programming": If a compiler / run-time can do something implicitly, the programmer can do so explicitly. (See More Rules). Because a compiler for C++ is generating machine code to support OO, there must be a way to write code to give an equivalent compilation in C. So then what does the C++ compiler generate? (As someone who has not written a compiler and is not disassembling C++ programs, the following will remain speculation, though clarifying comments are appreciated).
I can conceive of three ways for a compiler to store an object's functional information:
1) Type field. The type of an instance is always clearly known, but types must be globally unique. Any code wishing to perform operations needs multiple levels of indirection to get from a "type" to the appropriate function. Bleh.
2) Direct inlined fields. This is effectively the solution in C.
3) Indirect fields. Keep a single table of function pointers for this type in memory. While this saves the memory for a couple of function pointers, there is an additional layer of indirection that must be traversed on every function call.
My remaining question is: "how does a compiler / run-time address 'diamond' inheritance"? That is, classes that inherit from multiple distinct classes.
Drawback #1: Branch Prediction
It is important to remember that a branch is any instruction that changes the instruction pointer, which includes function calls. Most functional calls are straightforward; however, some branches use a register (i.e., runtime value) for their destination, most commonly with function pointers and switch statements. Ray tracers may already use oct-trees to reduce the set of intersection objects. Further sorting a set of objects into their different types would present an easier to predict sequence of indirect function calls.
Drawback #2: Inheritance is Explicit
All of the OO inheritance is up to the programmer. There is no assistance from the compiler to ensure that classes inherit correctly, nor for "constructors" or protection of fields. This forces the programmer(s) to be quite disciplined in their usage. Under particular circumstances, I'd not worry about this, but in general I would want the compiler to keep an eye on the usage of this model.
To reprise, for someone who has limited experience with C++ and its notation (like I did at the time), using the limited OO capabilities provided by this solution proved ideal.
All of this work hinged on an important "law of programming": If a compiler / run-time can do something implicitly, the programmer can do so explicitly. (See More Rules). Because a compiler for C++ is generating machine code to support OO, there must be a way to write code to give an equivalent compilation in C. So then what does the C++ compiler generate? (As someone who has not written a compiler and is not disassembling C++ programs, the following will remain speculation, though clarifying comments are appreciated).
I can conceive of three ways for a compiler to store an object's functional information:
1) Type field. The type of an instance is always clearly known, but types must be globally unique. Any code wishing to perform operations needs multiple levels of indirection to get from a "type" to the appropriate function. Bleh.
2) Direct inlined fields. This is effectively the solution in C.
3) Indirect fields. Keep a single table of function pointers for this type in memory. While this saves the memory for a couple of function pointers, there is an additional layer of indirection that must be traversed on every function call.
My remaining question is: "how does a compiler / run-time address 'diamond' inheritance"? That is, classes that inherit from multiple distinct classes.
Tuesday, September 7, 2010
Object-Oriented C (part 2)
In part 1, I outlined the problem with writing a ray tracer in C, particularly how to opaque the presence of different objects. Since my roommate was a TA, he and I discussed some of the crude ways he feared that he would have to grade. For example, the intersect calls might be based on switch ... case or an if ... else chain, as follows:
While it clearly handles the different object types, it sets up regular patterns of duplicated code. Once for intersection, then for finding a color, etc. Each pattern also has to be modified if a new scene object is introduced. But I wanted something better, particularly I wanted to show that carrying a type field is unnecessary for my design. I concluded that I could have my scene objects carry their function pointers. (Note, the following code is still pseudo in nature and is provided for illustration and not compilation purposes).
Then I declare my sphere scene object (and similarly for other scene objects), as follows:
Now, my scene object types have inherited from the base object. Therefore, the intersection loop from part one now has a single line intersection call:
dist = object->Intersect(object, ray);
This approach still has drawbacks, but it achieves a simple inheritance between structs in C, thereby providing a rudimentary OO capability to a language known for not having one. In part 3, I look forward to exploring some of the drawbacks, as well as the elusive discussion of how OO intrinsically works in C++ or other languages.
switch (object->type)
{
case TRIANGLE: ...
case SPHERE: ...
...
}While it clearly handles the different object types, it sets up regular patterns of duplicated code. Once for intersection, then for finding a color, etc. Each pattern also has to be modified if a new scene object is introduced. But I wanted something better, particularly I wanted to show that carrying a type field is unnecessary for my design. I concluded that I could have my scene objects carry their function pointers. (Note, the following code is still pseudo in nature and is provided for illustration and not compilation purposes).
typedef struct _SCENE_OBJECT {
float (*Intersect) (pSCENE_OBJECT, Vec3);
void (*Color) (pSCENE_OBJECT, Vec3 ...);
pSCENE_OBJECT next;
} SCENE_OBJECT, *pSCENE_OBJECT;
Then I declare my sphere scene object (and similarly for other scene objects), as follows:
typedef struct _SCENE_OBJECT_SPHERE {
SCENE_OBJECT generic;
....
} SCENE_OBJECT_SPHERE, *pSCENE_OBJECT_SPHERE;
Now, my scene object types have inherited from the base object. Therefore, the intersection loop from part one now has a single line intersection call:
dist = object->Intersect(object, ray);
This approach still has drawbacks, but it achieves a simple inheritance between structs in C, thereby providing a rudimentary OO capability to a language known for not having one. In part 3, I look forward to exploring some of the drawbacks, as well as the elusive discussion of how OO intrinsically works in C++ or other languages.
Monday, September 6, 2010
Object-Oriented C (part 1)
For most programmers, the very idea of writing object-oriented (OO) C code seems wrong on several levels. First, there are several, quite usable, programming languages that already support OO designs, so why mess with adding something to C that is as core to a language as OO? Second, how would someone do this anyway?
Disclaimer: I am not recommending that OO programming be generally accomplished in C, but rather am writing to argue how some OO constructs can be integrated into a program to improve its design / readability (i.e., elegance).
Many years ago when I was an undergraduate in Computer Science, one assigned project was to write a ray tracer. The provided framework was in C++, but having not studied the details of inheritance, overloading, etc, I elected to rewrite the framework in C. Since the framework was very simple, the few classes were merely renamed to struct and member functions provided a prefix. Yet, this is not the interesting part, merely the starting point.
In brief, ray tracers cast (i.e., test for intersection) a ray (i.e., vector) against the scene (i.e., a list of objects). The object that is first intersected provides the color for that part of the image. Different objects will have different intersection tests / other properties. In pseudo-C, the code for intersection might read:
The fundamental design problem is therefore how to properly set up the objects so that the ray tracer's operations upon its scene objects are still elegant. How to accomplish this will be discussed in part 2, along with how OO intrinsically works.
Disclaimer: I am not recommending that OO programming be generally accomplished in C, but rather am writing to argue how some OO constructs can be integrated into a program to improve its design / readability (i.e., elegance).
Many years ago when I was an undergraduate in Computer Science, one assigned project was to write a ray tracer. The provided framework was in C++, but having not studied the details of inheritance, overloading, etc, I elected to rewrite the framework in C. Since the framework was very simple, the few classes were merely renamed to struct and member functions provided a prefix. Yet, this is not the interesting part, merely the starting point.
In brief, ray tracers cast (i.e., test for intersection) a ray (i.e., vector) against the scene (i.e., a list of objects). The object that is first intersected provides the color for that part of the image. Different objects will have different intersection tests / other properties. In pseudo-C, the code for intersection might read:
object = objectList->head;
minDist = INF; minObject = NULL;
while (object != NULL) {
dist = Intersect(object, ray);
if (dist < minDist) {... }
object = object->next;
}The fundamental design problem is therefore how to properly set up the objects so that the ray tracer's operations upon its scene objects are still elegant. How to accomplish this will be discussed in part 2, along with how OO intrinsically works.
Subscribe to:
Posts (Atom)