Wednesday, December 1, 2010

Elegance in Structure

Recently, I came across a request for an elegant implementation of linked lists (again on stack overflow).  And now that I'm finally through conference deadlines, etc, let's consider this request in further detail.

Specifically, the issue is a singly linked, circular list.  I'd love to hear the benefit of making a singly linked list circular, but I'll assume that there is a reason.  Therefore, we'll explore three design questions.
  1. Layout of the structure
  2. Inserting elements
  3. Deleting elements
Structure layout revolves around a key issue that will decide the rest of the design.  Is the head element tracked in a particular way?  To specifically track the head element, the data structure would need two pieces: a container and individual elements.  This is the questioner's design.  Alternatively, one would treat the head implicitly as the first element, and solely use the "node" structure.  The advantage is that there is only one data structure and the necessary code for managing the special "head" is already in place.  However, with an implicit constraint, the data structure can break if it isn't obeyed.  This constraint is the container structure, but without the compiler's support / guarantees.

The second issue is preparing an elegant insertion of elements.  Two design questions are present: whether to order, and if the list is unordered then is it a queue or a stack.  Based on the poster's design, we have an unordered queue.  Unfortunately, this gives an unpleasant O(n) insertion cost, unless we can include a tail pointer.  Since elegance encompasses more than simplicity, elegant tail insertion should have a pointer whenever possible.  The potential inelegance of the implementations (for insertion and removal) is dealing with the three different insertion cases: zero elements, one elements, many elements.  Can we do it in fewer?

typedef List struct list;
typedef Node struct node;
bool Insert(List* L, int value)
{
    Node* N = (Node*) malloc(sizeof(Node));

    if (N == NULL) return false;

    N->value = value;

    if (L->head == NULL)  // case zero elements
    {
        L->head = L->tail = N->next = N;
    }
    else  // cases 1+ elements
    {
        L->tail->next = N;
        L->tail = N;
        N->next = L->head;  // keep circular
    }

    return true;
}

The final elegance piece rests on element removal. Again, we're trying to reduce the inelegance by combining the different cases into fewer tests and code.  The original implementation has three cases: remove only element, remove head, and remove element.  The distinguishing step about removing the head element is that the head pointer needs to be updated.  Additionally, when any element is removed, the previous element needs to be updated.  However, we already have the previous element for the head, the tail!  Thus, element removal has a common case of finding the element, and then a couple of quick checks based on that element.

void Remove(List* L, int value)
{
    Node* T, E, P;

    if (L == NULL) return;

    T = L->head;
    P = E = L->tail;

    if (T == NULL) return;

    do {
        if (T->value == value) break;
        P = T;
        T = T->next;
    } while (T != E);

    // Ideally, the break would skip into this block
    if (T->value == value)
    {
        P->next = T->next;   // Remove elem
        if (T == E) L->tail = P;  // Removed tail
        if (P == E) L->head = P->next;  // Removed head
        if (T == P) {L->head = L->tail = NULL;} // Removed only elem
        free(T);  // Cleanup
    }
}

Initially, I wanted to treat the list implicitly, but I find myself liking the inclusion of a tail pointer. Yet on small lists, the trade-offs may be different.  I wouldn't give this as the be all and end all of circular lists, but I would present this as my generic implementation.

No comments: