When cooperating on a prior work of research, Brainy: effective selection of data structures, I learned about the cost of selecting particular data structures. I expect that every Computer Scientist knows the general cost of using standard data structures like trees and linked lists. Now, the extension that Brainy provided was to recognize that a "tree" could have different implementations and these implementations can have different costs for a given workload. I, and I expect others, learned about AVL, splay, red-black and other tree types.
All this is good until your type hides its implementation and you reason about it in abstract. A map is commonly misinterpreted by its implementation. A map is a key-value store, where a "key" has an associated "value". An address book can be understood as a map of name to address / phone number / etc.
Now, the map is assumed to be implemented as a hash table, to give O(1) look-up cost. However in the C++ standard library, map uses a red-black tree for O(log n). Before you go and request that the libraries be changed, understand that experimental measurements suggest that the red-black implementation wins out when n > 500, as the hash collisions dominate the idealized O(1) cost. This is the problem that Brainy attempts to solve: the programmer selects the functionality, e.g., map and Brainy selects the best implementation given your program and architecture.
So go ahead and use a map, but recognize that you may not have O(1) cost and definitely won't as n grows increasingly large.
No comments:
Post a Comment