 |
|

 |
 |
 |
 |
 |
|
 |
 |
The other priority queue implementations were meant to avoid the O(N) task of finding a priority within the implementation used by both Riak and RabbitMQ. The previous results showed that "priority_queue" is best for using 1 priority, "pqueue" is best for using 2 to 41 priorities, and "pqueue3" is best for using 64 or more priorities. Both the "pqueue" data structure and the "pqueue3" data structure benefited from O(1) lookup time within an Erlang tuple. However, the "pqueue" data structure has a static limitation of 41 priorities, whereas the "pqueue3" data structure supports an arbitrary number of priorities (at creation time). To merge the best of both data structures, I created the "pqueue4" data structure, which has a static limitation of 257 priorities but is implemented in a way that is similar to "pqueue". The "pqueue4" data structure supports the priorities -128 (high) to +128 (low) and compares favorably with both "pqueue" and "pqueue3". The "pqueue4" data structure is faster than the "pqueue3" data structure, but is slightly slower than the "pqueue" data structure (due to its size). The "pqueue4" provides the best performance if 257 priorities need to be supported, and the latency will never grow based on the size of the priority queue (like it does with the "priority_queue" and "pqueue2" data structures). The benchmark was ran (with R14B04, without HiPE) on an AMD Phenom 9950 Quad-Core (64 bit) running Linux 2.6.32-23-generic (Ubuntu).
TEST pqueue_priority0
N == 1000000 (10 runs)
pqueue get: 449060.6 µs ( 1.7), set: 363759.6 µs ( 2.3)
pqueue2 get: 354847.0 µs ( 1.4), set: 413158.5 µs ( 2.6)
pqueue3 get: 867085.4 µs ( 3.4), set: 815489.8 µs ( 5.2)
pqueue4 get: 540231.5 µs ( 2.1), set: 466625.8 µs ( 3.0)
priority_queue get: 257257.6 µs ( 1.0), set: 156641.0 µs ( 1.0)
TEST pqueue_priorities2
N == 1000000 (10 runs)
2x pqueue get: 457418.0 µs ( 1.3), set: 456008.8 µs ( 1.0)
2x pqueue2 get: 357993.1 µs ( 1.0), set: 534914.7 µs ( 1.2)
2x pqueue3 get: 879511.6 µs ( 2.5), set: 768901.8 µs ( 1.7)
2x pqueue4 get: 590378.9 µs ( 1.6), set: 530402.6 µs ( 1.2)
2x priority_queue get: 428915.3 µs ( 1.2), set: 519009.0 µs ( 1.1)
TEST pqueue_priorities41
N == 1000000 (10 runs)
41 pqueue get: 482050.6 µs ( 1.5), set: 485457.7 µs ( 1.0)
41 pqueue2 get: 329309.2 µs ( 1.0), set: 2489879.6 µs ( 5.1)
41 pqueue3 get: 928454.4 µs ( 2.8), set: 834174.2 µs ( 1.7)
41 pqueue4 get: 682014.1 µs ( 2.1), set: 646907.9 µs ( 1.3)
41 priority_queue get: 398687.1 µs ( 1.2), set: 1600208.7 µs ( 3.3)
TEST pqueue_priorities64
N == 1000000 (10 runs)
64 pqueue2 get: 326512.2 µs ( 1.0), set: 3582432.8 µs ( 5.5)
64 pqueue3 get: 875909.0 µs ( 2.7), set: 827656.1 µs ( 1.3)
64 pqueue4 get: 668160.2 µs ( 2.0), set: 647776.4 µs ( 1.0)
64 priority_queue get: 395522.9 µs ( 1.2), set: 2238489.8 µs ( 3.5)
|
 |
 |
 |
 |
|
 |
 |

 |
 |
 |
 |
 |
|
 |
 |
After looking at various priority queue implementation possibilities and creating 3 separate implementations, it became necessary to test the data structures for correctness. Jesper Louis Andersen added PropEr integration, which provided randomized test cases to determine correctness, based on a priority queue model. Then, it became clear that the efficiency testing needed to focus on separate cases for different priority usage patterns. The main cases are usage of: a single priority, 2 priorities, 41 priorities, and 64 priorities, where all the priorities are evenly distributed with random, concurrent usage. The results indicate that "priority_queue" is best for using 1 priority, "pqueue" is best for using 2 to 41 priorities, and "pqueue3" is best for using 64 or more priorities. The benchmark was ran (with R14B04, without HiPE) on an AMD Phenom 9950 Quad-Core (64 bit) running Linux 2.6.32-23-generic (Ubuntu).
TEST pqueue_priority0
N == 1000000 (10 runs)
pqueue get: 444087.9 µs ( 1.8), set: 378655.3 µs ( 2.3)
pqueue2 get: 346286.3 µs ( 1.4), set: 407807.4 µs ( 2.4)
pqueue3 get: 877157.6 µs ( 3.6), set: 810117.1 µs ( 4.8)
priority_queue get: 245788.8 µs ( 1.0), set: 168242.6 µs ( 1.0)
TEST pqueue_priorities2
N == 1000000 (10 runs)
2x pqueue get: 437337.3 µs ( 1.2), set: 430425.9 µs ( 1.0)
2x pqueue2 get: 363276.3 µs ( 1.0), set: 514642.0 µs ( 1.2)
2x pqueue3 get: 871478.7 µs ( 2.4), set: 759261.5 µs ( 1.8)
2x priority_queue get: 419643.7 µs ( 1.2), set: 501706.6 µs ( 1.2)
TEST pqueue_priorities41
N == 1000000 (10 runs)
41 pqueue get: 538690.6 µs ( 1.6), set: 509626.7 µs ( 1.0)
41 pqueue2 get: 336863.6 µs ( 1.0), set: 2518901.4 µs ( 4.9)
41 pqueue3 get: 906567.4 µs ( 2.7), set: 816174.1 µs ( 1.6)
41 priority_queue get: 400770.3 µs ( 1.2), set: 1676239.8 µs ( 3.3)
TEST pqueue_priorities64
N == 1000000 (10 runs)
64 pqueue2 get: 337312.1 µs ( 1.0), set: 3721587.8 µs ( 4.3)
64 pqueue3 get: 923038.7 µs ( 2.7), set: 863387.5 µs ( 1.0)
64 priority_queue get: 399396.4 µs ( 1.2), set: 2178070.5 µs ( 2.5)
|
 |
 |
 |
 |
|
 |
 |

 |
 |
 |
 |
 |
|
 |
 |
Functional languages don't have mutable data (in the traditional sense) so certain data structures become more difficult to implement. A priority queue is normally implemented with a heap, however, after looking at heap implementations within Chris Okasaki's book ( "Purely Functional Data Structures"), it didn't seem correct to use an Erlang heap data structure to implement a priority queue. The previous Erlang priority queue data structure is used within both Riak and RabbitMQ (see priority_queue.erl). However, the "in" (i.e., "append") operation seemed quite slow because of its usage of a list of tuples to store the separate queues for each priority. So, I created a priority queue implementation called "pqueue" that relies on tuple storage for individual priorities. However, because I am using tuple storage, I needed to use a static range of priorities. I chose the common UNIX operating system priority range for nice (-20 (high) to 20 (low)). I was able to obtain fast "in" operations at the slight expense of the "out" operations, as shown below. The benchmark was ran (with R14B02, without HiPE) on an AMD Phenom 9950 Quad-Core (64 bit) running Linux 2.6.32-23-generic (Ubuntu).
TEST run_priority_queue
N == 1000000 (10 runs)
pqueue get: 475747.2 µs ( 1.3), set: 504762.3 µs ( 1.0)
priority_queue get: 372439.6 µs ( 1.0), set: 1466042.1 µs ( 2.9)
|
 |
 |
 |
 |
|
 |
 |

 |
 |
 |
 |
 |
|
 |
 |
I have updated results for the erlbench Erlang data structures tests for string and integer associative data storage. These results include both the trie data structure and the statically-sized hash table data structure (hashtl) I have created for more efficient string associative data access. My main justification for creating efficient Erlang data structures is to make Erlang source code more scalable by reducing its dependency on global state ( ETS). The process dictionary is globally accessible with trickery (erlang:process_info/1) and is not used in testable source code, so the process dictionary is considered to be as bad as global state. The reason the process dictionary is bad is because its scope is undefined, since it is attached to the lifetime of the Erlang process (not to mention that it complicates the fault-tolerance of any stored data). The process dictionary is a single dictionary shared with all source code the Erlang process executes, so external source code can make its contents undefined also! ETS is a more typical way to store global state as key/value pairs. With concurrency (operations that are not atomic) global state depends on locking to maintain consistency and it is the data locking that limits scalability. Erlang's main strength is its avoidance of global data locks, so it is only natural to pursue data structures that avoid global data.
(from string_key test below)
N == 64000 (10 runs)
aadict get: 233343.8 µs ( 5.4), set: 440130.2 µs ( 2.6)
dict get: 137424.3 µs ( 3.2), set: 871894.4 µs ( 5.2)
ets (ordered_set) get: 177797.4 µs ( 4.1), set: 211416.5 µs ( 1.3)
ets (set) get: 114235.0 µs ( 2.6), set: 169276.2 µs ( 1.0)
ets x10 read (ordere get: 169068.3 µs ( 3.9)
ets x10 read (set) get: 134240.8 µs ( 3.1)
gb_trees get: 255160.6 µs ( 5.9), set: 488567.1 µs ( 2.9)
hashtl get: 129433.1 µs ( 3.0), set: 210779.3 µs ( 1.3)
process dictionary get: 77023.9 µs ( 1.8), set: 166122.5 µs ( 1.0)
rbdict get: 230949.9 µs ( 5.3), set: 372831.9 µs ( 2.2)
trie get: 43435.1 µs ( 1.0), set: 379677.6 µs ( 2.3)
Just looking at string lookup results for a relatively large number of strings (64000) to hold in RAM shows that the trie string "get" is the fastest and the hashtl "set" is tied for second fastest (with both ETS and the process dictionary being the fastest alternative). You could also say that for accessing 64000 common strings, the trie and the hashtl source code are currently the fastest local string associative data storage in Erlang, if you accept that the scope of the process dictionary is non-local. The ETS test shows the extra latency that occurs with a string "get" operation on both set and ordered_set with read_concurrency set to true. The latency changes by a factor of 0.951 for the ordered_set and a factor of 1.175 for the set when the number of processes doing concurrent read operations increases from 1 to 10. So the scalability benefit of using the trie (ignoring its efficiency) rather than ETS is obvious for the set but the ordered_set requires testing with more concurrent reads to show scalability benefits. The benchmark results are from Linux version 2.6.32-21, Ubuntu 10.04 with an Intel(R) Core(TM)2 CPU T5600 @ 1.83GHz and non-HiPE Erlang R14B02:
TEST string_key
N == 50 (10 runs)
aadict get: 18.5 µs ( 1.1), set: 61.5 µs ( 1.2)
dict get: 32.1 µs ( 1.9), set: 119.4 µs ( 2.4)
ets (ordered_set) get: 30.6 µs ( 1.8), set: 50.0 µs ( 1.0)
ets (set) get: 31.1 µs ( 1.8), set: 53.9 µs ( 1.1)
ets x10 read (ordere get: 39.8 µs ( 2.4)
ets x10 read (set) get: 41.5 µs ( 2.5)
gb_trees get: 22.9 µs ( 1.4), set: 92.3 µs ( 1.8)
hashtl get: 27.4 µs ( 1.6), set: 58.7 µs ( 1.2)
process dictionary get: 19.5 µs ( 1.2), set: 74.1 µs ( 1.5)
rbdict get: 18.2 µs ( 1.1), set: 52.6 µs ( 1.1)
trie get: 16.9 µs ( 1.0), set: 108.0 µs ( 2.2)
N == 100 (10 runs)
aadict get: 43.4 µs ( 1.3), set: 147.1 µs ( 1.5)
dict get: 68.8 µs ( 2.0), set: 251.4 µs ( 2.5)
ets (ordered_set) get: 65.4 µs ( 1.9), set: 99.6 µs ( 1.0)
ets (set) get: 59.2 µs ( 1.7), set: 100.8 µs ( 1.0)
ets x10 read (ordere get: 99.6 µs ( 2.9)
ets x10 read (set) get: 77.7 µs ( 2.3)
gb_trees get: 51.3 µs ( 1.5), set: 200.4 µs ( 2.0)
hashtl get: 54.5 µs ( 1.6), set: 116.7 µs ( 1.2)
process dictionary get: 38.5 µs ( 1.1), set: 224.8 µs ( 2.3)
rbdict get: 43.4 µs ( 1.3), set: 111.2 µs ( 1.1)
trie get: 33.9 µs ( 1.0), set: 210.2 µs ( 2.1)
N == 250 (10 runs)
aadict get: 135.5 µs ( 1.5), set: 430.9 µs ( 1.9)
dict get: 178.5 µs ( 1.9), set: 709.3 µs ( 3.1)
ets (ordered_set) get: 167.5 µs ( 1.8), set: 311.4 µs ( 1.4)
ets (set) get: 148.8 µs ( 1.6), set: 250.1 µs ( 1.1)
ets x10 read (ordere get: 266.5 µs ( 2.9)
ets x10 read (set) get: 270.7 µs ( 2.9)
gb_trees get: 155.3 µs ( 1.7), set: 685.9 µs ( 3.0)
hashtl get: 161.6 µs ( 1.7), set: 229.0 µs ( 1.0)
process dictionary get: 92.6 µs ( 1.0), set: 296.4 µs ( 1.3)
rbdict get: 143.2 µs ( 1.5), set: 334.2 µs ( 1.5)
trie get: 95.3 µs ( 1.0), set: 507.2 µs ( 2.2)
N == 500 (10 runs)
aadict get: 332.5 µs ( 1.7), set: 1090.7 µs ( 1.9)
dict get: 364.6 µs ( 1.9), set: 1440.2 µs ( 2.5)
ets (ordered_set) get: 410.8 µs ( 2.1), set: 1466.0 µs ( 2.5)
ets (set) get: 302.6 µs ( 1.6), set: 586.5 µs ( 1.0)
ets x10 read (ordere get: 573.1 µs ( 3.0)
ets x10 read (set) get: 494.3 µs ( 2.6)
gb_trees get: 365.2 µs ( 1.9), set: 1334.5 µs ( 2.3)
hashtl get: 388.7 µs ( 2.0), set: 800.4 µs ( 1.4)
process dictionary get: 191.6 µs ( 1.0), set: 718.8 µs ( 1.2)
rbdict get: 330.7 µs ( 1.7), set: 755.5 µs ( 1.3)
trie get: 220.4 µs ( 1.2), set: 981.8 µs ( 1.7)
N == 1000 (10 runs)
aadict get: 798.7 µs ( 1.1), set: 2322.7 µs ( 1.5)
dict get: 1093.8 µs ( 1.5), set: 3336.9 µs ( 2.1)
ets (ordered_set) get: 1163.9 µs ( 1.6), set: 2343.4 µs ( 1.5)
ets (set) get: 976.2 µs ( 1.4), set: 1561.4 µs ( 1.0)
ets x10 read (ordere get: 1071.3 µs ( 1.5)
ets x10 read (set) get: 1118.5 µs ( 1.6)
gb_trees get: 927.4 µs ( 1.3), set: 3017.3 µs ( 1.9)
hashtl get: 1346.6 µs ( 1.9), set: 1754.8 µs ( 1.1)
process dictionary get: 747.5 µs ( 1.1), set: 1771.5 µs ( 1.1)
rbdict get: 786.5 µs ( 1.1), set: 1806.9 µs ( 1.2)
trie get: 709.1 µs ( 1.0), set: 2468.1 µs ( 1.6)
N == 2000 (10 runs)
aadict get: 2196.6 µs ( 1.2), set: 5800.6 µs ( 1.6)
dict get: 2980.3 µs ( 1.6), set: 7976.8 µs ( 2.2)
ets (ordered_set) get: 3098.8 µs ( 1.6), set: 4165.5 µs ( 1.1)
ets (set) get: 2966.1 µs ( 1.6), set: 4373.5 µs ( 1.2)
ets x10 read (ordere get: 2640.6 µs ( 1.4)
ets x10 read (set) get: 2850.6 µs ( 1.5)
gb_trees get: 2320.3 µs ( 1.2), set: 6872.1 µs ( 1.9)
hashtl get: 3581.9 µs ( 1.9), set: 3662.4 µs ( 1.0)
process dictionary get: 2389.5 µs ( 1.3), set: 4164.2 µs ( 1.1)
rbdict get: 2076.3 µs ( 1.1), set: 4576.0 µs ( 1.2)
trie get: 1907.3 µs ( 1.0), set: 5780.9 µs ( 1.6)
N == 4000 (10 runs)
aadict get: 5877.5 µs ( 1.3), set: 13809.0 µs ( 1.7)
dict get: 6491.3 µs ( 1.5), set: 18633.8 µs ( 2.4)
ets (ordered_set) get: 7127.5 µs ( 1.6), set: 9145.6 µs ( 1.2)
ets (set) get: 6153.2 µs ( 1.4), set: 7919.1 µs ( 1.0)
ets x10 read (ordere get: 8061.6 µs ( 1.8)
ets x10 read (set) get: 7786.1 µs ( 1.8)
gb_trees get: 6424.2 µs ( 1.5), set: 15693.2 µs ( 2.0)
hashtl get: 7054.5 µs ( 1.6), set: 8605.5 µs ( 1.1)
process dictionary get: 4938.2 µs ( 1.1), set: 9578.4 µs ( 1.2)
rbdict get: 5788.3 µs ( 1.3), set: 11456.6 µs ( 1.4)
trie get: 4376.9 µs ( 1.0), set: 13757.2 µs ( 1.7)
N == 8000 (10 runs)
aadict get: 15730.6 µs ( 1.8), set: 34177.0 µs ( 2.1)
dict get: 14520.8 µs ( 1.6), set: 44817.0 µs ( 2.7)
ets (ordered_set) get: 16009.9 µs ( 1.8), set: 19875.4 µs ( 1.2)
ets (set) get: 13122.3 µs ( 1.5), set: 16436.0 µs ( 1.0)
ets x10 read (ordere get: 19029.0 µs ( 2.1)
ets x10 read (set) get: 16993.7 µs ( 1.9)
gb_trees get: 17208.8 µs ( 1.9), set: 36673.0 µs ( 2.2)
hashtl get: 15175.1 µs ( 1.7), set: 16599.3 µs ( 1.0)
process dictionary get: 10157.7 µs ( 1.1), set: 21110.6 µs ( 1.3)
rbdict get: 15758.7 µs ( 1.8), set: 28566.6 µs ( 1.7)
trie get: 8940.0 µs ( 1.0), set: 28293.8 µs ( 1.7)
N == 16000 (10 runs)
aadict get: 41375.3 µs ( 2.0), set: 82925.0 µs ( 2.2)
dict get: 29842.0 µs ( 1.4), set: 108119.5 µs ( 2.9)
ets (ordered_set) get: 36909.9 µs ( 1.8), set: 43162.0 µs ( 1.2)
ets (set) get: 28240.8 µs ( 1.4), set: 37495.3 µs ( 1.0)
ets x10 read (ordere get: 40634.9 µs ( 2.0)
ets x10 read (set) get: 35608.5 µs ( 1.7)
gb_trees get: 44022.7 µs ( 2.1), set: 93643.0 µs ( 2.5)
hashtl get: 37419.0 µs ( 1.8), set: 48790.5 µs ( 1.3)
process dictionary get: 20684.6 µs ( 1.0), set: 43798.3 µs ( 1.2)
rbdict get: 40916.6 µs ( 2.0), set: 71371.2 µs ( 1.9)
trie get: 21225.2 µs ( 1.0), set: 69109.4 µs ( 1.8)
N == 32000 (10 runs)
aadict get: 97193.5 µs ( 2.3), set: 188024.4 µs ( 2.3)
dict get: 64191.5 µs ( 1.5), set: 300668.6 µs ( 3.6)
ets (ordered_set) get: 83813.1 µs ( 2.0), set: 100045.7 µs ( 1.2)
ets (set) get: 60364.4 µs ( 1.4), set: 83030.4 µs ( 1.0)
ets x10 read (ordere get: 86087.9 µs ( 2.0)
ets x10 read (set) get: 74093.9 µs ( 1.8)
gb_trees get: 107121.1 µs ( 2.5), set: 214457.9 µs ( 2.6)
hashtl get: 68200.3 µs ( 1.6), set: 112145.8 µs ( 1.4)
process dictionary get: 42152.3 µs ( 1.0), set: 93514.7 µs ( 1.1)
rbdict get: 96665.2 µs ( 2.3), set: 160379.7 µs ( 1.9)
trie get: 54760.0 µs ( 1.3), set: 213854.1 µs ( 2.6)
N == 64000 (10 runs)
aadict get: 233343.8 µs ( 5.4), set: 440130.2 µs ( 2.6)
dict get: 137424.3 µs ( 3.2), set: 871894.4 µs ( 5.2)
ets (ordered_set) get: 177797.4 µs ( 4.1), set: 211416.5 µs ( 1.3)
ets (set) get: 114235.0 µs ( 2.6), set: 169276.2 µs ( 1.0)
ets x10 read (ordere get: 169068.3 µs ( 3.9)
ets x10 read (set) get: 134240.8 µs ( 3.1)
gb_trees get: 255160.6 µs ( 5.9), set: 488567.1 µs ( 2.9)
hashtl get: 129433.1 µs ( 3.0), set: 210779.3 µs ( 1.3)
process dictionary get: 77023.9 µs ( 1.8), set: 166122.5 µs ( 1.0)
rbdict get: 230949.9 µs ( 5.3), set: 372831.9 µs ( 2.2)
trie get: 43435.1 µs ( 1.0), set: 379677.6 µs ( 2.3)
TEST integer_key
N == 50 (10 runs)
aadict get: 16.2 µs ( 3.2), set: 54.9 µs ( 5.8)
array (dynamic) get: 15.0 µs ( 2.9), set: 25.5 µs ( 2.7)
array (fixed) get: 15.3 µs ( 3.0), set: 26.5 µs ( 2.8)
dict get: 22.2 µs ( 4.4), set: 51.0 µs ( 5.4)
ets (ordered_set) get: 21.9 µs ( 4.3), set: 29.1 µs ( 3.1)
ets (set) get: 16.3 µs ( 3.2), set: 25.8 µs ( 2.7)
ets x10 read (ordere get: 20.2 µs ( 4.0)
ets x10 read (set) get: 16.8 µs ( 3.3)
gb_trees get: 17.3 µs ( 3.4), set: 104.8 µs ( 11.1)
hashtl get: 16.6 µs ( 3.3), set: 38.6 µs ( 4.1)
process dictionary get: 8.2 µs ( 1.6), set: 9.4 µs ( 1.0)
rbdict get: 15.8 µs ( 3.1), set: 43.8 µs ( 4.7)
tuple get: 5.1 µs ( 1.0), set: 13.0 µs ( 1.4)
N == 100 (10 runs)
aadict get: 43.0 µs ( 4.3), set: 128.1 µs ( 7.9)
array (dynamic) get: 28.4 µs ( 2.9), set: 51.0 µs ( 3.1)
array (fixed) get: 28.6 µs ( 2.9), set: 50.9 µs ( 3.1)
dict get: 48.6 µs ( 4.9), set: 121.6 µs ( 7.5)
ets (ordered_set) get: 42.4 µs ( 4.3), set: 57.7 µs ( 3.6)
ets (set) get: 98.6 µs ( 10.0), set: 56.5 µs ( 3.5)
ets x10 read (ordere get: 37.7 µs ( 3.8)
ets x10 read (set) get: 26.9 µs ( 2.7)
gb_trees get: 36.3 µs ( 3.7), set: 263.0 µs ( 16.2)
hashtl get: 35.3 µs ( 3.6), set: 76.6 µs ( 4.7)
process dictionary get: 15.1 µs ( 1.5), set: 16.2 µs ( 1.0)
rbdict get: 34.5 µs ( 3.5), set: 106.2 µs ( 6.6)
tuple get: 9.9 µs ( 1.0), set: 47.9 µs ( 3.0)
N == 250 (10 runs)
aadict get: 103.4 µs ( 5.2), set: 430.5 µs ( 8.9)
array (dynamic) get: 99.0 µs ( 5.0), set: 187.0 µs ( 3.9)
array (fixed) get: 96.7 µs ( 4.9), set: 195.8 µs ( 4.1)
dict get: 113.4 µs ( 5.7), set: 868.7 µs ( 18.0)
ets (ordered_set) get: 101.7 µs ( 5.1), set: 150.4 µs ( 3.1)
ets (set) get: 87.5 µs ( 4.4), set: 149.0 µs ( 3.1)
ets x10 read (ordere get: 85.0 µs ( 4.3)
ets x10 read (set) get: 64.6 µs ( 3.2)
gb_trees get: 102.2 µs ( 5.1), set: 883.4 µs ( 18.3)
hashtl get: 131.7 µs ( 6.6), set: 177.9 µs ( 3.7)
process dictionary get: 36.0 µs ( 1.8), set: 48.3 µs ( 1.0)
rbdict get: 97.0 µs ( 4.9), set: 346.5 µs ( 7.2)
tuple get: 19.9 µs ( 1.0), set: 331.4 µs ( 6.9)
N == 500 (10 runs)
aadict get: 215.0 µs ( 5.1), set: 1129.6 µs ( 12.6)
array (dynamic) get: 194.3 µs ( 4.6), set: 444.6 µs ( 5.0)
array (fixed) get: 198.1 µs ( 4.7), set: 456.6 µs ( 5.1)
dict get: 221.7 µs ( 5.2), set: 1258.3 µs ( 14.0)
ets (ordered_set) get: 220.4 µs ( 5.2), set: 308.5 µs ( 3.4)
ets (set) get: 158.3 µs ( 3.7), set: 273.8 µs ( 3.1)
ets x10 read (ordere get: 172.4 µs ( 4.1)
ets x10 read (set) get: 113.1 µs ( 2.7)
gb_trees get: 223.1 µs ( 5.3), set: 2430.2 µs ( 27.1)
hashtl get: 266.2 µs ( 6.3), set: 500.3 µs ( 5.6)
process dictionary get: 66.2 µs ( 1.6), set: 89.7 µs ( 1.0)
rbdict get: 214.4 µs ( 5.1), set: 985.1 µs ( 11.0)
tuple get: 42.4 µs ( 1.0), set: 2230.6 µs ( 24.9)
N == 1000 (10 runs)
aadict get: 472.9 µs ( 5.7), set: 1953.9 µs ( 12.8)
array (dynamic) get: 429.1 µs ( 5.1), set: 759.1 µs ( 5.0)
array (fixed) get: 424.9 µs ( 5.1), set: 1118.4 µs ( 7.3)
dict get: 436.0 µs ( 5.2), set: 2460.1 µs ( 16.1)
ets (ordered_set) get: 436.5 µs ( 5.2), set: 870.3 µs ( 5.7)
ets (set) get: 330.5 µs ( 4.0), set: 522.7 µs ( 3.4)
ets x10 read (ordere get: 335.9 µs ( 4.0)
ets x10 read (set) get: 207.3 µs ( 2.5)
gb_trees get: 494.3 µs ( 5.9), set: 4548.2 µs ( 29.7)
hashtl get: 485.6 µs ( 5.8), set: 1291.5 µs ( 8.4)
process dictionary get: 129.7 µs ( 1.6), set: 153.0 µs ( 1.0)
rbdict get: 474.4 µs ( 5.7), set: 1635.0 µs ( 10.7)
tuple get: 83.6 µs ( 1.0), set: 5586.7 µs ( 36.5)
N == 2000 (10 runs)
aadict get: 1083.5 µs ( 6.8), set: 5727.8 µs ( 9.5)
array (dynamic) get: 1113.2 µs ( 7.0), set: 2274.1 µs ( 3.8)
array (fixed) get: 985.6 µs ( 6.2), set: 2235.5 µs ( 3.7)
dict get: 904.9 µs ( 5.7), set: 6388.5 µs ( 10.6)
ets (ordered_set) get: 907.6 µs ( 5.7), set: 1283.6 µs ( 2.1)
ets (set) get: 702.4 µs ( 4.4), set: 1124.5 µs ( 1.9)
ets x10 read (ordere get: 614.8 µs ( 3.9)
ets x10 read (set) get: 384.6 µs ( 2.4)
gb_trees get: 1109.6 µs ( 7.0), set: 12070.2 µs ( 20.0)
hashtl get: 1083.8 µs ( 6.8), set: 2065.2 µs ( 3.4)
process dictionary get: 258.2 µs ( 1.6), set: 603.9 µs ( 1.0)
rbdict get: 1080.8 µs ( 6.8), set: 4763.0 µs ( 7.9)
tuple get: 159.3 µs ( 1.0), set: 39640.4 µs ( 65.6)
N == 4000 (10 runs)
aadict get: 2634.7 µs ( 8.3), set: 10477.9 µs ( 15.9)
array (dynamic) get: 1987.0 µs ( 6.3), set: 3746.9 µs ( 5.7)
array (fixed) get: 2069.9 µs ( 6.5), set: 4359.8 µs ( 6.6)
dict get: 2059.9 µs ( 6.5), set: 9022.6 µs ( 13.7)
ets (ordered_set) get: 1874.6 µs ( 5.9), set: 2586.1 µs ( 3.9)
ets (set) get: 1433.8 µs ( 4.5), set: 2526.4 µs ( 3.8)
ets x10 read (ordere get: 1222.8 µs ( 3.9)
ets x10 read (set) get: 761.1 µs ( 2.4)
gb_trees get: 2287.2 µs ( 7.2), set: 27537.3 µs ( 41.7)
hashtl get: 2361.7 µs ( 7.5), set: 3410.8 µs ( 5.2)
process dictionary get: 510.0 µs ( 1.6), set: 660.0 µs ( 1.0)
rbdict get: 2750.5 µs ( 8.7), set: 9114.6 µs ( 13.8)
tuple get: 316.2 µs ( 1.0), set: 151840.8 µs (230.1)
N == 8000 (10 runs)
aadict get: 5749.0 µs ( 9.3), set: 20115.9 µs ( 16.5)
array (dynamic) get: 3995.8 µs ( 6.4), set: 9530.1 µs ( 7.8)
array (fixed) get: 4147.7 µs ( 6.7), set: 9336.6 µs ( 7.6)
dict get: 4393.0 µs ( 7.1), set: 20980.3 µs ( 17.2)
ets (ordered_set) get: 3966.9 µs ( 6.4), set: 5457.4 µs ( 4.5)
ets (set) get: 2890.5 µs ( 4.7), set: 5415.8 µs ( 4.4)
ets x10 read (ordere get: 2441.7 µs ( 3.9)
ets x10 read (set) get: 1529.0 µs ( 2.5)
gb_trees get: 5126.7 µs ( 8.3), set: 57499.8 µs ( 47.0)
hashtl get: 5180.1 µs ( 8.4), set: 8602.0 µs ( 7.0)
process dictionary get: 1035.3 µs ( 1.7), set: 1222.7 µs ( 1.0)
rbdict get: 5657.5 µs ( 9.1), set: 17929.1 µs ( 14.7)
tuple get: 619.7 µs ( 1.0), set: 617050.7 µs (504.7)
N == 16000 (10 runs)
aadict get: 11571.6 µs ( 9.5), set: 45005.3 µs ( 15.6)
array (dynamic) get: 9951.0 µs ( 8.2), set: 24316.3 µs ( 8.4)
array (fixed) get: 9905.7 µs ( 8.1), set: 23415.4 µs ( 8.1)
dict get: 8804.0 µs ( 7.2), set: 67758.2 µs ( 23.4)
ets (ordered_set) get: 8359.1 µs ( 6.9), set: 12555.9 µs ( 4.3)
ets (set) get: 6100.3 µs ( 5.0), set: 16166.7 µs ( 5.6)
ets x10 read (ordere get: 5045.8 µs ( 4.1)
ets x10 read (set) get: 3196.1 µs ( 2.6)
gb_trees get: 10767.4 µs ( 8.8), set: 120424.8 µs ( 41.7)
hashtl get: 11302.7 µs ( 9.3), set: 25191.9 µs ( 8.7)
process dictionary get: 2081.1 µs ( 1.7), set: 2890.8 µs ( 1.0)
rbdict get: 11418.0 µs ( 9.4), set: 40875.7 µs ( 14.1)
tuple get: 1219.9 µs ( 1.0), set: 2500319.5 µs (864.9)
N == 32000 (10 runs)
aadict get: 24278.9 µs ( 9.9), set: 96279.2 µs ( 12.1)
array (dynamic) get: 19642.4 µs ( 8.0), set: 47917.9 µs ( 6.0)
array (fixed) get: 19781.2 µs ( 8.1), set: 39503.3 µs ( 5.0)
dict get: 19155.6 µs ( 7.8), set: 237609.2 µs ( 29.9)
ets (ordered_set) get: 18021.3 µs ( 7.4), set: 25457.2 µs ( 3.2)
ets (set) get: 14420.6 µs ( 5.9), set: 33051.4 µs ( 4.2)
ets x10 read (ordere get: 10584.6 µs ( 4.3)
ets x10 read (set) get: 7193.0 µs ( 2.9)
gb_trees get: 23442.1 µs ( 9.6), set: 244698.6 µs ( 30.8)
hashtl get: 25704.7 µs ( 10.5), set: 47058.7 µs ( 5.9)
process dictionary get: 4151.9 µs ( 1.7), set: 7941.7 µs ( 1.0)
rbdict get: 23069.3 µs ( 9.4), set: 85377.7 µs ( 10.8)
tuple get: 2447.1 µs ( 1.0), set: 10227316.7 µs (*****)
N == 64000 (10 runs)
aadict get: 47719.6 µs ( 9.8), set: 225636.9 µs ( 14.4)
array (dynamic) get: 38896.1 µs ( 8.0), set: 88107.9 µs ( 5.6)
array (fixed) get: 38844.9 µs ( 8.0), set: 91707.1 µs ( 5.8)
dict get: 38987.5 µs ( 8.0), set: 722098.9 µs ( 46.0)
ets (ordered_set) get: 38267.4 µs ( 7.8), set: 54992.6 µs ( 3.5)
ets (set) get: 35880.6 µs ( 7.3), set: 71159.9 µs ( 4.5)
ets x10 read (ordere get: 22375.3 µs ( 4.6)
ets x10 read (set) get: 17203.7 µs ( 3.5)
gb_trees get: 49406.1 µs ( 10.1), set: 522021.5 µs ( 33.3)
hashtl get: 40078.7 µs ( 8.2), set: 107626.0 µs ( 6.9)
process dictionary get: 8210.2 µs ( 1.7), set: 15697.8 µs ( 1.0)
rbdict get: 48208.3 µs ( 9.9), set: 216548.6 µs ( 13.8)
tuple get: 4883.4 µs ( 1.0), set: 50572301.0 µs (*****)
|
 |
 |
 |
 |
|
 |
 |



 |
 |
 |
 |
 |
|
 |
 |
After seeing the slow storage speed on my simple Erlang hash table implementation ( hasht), I attempted to create a faster data structure. With smaller nested tuples that are smaller segments of a hash table, my new hashtl (Hash Table Layered) Erlang data structure does provide faster storage than hasht. However, hashtl has slightly slower fetch times that are similar to dict fetch times (dict uses a 2 layered hash table that is dynamically sized). The hashtl performance is roughly opposite the trie data structure, since the trie has quick "get" times on string keys, while the hashtl has quick "set" times on string keys (i.e., the ratios in the parentheses for the opposite operations are roughly equivalent between the two data structures, when only considering string keys). The benchmark results are from Linux version 2.6.32-21, Ubuntu 10.04 with an Intel(R) Core(TM)2 CPU T5600 @ 1.83GHz and non-HiPE Erlang R14B01:
TEST string_key
N == 50 (10 runs)
aadict get: 19.0 µs ( 1.1), set: 58.5 µs ( 1.2)
dict get: 32.5 µs ( 1.9), set: 118.4 µs ( 2.4)
ets (set) get: 32.4 µs ( 1.9), set: 54.7 µs ( 1.1)
ets x10 (set) get: 45.2 µs ( 2.6)
gb_trees get: 21.0 µs ( 1.2), set: 92.4 µs ( 1.9)
hasht get: 21.6 µs ( 1.3), set: 177.4 µs ( 3.6)
hashtl get: 31.5 µs ( 1.8), set: 54.0 µs ( 1.1)
process dictionary get: 18.8 µs ( 1.1), set: 53.1 µs ( 1.1)
rbdict get: 19.9 µs ( 1.2), set: 49.4 µs ( 1.0)
trie get: 17.1 µs ( 1.0), set: 105.6 µs ( 2.1)
N == 100 (10 runs)
aadict get: 44.3 µs ( 1.3), set: 129.2 µs ( 1.3)
dict get: 75.2 µs ( 2.1), set: 235.2 µs ( 2.4)
ets (set) get: 62.1 µs ( 1.8), set: 103.8 µs ( 1.1)
ets x10 (set) get: 158.0 µs ( 4.5)
gb_trees get: 49.8 µs ( 1.4), set: 188.3 µs ( 1.9)
hasht get: 42.7 µs ( 1.2), set: 269.7 µs ( 2.8)
hashtl get: 56.2 µs ( 1.6), set: 97.9 µs ( 1.0)
process dictionary get: 38.1 µs ( 1.1), set: 112.4 µs ( 1.1)
rbdict get: 45.1 µs ( 1.3), set: 107.4 µs ( 1.1)
trie get: 35.2 µs ( 1.0), set: 198.5 µs ( 2.0)
N == 250 (10 runs)
aadict get: 138.8 µs ( 1.5), set: 400.4 µs ( 2.0)
dict get: 175.6 µs ( 1.9), set: 682.3 µs ( 3.4)
ets (set) get: 149.5 µs ( 1.6), set: 255.2 µs ( 1.3)
ets x10 (set) get: 259.8 µs ( 2.8)
gb_trees get: 154.2 µs ( 1.7), set: 530.6 µs ( 2.6)
hasht get: 131.7 µs ( 1.4), set: 793.0 µs ( 3.9)
hashtl get: 167.7 µs ( 1.8), set: 201.1 µs ( 1.0)
process dictionary get: 93.3 µs ( 1.0), set: 340.0 µs ( 1.7)
rbdict get: 139.0 µs ( 1.5), set: 338.8 µs ( 1.7)
trie get: 96.4 µs ( 1.0), set: 510.3 µs ( 2.5)
N == 500 (10 runs)
aadict get: 340.8 µs ( 1.8), set: 999.1 µs ( 2.0)
dict get: 372.0 µs ( 2.0), set: 1555.1 µs ( 3.1)
ets (set) get: 307.4 µs ( 1.6), set: 614.2 µs ( 1.2)
ets x10 (set) get: 461.6 µs ( 2.5)
gb_trees get: 359.9 µs ( 1.9), set: 1358.1 µs ( 2.7)
hasht get: 472.9 µs ( 2.5), set: 1968.7 µs ( 3.9)
hashtl get: 465.1 µs ( 2.5), set: 506.7 µs ( 1.0)
process dictionary get: 187.6 µs ( 1.0), set: 578.7 µs ( 1.1)
rbdict get: 327.2 µs ( 1.7), set: 810.9 µs ( 1.6)
trie get: 214.7 µs ( 1.1), set: 1148.8 µs ( 2.3)
N == 1000 (10 runs)
aadict get: 806.1 µs ( 1.3), set: 2299.6 µs ( 1.6)
dict get: 931.0 µs ( 1.5), set: 3238.2 µs ( 2.2)
ets (set) get: 804.5 µs ( 1.3), set: 1534.6 µs ( 1.0)
ets x10 (set) get: 968.7 µs ( 1.6)
gb_trees get: 876.5 µs ( 1.4), set: 2706.6 µs ( 1.9)
hasht get: 1324.3 µs ( 2.2), set: 10091.8 µs ( 6.9)
hashtl get: 1045.4 µs ( 1.7), set: 1462.6 µs ( 1.0)
process dictionary get: 604.8 µs ( 1.0), set: 1465.2 µs ( 1.0)
rbdict get: 794.7 µs ( 1.3), set: 1782.3 µs ( 1.2)
trie get: 636.7 µs ( 1.1), set: 2463.2 µs ( 1.7)
N == 2000 (10 runs)
aadict get: 2024.5 µs ( 1.1), set: 5195.9 µs ( 1.6)
dict get: 2688.9 µs ( 1.5), set: 6884.1 µs ( 2.1)
ets (set) get: 2591.4 µs ( 1.4), set: 5177.3 µs ( 1.6)
ets x10 (set) get: 2733.6 µs ( 1.5)
gb_trees get: 2226.3 µs ( 1.2), set: 6122.8 µs ( 1.9)
hasht get: 2589.5 µs ( 1.4), set: 30504.6 µs ( 9.3)
hashtl get: 2991.4 µs ( 1.6), set: 3288.4 µs ( 1.0)
process dictionary get: 2114.5 µs ( 1.2), set: 4030.5 µs ( 1.2)
rbdict get: 2025.2 µs ( 1.1), set: 4224.6 µs ( 1.3)
trie get: 1837.8 µs ( 1.0), set: 5675.4 µs ( 1.7)
N == 4000 (10 runs)
aadict get: 5926.7 µs ( 1.4), set: 12774.6 µs ( 1.8)
dict get: 6796.7 µs ( 1.5), set: 18243.7 µs ( 2.6)
ets (set) get: 6281.3 µs ( 1.4), set: 7824.5 µs ( 1.1)
ets x10 (set) get: 8046.3 µs ( 1.8)
gb_trees get: 6301.9 µs ( 1.4), set: 14376.7 µs ( 2.1)
hasht get: 5460.8 µs ( 1.2), set: 103768.9 µs ( 14.9)
hashtl get: 7137.7 µs ( 1.6), set: 6957.0 µs ( 1.0)
process dictionary get: 5115.8 µs ( 1.2), set: 10582.6 µs ( 1.5)
rbdict get: 5963.5 µs ( 1.4), set: 10554.8 µs ( 1.5)
trie get: 4386.7 µs ( 1.0), set: 13996.3 µs ( 2.0)
N == 8000 (10 runs)
aadict get: 16002.9 µs ( 1.9), set: 31812.0 µs ( 2.0)
dict get: 14557.8 µs ( 1.7), set: 45314.4 µs ( 2.8)
ets (set) get: 1.34e4 µs ( 1.6), set: 17021.6 µs ( 1.1)
ets x10 (set) get: 17172.7 µs ( 2.0)
gb_trees get: 16952.9 µs ( 2.0), set: 35821.8 µs ( 2.2)
hasht get: 11080.5 µs ( 1.3), set: 408960.8 µs ( 25.5)
hashtl get: 14790.3 µs ( 1.7), set: 16040.3 µs ( 1.0)
process dictionary get: 10445.2 µs ( 1.2), set: 21810.0 µs ( 1.4)
rbdict get: 15936.7 µs ( 1.9), set: 28578.3 µs ( 1.8)
trie get: 8543.9 µs ( 1.0), set: 27808.7 µs ( 1.7)
N == 16000 (10 runs)
aadict get: 40660.4 µs ( 2.0), set: 76056.4 µs ( 2.0)
dict get: 30382.6 µs ( 1.5), set: 115643.4 µs ( 3.1)
ets (set) get: 28500.2 µs ( 1.4), set: 37204.1 µs ( 1.0)
ets x10 (set) get: 35423.6 µs ( 1.7)
gb_trees get: 43823.2 µs ( 2.1), set: 88600.6 µs ( 2.4)
hasht get: 22523.7 µs ( 1.1), set: 2302517.3 µs ( 61.9)
hashtl get: 29762.8 µs ( 1.4), set: 41664.0 µs ( 1.1)
process dictionary get: 20756.7 µs ( 1.0), set: 45505.0 µs ( 1.2)
rbdict get: 40808.4 µs ( 2.0), set: 67970.2 µs ( 1.8)
trie get: 21091.4 µs ( 1.0), set: 83384.2 µs ( 2.2)
N == 32000 (10 runs)
aadict get: 100165.9 µs ( 2.5), set: 182365.3 µs ( 2.3)
dict get: 65931.7 µs ( 1.6), set: 281306.1 µs ( 3.6)
ets (set) get: 58365.9 µs ( 1.4), set: 86377.8 µs ( 1.1)
ets x10 (set) get: 70832.4 µs ( 1.7)
gb_trees get: 108996.9 µs ( 2.7), set: 201768.6 µs ( 2.6)
hasht get: 46683.5 µs ( 1.1), set: 7559194.8 µs ( 96.3)
hashtl get: 68377.4 µs ( 1.7), set: 78516.3 µs ( 1.0)
process dictionary get: 40873.3 µs ( 1.0), set: 90079.1 µs ( 1.1)
rbdict get: 101229.4 µs ( 2.5), set: 160677.5 µs ( 2.0)
trie get: 52120.7 µs ( 1.3), set: 205179.4 µs ( 2.6)
N == 64000 (10 runs)
aadict get: 236930.0 µs ( 5.3), set: 434665.3 µs ( 2.5)
dict get: 140327.2 µs ( 3.1), set: 955742.9 µs ( 5.4)
ets (set) get: 116628.2 µs ( 2.6), set: 177744.3 µs ( 1.0)
ets x10 (set) get: 136503.0 µs ( 3.0)
gb_trees get: 258394.8 µs ( 5.8), set: 475881.6 µs ( 2.7)
hasht get: 86721.3 µs ( 1.9), set: 40883420.3 µs (232.6)
hashtl get: 132184.1 µs ( 2.9), set: 189342.2 µs ( 1.1)
process dictionary get: 80194.3 µs ( 1.8), set: 175787.6 µs ( 1.0)
rbdict get: 236683.9 µs ( 5.3), set: 393917.9 µs ( 2.2)
trie get: 44896.4 µs ( 1.0), set: 406019.4 µs ( 2.3)
TEST integer_key
N == 50 (10 runs)
aadict get: 17.0 µs ( 3.2), set: 70.9 µs ( 7.2)
array (dynamic) get: 15.5 µs ( 2.9), set: 25.1 µs ( 2.5)
array (fixed) get: 15.5 µs ( 2.9), set: 27.3 µs ( 2.8)
dict get: 22.9 µs ( 4.3), set: 47.2 µs ( 4.8)
ets (set) get: 16.3 µs ( 3.1), set: 28.9 µs ( 2.9)
ets x10 (set) get: 20.4 µs ( 3.8)
gb_trees get: 16.6 µs ( 3.1), set: 99.9 µs ( 10.1)
hasht get: 11.5 µs ( 2.2), set: 117.8 µs ( 11.9)
hashtl get: 16.0 µs ( 3.0), set: 38.4 µs ( 3.9)
process dictionary get: 7.9 µs ( 1.5), set: 9.9 µs ( 1.0)
rbdict get: 16.0 µs ( 3.0), set: 43.2 µs ( 4.4)
tuple get: 5.3 µs ( 1.0), set: 12.1 µs ( 1.2)
N == 100 (10 runs)
aadict get: 42.4 µs ( 3.8), set: 127.8 µs ( 6.7)
array (dynamic) get: 35.7 µs ( 3.2), set: 59.6 µs ( 3.1)
array (fixed) get: 36.5 µs ( 3.3), set: 60.6 µs ( 3.2)
dict get: 58.0 µs ( 5.2), set: 137.0 µs ( 7.1)
ets (set) get: 39.4 µs ( 3.5), set: 70.6 µs ( 3.7)
ets x10 (set) get: 37.9 µs ( 3.4)
gb_trees get: 43.3 µs ( 3.9), set: 544.8 µs ( 28.4)
hasht get: 26.7 µs ( 2.4), set: 257.0 µs ( 13.4)
hashtl get: 40.5 µs ( 3.6), set: 102.5 µs ( 5.3)
process dictionary get: 18.7 µs ( 1.7), set: 19.2 µs ( 1.0)
rbdict get: 42.6 µs ( 3.8), set: 120.3 µs ( 6.3)
tuple get: 11.1 µs ( 1.0), set: 52.8 µs ( 2.8)
N == 250 (10 runs)
aadict get: 123.9 µs ( 3.8), set: 392.2 µs ( 8.6)
array (dynamic) get: 121.6 µs ( 3.7), set: 215.9 µs ( 4.8)
array (fixed) get: 121.9 µs ( 3.7), set: 378.6 µs ( 8.3)
dict get: 143.7 µs ( 4.4), set: 476.0 µs ( 10.5)
ets (set) get: 98.0 µs ( 3.0), set: 187.3 µs ( 4.1)
ets x10 (set) get: 73.5 µs ( 2.3)
gb_trees get: 124.0 µs ( 3.8), set: 977.7 µs ( 21.5)
hasht get: 68.0 µs ( 2.1), set: 646.4 µs ( 14.2)
hashtl get: 123.8 µs ( 3.8), set: 329.9 µs ( 7.3)
process dictionary get: 40.8 µs ( 1.3), set: 45.4 µs ( 1.0)
rbdict get: 124.2 µs ( 3.8), set: 365.7 µs ( 8.1)
tuple get: 32.6 µs ( 1.0), set: 273.5 µs ( 6.0)
N == 500 (10 runs)
aadict get: 272.6 µs ( 5.8), set: 918.0 µs ( 10.3)
array (dynamic) get: 208.8 µs ( 4.5), set: 380.0 µs ( 4.3)
array (fixed) get: 209.1 µs ( 4.5), set: 391.9 µs ( 4.4)
dict get: 267.8 µs ( 5.7), set: 1089.2 µs ( 12.3)
ets (set) get: 189.0 µs ( 4.0), set: 339.7 µs ( 3.8)
ets x10 (set) get: 123.6 µs ( 2.6)
gb_trees get: 256.7 µs ( 5.5), set: 2419.5 µs ( 27.2)
hasht get: 152.6 µs ( 3.3), set: 1541.5 µs ( 17.3)
hashtl get: 253.2 µs ( 5.4), set: 408.1 µs ( 4.6)
process dictionary get: 84.9 µs ( 1.8), set: 88.9 µs ( 1.0)
rbdict get: 256.2 µs ( 5.5), set: 840.5 µs ( 9.5)
tuple get: 46.9 µs ( 1.0), set: 2125.3 µs ( 23.9)
N == 1000 (10 runs)
aadict get: 478.9 µs ( 5.4), set: 1683.9 µs ( 11.0)
array (dynamic) get: 415.5 µs ( 4.7), set: 857.5 µs ( 5.6)
array (fixed) get: 453.3 µs ( 5.1), set: 888.8 µs ( 5.8)
dict get: 448.0 µs ( 5.0), set: 2231.5 µs ( 14.5)
ets (set) get: 334.7 µs ( 3.8), set: 1006.4 µs ( 6.6)
ets x10 (set) get: 203.2 µs ( 2.3)
gb_trees get: 477.3 µs ( 5.4), set: 4498.8 µs ( 29.3)
hasht get: 321.7 µs ( 3.6), set: 7766.1 µs ( 50.6)
hashtl get: 561.4 µs ( 6.3), set: 830.2 µs ( 5.4)
process dictionary get: 138.8 µs ( 1.6), set: 153.6 µs ( 1.0)
rbdict get: 488.3 µs ( 5.5), set: 1596.8 µs ( 10.4)
tuple get: 89.0 µs ( 1.0), set: 7330.6 µs ( 47.7)
N == 2000 (10 runs)
aadict get: 1096.2 µs ( 6.4), set: 3939.2 µs ( 13.9)
array (dynamic) get: 996.5 µs ( 5.8), set: 2026.3 µs ( 7.2)
array (fixed) get: 994.3 µs ( 5.8), set: 1856.3 µs ( 6.6)
dict get: 934.0 µs ( 5.5), set: 3885.4 µs ( 13.7)
ets (set) get: 714.5 µs ( 4.2), set: 2270.4 µs ( 8.0)
ets x10 (set) get: 390.3 µs ( 2.3)
gb_trees get: 1061.6 µs ( 6.2), set: 9302.3 µs ( 32.9)
hasht get: 559.6 µs ( 3.3), set: 26775.2 µs ( 94.6)
hashtl get: 1239.9 µs ( 7.3), set: 1985.6 µs ( 7.0)
process dictionary get: 268.6 µs ( 1.6), set: 283.1 µs ( 1.0)
rbdict get: 1083.6 µs ( 6.3), set: 3509.0 µs ( 12.4)
tuple get: 170.9 µs ( 1.0), set: 22928.5 µs ( 81.0)
N == 4000 (10 runs)
aadict get: 2598.7 µs ( 7.5), set: 9143.9 µs ( 15.4)
array (dynamic) get: 1991.0 µs ( 5.7), set: 3780.1 µs ( 6.3)
array (fixed) get: 1994.8 µs ( 5.8), set: 3619.8 µs ( 6.1)
dict get: 1959.2 µs ( 5.7), set: 10221.3 µs ( 17.2)
ets (set) get: 1450.3 µs ( 4.2), set: 4560.7 µs ( 7.7)
ets x10 (set) get: 766.1 µs ( 2.2)
gb_trees get: 2220.8 µs ( 6.4), set: 22137.9 µs ( 37.2)
hasht get: 1032.0 µs ( 3.0), set: 107334.7 µs (180.2)
hashtl get: 2263.7 µs ( 6.5), set: 3107.2 µs ( 5.2)
process dictionary get: 526.6 µs ( 1.5), set: 595.6 µs ( 1.0)
rbdict get: 2581.0 µs ( 7.4), set: 8322.6 µs ( 14.0)
tuple get: 346.5 µs ( 1.0), set: 91854.1 µs (154.2)
N == 8000 (10 runs)
aadict get: 5625.3 µs ( 8.6), set: 21650.5 µs ( 19.5)
array (dynamic) get: 3998.4 µs ( 6.1), set: 7457.9 µs ( 6.7)
array (fixed) get: 4014.6 µs ( 6.2), set: 7380.2 µs ( 6.7)
dict get: 4200.5 µs ( 6.5), set: 21589.1 µs ( 19.5)
ets (set) get: 2958.9 µs ( 4.5), set: 8535.7 µs ( 7.7)
ets x10 (set) get: 1509.6 µs ( 2.3)
gb_trees get: 4922.7 µs ( 7.6), set: 55256.5 µs ( 49.9)
hasht get: 1990.7 µs ( 3.1), set: 394366.0 µs (356.0)
hashtl get: 5715.1 µs ( 8.8), set: 6898.9 µs ( 6.2)
process dictionary get: 1057.2 µs ( 1.6), set: 1107.8 µs ( 1.0)
rbdict get: 5455.4 µs ( 8.4), set: 24543.2 µs ( 22.2)
tuple get: 650.9 µs ( 1.0), set: 365080.7 µs (329.6)
N == 16000 (10 runs)
aadict get: 11302.9 µs ( 8.7), set: 38483.2 µs ( 15.1)
array (dynamic) get: 9810.6 µs ( 7.6), set: 21888.0 µs ( 8.6)
array (fixed) get: 10283.3 µs ( 7.9), set: 21910.1 µs ( 8.6)
dict get: 8249.0 µs ( 6.4), set: 60084.0 µs ( 23.6)
ets (set) get: 5965.8 µs ( 4.6), set: 14583.2 µs ( 5.7)
ets x10 (set) get: 3052.2 µs ( 2.4)
gb_trees get: 10263.4 µs ( 7.9), set: 126358.0 µs ( 49.7)
hasht get: 3966.8 µs ( 3.1), set: 1570470.8 µs (617.4)
hashtl get: 11766.2 µs ( 9.1), set: 17786.2 µs ( 7.0)
process dictionary get: 2094.7 µs ( 1.6), set: 2543.8 µs ( 1.0)
rbdict get: 11263.8 µs ( 8.7), set: 36493.2 µs ( 14.3)
tuple get: 1294.3 µs ( 1.0), set: 1486397.5 µs (584.3)
N == 32000 (10 runs)
aadict get: 22952.3 µs ( 8.9), set: 85579.9 µs ( 14.6)
array (dynamic) get: 19565.7 µs ( 7.6), set: 40450.3 µs ( 6.9)
array (fixed) get: 19597.6 µs ( 7.6), set: 39994.8 µs ( 6.8)
dict get: 17543.6 µs ( 6.8), set: 185507.8 µs ( 31.7)
ets (set) get: 13456.0 µs ( 5.2), set: 29891.5 µs ( 5.1)
ets x10 (set) get: 6824.8 µs ( 2.6)
gb_trees get: 22242.6 µs ( 8.6), set: 244581.5 µs ( 41.8)
hasht get: 8159.9 µs ( 3.2), set: 6891457.9 µs (*****)
hashtl get: 24608.9 µs ( 9.5), set: 43185.2 µs ( 7.4)
process dictionary get: 4210.2 µs ( 1.6), set: 5853.7 µs ( 1.0)
rbdict get: 22806.8 µs ( 8.8), set: 84932.4 µs ( 14.5)
tuple get: 2581.2 µs ( 1.0), set: 6422056.3 µs (*****)
N == 64000 (10 runs)
aadict get: 47862.1 µs ( 9.4), set: 191716.5 µs ( 15.6)
array (dynamic) get: 39008.0 µs ( 7.7), set: 77311.6 µs ( 6.3)
array (fixed) get: 38962.4 µs ( 7.6), set: 78340.9 µs ( 6.4)
dict get: 38602.2 µs ( 7.6), set: 625891.7 µs ( 50.9)
ets (set) get: 35109.1 µs ( 6.9), set: 69833.4 µs ( 5.7)
ets x10 (set) get: 16434.1 µs ( 3.2)
gb_trees get: 46772.8 µs ( 9.2), set: 513581.4 µs ( 41.8)
hasht get: 18970.6 µs ( 3.7), set: 33729619.8 µs (*****)
hashtl get: 59980.1 µs ( 11.8), set: 108446.2 µs ( 8.8)
process dictionary get: 8468.0 µs ( 1.7), set: 12288.2 µs ( 1.0)
rbdict get: 48106.1 µs ( 9.4), set: 178132.4 µs ( 14.5)
tuple get: 5097.6 µs ( 1.0), set: 35494023.9 µs (*****)
|
 |
 |
 |
 |
|
 |
 |

 |
 |
 |
 |
 |
|
 |
 |
I created hasht, a simple hash table implementation that performs quick lookups. Setting the structure is quite slow because it always modifies a single large tuple. However, with a more tiered approach, the storage speed might increase. The hasht lookup time for strings is similar to the trie data structure, but is slower for smaller N. The hasht lookup time for integers is similar to ETS (unless the ETS table is accessed with greater than ~10 processes concurrently with read concurrency optimizations set on the table), but the process dictionary and tuple are both quicker for integer lookups. The benchmark results are from Linux version 2.6.32-21, Ubuntu 10.04 with an Intel(R) Core(TM)2 CPU T5600 @ 1.83GHz and non-HiPE Erlang R14B01:
TEST string_key
N == 1000 (10 runs)
aadict get: 828.6 µs ( 1.5), set: 2226.0 µs ( 1.4)
dict get: 940.9 µs ( 1.7), set: 3184.6 µs ( 2.0)
ets (set) get: 828.5 µs ( 1.5), set: 1586.9 µs ( 1.0)
ets x10 (set) get: 1030.8 µs ( 1.8)
gb_trees get: 897.6 µs ( 1.6), set: 2786.5 µs ( 1.8)
hasht get: 1392.5 µs ( 2.5), set: 9815.1 µs ( 6.2)
process dictionary get: 566.3 µs ( 1.0), set: 1618.2 µs ( 1.0)
rbdict get: 817.3 µs ( 1.4), set: 1806.3 µs ( 1.1)
trie get: 650.9 µs ( 1.1), set: 2466.6 µs ( 1.6)
N == 10000 (10 runs)
aadict get: 21320.8 µs ( 2.6), set: 45882.9 µs ( 2.1)
dict get: 17345.4 µs ( 2.1), set: 59625.4 µs ( 2.7)
ets (set) get: 16985.1 µs ( 2.0), set: 22305.2 µs ( 1.0)
ets x10 (set) get: 21002.8 µs ( 2.5)
gb_trees get: 23835.2 µs ( 2.9), set: 47572.9 µs ( 2.1)
hasht get: 13345.9 µs ( 1.6), set: 1030469.0 µs ( 46.2)
process dictionary get: 12837.2 µs ( 1.5), set: 25609.6 µs ( 1.1)
rbdict get: 21680.3 µs ( 2.6), set: 42978.7 µs ( 1.9)
trie get: 8351.5 µs ( 1.0), set: 35003.4 µs ( 1.6)
N == 50000 (10 runs)
aadict get: 179311.1 µs ( 2.9), set: 326961.5 µs ( 2.4)
dict get: 111803.5 µs ( 1.8), set: 615110.2 µs ( 4.6)
ets (set) get: 91941.2 µs ( 1.5), set: 135026.9 µs ( 1.0)
ets x10 (set) get: 107469.8 µs ( 1.7)
gb_trees get: 195219.1 µs ( 3.1), set: 358408.6 µs ( 2.7)
hasht get: 66421.6 µs ( 1.1), set: 27200459.7 µs (201.4)
process dictionary get: 62683.4 µs ( 1.0), set: 141821.2 µs ( 1.1)
rbdict get: 182102.3 µs ( 2.9), set: 294744.7 µs ( 2.2)
trie get: 71213.7 µs ( 1.1), set: 357100.4 µs ( 2.6)
TEST integer_key
N == 1000 (10 runs)
aadict get: 482.5 µs ( 5.6), set: 1917.5 µs ( 13.1)
array (dynamic) get: 393.3 µs ( 4.5), set: 721.6 µs ( 4.9)
array (fixed) get: 393.7 µs ( 4.6), set: 1080.2 µs ( 7.4)
dict get: 447.6 µs ( 5.2), set: 2010.5 µs ( 13.7)
ets (set) get: 333.3 µs ( 3.9), set: 748.7 µs ( 5.1)
ets x10 (set) get: 203.2 µs ( 2.3)
gb_trees get: 476.2 µs ( 5.5), set: 4339.8 µs ( 29.6)
hasht get: 358.1 µs ( 4.1), set: 7134.3 µs ( 48.7)
process dictionary get: 133.6 µs ( 1.5), set: 146.5 µs ( 1.0)
rbdict get: 478.0 µs ( 5.5), set: 2027.6 µs ( 13.8)
tuple get: 86.5 µs ( 1.0), set: 6903.4 µs ( 47.1)
N == 10000 (10 runs)
aadict get: 6670.9 µs ( 8.3), set: 24593.4 µs ( 14.3)
array (dynamic) get: 4968.5 µs ( 6.1), set: 9641.3 µs ( 5.6)
array (fixed) get: 5057.3 µs ( 6.3), set: 8929.1 µs ( 5.2)
dict get: 4857.0 µs ( 6.0), set: 29415.4 µs ( 17.1)
ets (set) get: 3695.3 µs ( 4.6), set: 8625.1 µs ( 5.0)
ets x10 (set) get: 1896.0 µs ( 2.3)
gb_trees get: 6234.5 µs ( 7.7), set: 72117.2 µs ( 42.0)
hasht get: 2386.1 µs ( 3.0), set: 1042563.7 µs (607.4)
process dictionary get: 1317.5 µs ( 1.6), set: 1716.3 µs ( 1.0)
rbdict get: 6772.1 µs ( 8.4), set: 25672.9 µs ( 15.0)
tuple get: 808.1 µs ( 1.0), set: 613975.0 µs (357.7)
N == 50000 (10 runs)
aadict get: 36984.9 µs ( 9.2), set: 166405.2 µs ( 16.8)
array (dynamic) get: 30181.9 µs ( 7.5), set: 58707.0 µs ( 5.9)
array (fixed) get: 30171.7 µs ( 7.5), set: 60833.8 µs ( 6.2)
dict get: 27988.1 µs ( 7.0), set: 408414.5 µs ( 41.3)
ets (set) get: 25076.6 µs ( 6.3), set: 50396.9 µs ( 5.1)
ets x10 (set) get: 12081.4 µs ( 3.0)
gb_trees get: 35874.3 µs ( 9.0), set: 405197.7 µs ( 41.0)
hasht get: 13516.8 µs ( 3.4), set: 26560861.7 µs (*****)
process dictionary get: 6687.2 µs ( 1.7), set: 9888.3 µs ( 1.0)
rbdict get: 36388.4 µs ( 9.1), set: 152543.6 µs ( 15.4)
tuple get: 4002.1 µs ( 1.0), set: 18081650.6 µs (*****)
|
 |
 |
 |
 |
|
 |
 |

 |
 |
 |
 |
 |
|
 |
 |
My benchmarks for key/value data stored with either integer or string keys lacked explanation of the results. I will explain my conclusions here based on the benchmark results. When looking at the benchmark results, the normalized time spent during the test is within the parentheses, next to the time it relates to. The normalized values shows how many times an operation for a data structure is slower than the fastest alternative. The orddict data structure is the fastest data structure for storing data with integer keys. However, all other operations on an orddict are very slow compared to any of the other data structures. Storing data with an integer key is slowest in a tuple, but the integer key lookup of data within a tuple is the fastest among all the data structures. The process dictionary is the fastest for storing data if you ignore an orddict with an integer key (since a lookup in an orddict is painfully slow). Typically, usage of the process dictionary is discouraged, since it eliminates the source code's referential transparency making the code harder to test, maintain, and understand. ETS results in the benchmark showed that concurrent lookups are faster with integer keys but slower with string keys, which shows that more complex comparison operations limit scalability quicker with ETS' global data storage (the assumption being that concurrent integer key lookups on ETS would also become slower, but it would take more concurrent lookups to see the performance degrade due to the global data locking that is used for consistency). A persistent puzzle has been the aadict and gb_trees which should implement the same algorithm, but differ for storing integer key data (the aadict is quicker). The aadict and gb_trees data structures should have faster lookup times than the rbdict data structure, but no significant difference can be seen (gb_trees actually has slower lookup times for string keys). The dict data structure is faster for lookups but slower for storing when compared with the red-black tree data structures (rbdict, aatree and gb_trees). The array data structure is much quicker than a tuple for storing integer key data, if you can tolerate the slower lookup time (lookups become 7.5 times slower than a tuple). The trie data structure is a new data structure I created for efficient string lookups. The trie is able to be roughly equivalent to the process dictionary for lookups (1.1 times the speed for the largest test), but should always be slightly slower as an externally implemented data structure (i.e., not a core language feature like the process dictionary). The store of string key data in a trie is roughly equivalent to storing data in a red-black tree data structure (a trie is slower than rbdict, equivalent to aatree, and faster than gb_trees for the largest test). The speed of the trie comes from exploiting the quick integer key lookups on a tuple. The speed of storing string key data in the red-black tree data structures is similar to storing data in a trie, because all of these tree structures use tuples for nodes. One interesting fact is that the trie can not store the atom value 'error' because of the ambiguity within the dict interface (the dict:fetch/2 function) and the fact that the atom 'error' is used for empty values within the trie to make the implementation simpler.
|
 |
 |
 |
 |
|
 |
 |




|
 |
|
 |