Michael Truog
My benchmarks for key/value data stored with either integer or string keys lacked an explanation of the results, so I will explain my conclusions here. In the benchmark results, the normalized time for each test appears within parentheses, next to the time it relates to. A normalized value shows how many times slower an operation on a data structure is than the fastest alternative (e.g., a normalized value of 7.9 means the operation took 7.9 times as long as the fastest method tested).

The orddict data structure is the fastest for storing data with integer keys. However, every other operation on an orddict is very slow compared to the other data structures. Storing data with an integer key is slowest in a tuple, but integer key lookup within a tuple is the fastest among all the data structures. The process dictionary is the fastest for storing data if you ignore an orddict with integer keys (since a lookup in an orddict is painfully slow). Usage of the process dictionary is typically discouraged, since it eliminates the source code's referential transparency, making the code harder to test, maintain, and understand.
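To make the comparison concrete, below is a minimal Erlang sketch (not the benchmark code itself; the module name and values are made up for illustration) of the get/set operations being measured for three of these data structures:

-module(kv_sketch).
-export([demo/0]).

demo() ->
    Key = 42, Value = hello,

    %% orddict: storing is a simple ordered-list insert, but a lookup
    %% is a linear scan of the list
    D = orddict:store(Key, Value, orddict:new()),
    Value = orddict:fetch(Key, D),

    %% tuple: element/2 is the fastest lookup of any structure here,
    %% but setelement/3 copies the whole tuple, so stores are slow
    T = erlang:setelement(Key, erlang:make_tuple(100, undefined), Value),
    Value = element(Key, T),

    %% process dictionary: fast for both operations, at the cost of
    %% hidden mutable state inside the process
    erlang:put(Key, Value),
    Value = erlang:get(Key),
    ok.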

The ETS results in the benchmark showed that concurrent lookups are faster than single-process lookups with integer keys but slower with string keys, which suggests that more complex comparison operations limit scalability sooner with ETS' global data storage (the assumption being that concurrent integer key lookups on ETS would also slow down, but it would take more concurrent lookups to see the performance degrade due to the global data locking that is used for consistency). A persistent puzzle has been the aadict and gb_trees results: both should implement the same algorithm, yet they differ when storing integer key data (aadict is quicker). The aadict and gb_trees data structures should have faster lookup times than the rbdict data structure, but no significant difference can be seen (gb_trees actually has slower lookup times for string keys). The dict data structure is faster for lookups but slower for storing when compared with the tree data structures (rbdict, aadict, and gb_trees).
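For reference, below is a rough sketch of the kind of concurrent ETS access being described (not the benchmark's actual code; the {read_concurrency, true} tuning option, available since R14B, is my assumption about how such a table might be configured):

%% one shared table, 10 concurrent reader processes (mirroring the
%% "ets x10" rows in the benchmark results)
Tab = ets:new(bench, [set, public, {read_concurrency, true}]),
[ets:insert(Tab, {N, N * 2}) || N <- lists:seq(1, 1000)],
Self = self(),
Pids = [spawn(fun() ->
                  [ets:lookup(Tab, N) || N <- lists:seq(1, 1000)],
                  Self ! {done, self()}
              end) || _ <- lists:seq(1, 10)],
[receive {done, Pid} -> ok end || Pid <- Pids],
ets:delete(Tab).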

The array data structure is much quicker than a tuple for storing integer key data, if you can tolerate the slower lookup time (lookups are roughly 7.5 times slower than a tuple). The trie data structure is a new data structure I created for efficient string key lookups. The trie is roughly equivalent to the process dictionary for lookups (1.1 times the process dictionary's time for the largest test), but should always be slightly slower as an externally implemented data structure (i.e., not a core language feature like the process dictionary). Storing string key data in a trie is roughly equivalent to storing data in a tree data structure (a trie is slower than rbdict, equivalent to aadict, and faster than gb_trees for the largest test).

The speed of the trie comes from exploiting the quick integer key lookups on a tuple. The speed of storing string key data in the tree data structures is similar to storing data in a trie, because all of these tree structures use tuples for nodes. One interesting limitation is that the trie cannot store the atom value 'error', because of the ambiguity within the dict interface (the dict:fetch/2 function) and the fact that the atom 'error' is used for empty values within the trie to keep the implementation simpler.
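Below is the interface convention that creates the ambiguity, shown with the standard dict module (a sketch; the trie's internal representation is simplified in the final comment):

D = dict:store("key", 1, dict:new()),
{ok, 1} = dict:find("key", D),
%% in the dict interface, the bare atom 'error' means "key not found":
error = dict:find("missing", D).
%% the trie uses the same atom internally to mark empty values, so a
%% stored value of 'error' would collide with that internal marker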
Writing an Erlang in-order list traversal to form a new list is very common, but it is not always clear which traversal method to use. A brief benchmark (added to erlbench) shows that the best method depends on the size of the list. The benchmark was run (with R14B01, without HiPE) on an AMD Phenom 9950 Quad-Core (64 bit) running Linux 2.6.32-23-generic (Ubuntu).

TEST list_traversal
N == 1000 (10 runs)
       list -> queue get:      245.4 µs (  7.9)
  list comprehension get:       31.2 µs (  1.0)
       lists:foldr/3 get:       54.4 µs (  1.7)
         lists:map/2 get:       59.4 µs (  1.9)
      queue traverse get:      193.4 µs (  6.2)
    reverse/traverse get:       53.8 µs (  1.7)
     traverse/append get:     3070.1 µs ( 98.4)
N == 10000 (10 runs)
       list -> queue get:     3607.6 µs (  8.0)
  list comprehension get:      449.2 µs (  1.0)
       lists:foldr/3 get:      993.5 µs (  2.2)
         lists:map/2 get:      681.8 µs (  1.5)
      queue traverse get:     3062.9 µs (  6.8)
    reverse/traverse get:      929.5 µs (  2.1)
     traverse/append get:   333260.8 µs (741.9)
N == 50000 (10 runs)
       list -> queue get:    19467.0 µs (  7.8)
  list comprehension get:     2505.1 µs (  1.0)
       lists:foldr/3 get:     2813.3 µs (  1.1)
         lists:map/2 get:     2844.0 µs (  1.1)
      queue traverse get:    15101.6 µs (  6.0)
    reverse/traverse get:     5721.9 µs (  2.3)
     traverse/append get: 13025345.9 µs (*****)


The results show that any traversal that keeps the list size the same should use a list comprehension. A traversal that must alter the size of the list could use lists:foldr/3 with the [H|T] syntax, but that usage is not tail-recursive. A tail-recursive solution must use the reverse/traverse method, accepting slightly slower performance (the choice is described in the Erlang Efficiency Guide). Robert Virding mentioned that the reverse/traverse method should be preferred since it is tail-recursive and keeps memory consumption low when processing large lists. This test currently only covers the reverse/traverse variant where the reverse is performed before the traversal, since performing the reversal after the traversal should have very similar characteristics.

The important point to remember from the benchmark is to never use the "traverse/append" method, which does an in-order traversal that relies on the ++ [Element] syntax to append each element to the new list (newly created on each append operation). Avoiding "traverse/append" is somewhat contrary to the Erlang Efficiency Guide, which states that '++' is not slow; however, with the other options available, "traverse/append" is easy to avoid. A sketch of each method follows below.
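The traversal methods compared above look roughly like the following (a sketch, not the erlbench code itself; F is the per-element transformation):

-module(traversal_sketch).
-export([comprehension/2, foldr/2, reverse_traverse/2, traverse_append/2]).

comprehension(F, L) ->                   % fastest when the size is kept
    [F(E) || E <- L].

foldr(F, L) ->                           % in-order, but not tail-recursive
    lists:foldr(fun(E, Acc) -> [F(E) | Acc] end, [], L).

reverse_traverse(F, L) ->                % the tail-recursive alternative
    reverse_traverse(F, lists:reverse(L), []).
reverse_traverse(_, [], Acc) ->
    Acc;
reverse_traverse(F, [H | T], Acc) ->
    reverse_traverse(F, T, [F(H) | Acc]).

traverse_append(F, L) ->                 % the slow method: avoid!
    traverse_append(F, L, []).
traverse_append(_, [], Acc) ->
    Acc;
traverse_append(F, [H | T], Acc) ->
    traverse_append(F, T, Acc ++ [F(H)]).  % copies Acc on every append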

Thanks go to Robert Virding and Andrew Thompson for their suggestions for improving the benchmark.
I added a way to automate all the interesting benchmarks for the Erlang data structures (including a trie). Below are the results for non-HiPE R14B01 on Linux version 2.6.32-21, Ubuntu 10.04 with an Intel(R) Core(TM)2 CPU T5600 @ 1.83GHz.

TEST string_key
N == 1000 (10 runs)
              aadict get:      812.0 µs (  1.3), set:     2207.6 µs (  1.4)
                dict get:     1051.2 µs (  1.7), set:     3267.3 µs (  2.0)
           ets (set) get:      888.7 µs (  1.5), set:     1611.8 µs (  1.0)
       ets x10 (set) get:     1092.5 µs (  1.8)
            gb_trees get:      938.1 µs (  1.6), set:     2803.5 µs (  1.7)
             orddict get:    25313.7 µs ( 41.9), set:    22926.3 µs ( 14.2)
  process dictionary get:      604.2 µs (  1.0), set:     1623.2 µs (  1.0)
              rbdict get:      801.3 µs (  1.3), set:     1769.9 µs (  1.1)
                trie get:      711.0 µs (  1.2), set:     2528.4 µs (  1.6)
N == 10000 (10 runs)
              aadict get:    22826.8 µs (  2.6), set:    44371.5 µs (  1.8)
                dict get:    18529.8 µs (  2.1), set:    61161.0 µs (  2.5)
           ets (set) get:    18280.8 µs (  2.1), set:    24070.6 µs (  1.0)
       ets x10 (set) get:    22830.2 µs (  2.6)
            gb_trees get:    24896.9 µs (  2.8), set:    50116.9 µs (  2.1)
             orddict get:  2674456.1 µs (302.2), set:  2442526.4 µs (101.5)
  process dictionary get:    13625.8 µs (  1.5), set:    26584.9 µs (  1.1)
              rbdict get:    22842.5 µs (  2.6), set:    39537.7 µs (  1.6)
                trie get:     8848.6 µs (  1.0), set:    35448.0 µs (  1.5)
N == 50000 (10 runs)
              aadict get:   177210.4 µs (  2.8), set:   350989.2 µs (  2.5)
                dict get:   107799.7 µs (  1.7), set:   638809.5 µs (  4.6)
           ets (set) get:    92473.9 µs (  1.5), set:   139183.1 µs (  1.0)
       ets x10 (set) get:   108796.4 µs (  1.7)
            gb_trees get:   194868.6 µs (  3.1), set:   374895.5 µs (  2.7)
             orddict get: 80504982.5 µs (*****), set: 68700235.0 µs (493.6)
  process dictionary get:    63323.8 µs (  1.0), set:   141767.2 µs (  1.0)
              rbdict get:   182692.8 µs (  2.9), set:   288624.4 µs (  2.1)
                trie get:    71009.8 µs (  1.1), set:   354443.1 µs (  2.5)
TEST integer_key
N == 1000 (10 runs)
              aadict get:      481.9 µs (  5.5), set:     2202.4 µs ( 16.9)
     array (dynamic) get:      379.6 µs (  4.3), set:      805.8 µs (  6.2)
       array (fixed) get:      384.7 µs (  4.4), set:      850.3 µs (  6.5)
                dict get:      459.1 µs (  5.2), set:     2122.6 µs ( 16.3)
           ets (set) get:      337.0 µs (  3.8), set:      611.3 µs (  4.7)
       ets x10 (set) get:      203.9 µs (  2.3)
            gb_trees get:      489.8 µs (  5.5), set:     4817.7 µs ( 37.1)
             orddict get:    16415.9 µs (185.9), set:      130.0 µs (  1.0)
  process dictionary get:      131.8 µs (  1.5), set:      157.0 µs (  1.2)
              rbdict get:      483.5 µs (  5.5), set:     2270.6 µs ( 17.5)
               tuple get:       88.3 µs (  1.0), set:    11688.3 µs ( 89.9)
N == 10000 (10 runs)
              aadict get:     7085.1 µs (  8.2), set:    23029.4 µs ( 18.2)
     array (dynamic) get:     5056.6 µs (  5.9), set:    12024.1 µs (  9.5)
       array (fixed) get:     5086.3 µs (  5.9), set:    11836.4 µs (  9.4)
                dict get:     5362.6 µs (  6.2), set:    32728.6 µs ( 25.9)
           ets (set) get:     3768.3 µs (  4.4), set:     7688.5 µs (  6.1)
       ets x10 (set) get:     1984.2 µs (  2.3)
            gb_trees get:     6315.3 µs (  7.3), set:    62973.9 µs ( 49.8)
             orddict get:  1651408.1 µs (*****), set:     1264.6 µs (  1.0)
  process dictionary get:     1329.4 µs (  1.5), set:     1614.8 µs (  1.3)
              rbdict get:     7058.8 µs (  8.2), set:    21787.7 µs ( 17.2)
               tuple get:      861.5 µs (  1.0), set:  1023597.5 µs (809.4)
N == 50000 (10 runs)
              aadict get:    37768.6 µs (  9.3), set:   163942.9 µs ( 27.2)
     array (dynamic) get:    30625.4 µs (  7.6), set:    67480.3 µs ( 11.2)
       array (fixed) get:    30555.5 µs (  7.5), set:    83706.3 µs ( 13.9)
                dict get:    29288.6 µs (  7.2), set:   435990.0 µs ( 72.2)
           ets (set) get:    26412.3 µs (  6.5), set:    55838.4 µs (  9.2)
       ets x10 (set) get:    12812.3 µs (  3.2)
            gb_trees get:    36219.4 µs (  8.9), set:   396159.8 µs ( 65.6)
             orddict get: 42714046.1 µs (*****), set:     6037.3 µs (  1.0)
  process dictionary get:     6583.2 µs (  1.6), set:     9550.8 µs (  1.6)
              rbdict get:    37226.9 µs (  9.2), set:   146739.7 µs ( 24.3)
               tuple get:     4052.0 µs (  1.0), set: 19153655.7 µs (*****)


HiPE results:
TEST string_key
N == 1000 (10 runs)
              aadict get:      839.5 µs (  1.5), set:     2558.9 µs (  1.6)
                dict get:      946.7 µs (  1.7), set:     3233.6 µs (  2.0)
           ets (set) get:      836.5 µs (  1.5), set:     1654.7 µs (  1.0)
       ets x10 (set) get:      914.8 µs (  1.6)
            gb_trees get:      945.8 µs (  1.7), set:     3183.7 µs (  2.0)
             orddict get:    23305.9 µs ( 40.7), set:    23718.0 µs ( 15.0)
  process dictionary get:      573.2 µs (  1.0), set:     1582.6 µs (  1.0)
              rbdict get:      831.6 µs (  1.5), set:     2134.6 µs (  1.3)
                trie get:      652.9 µs (  1.1), set:     2379.5 µs (  1.5)
N == 10000 (10 runs)
              aadict get:    21463.1 µs (  2.4), set:    43555.5 µs (  2.0)
                dict get:    18086.4 µs (  2.1), set:    60505.8 µs (  2.7)
           ets (set) get:    17702.1 µs (  2.0), set:    22067.5 µs (  1.0)
       ets x10 (set) get:    21480.2 µs (  2.4)
            gb_trees get:    22833.0 µs (  2.6), set:    50190.2 µs (  2.3)
             orddict get:  2467552.8 µs (280.5), set:  2334416.3 µs (105.8)
  process dictionary get:    13326.6 µs (  1.5), set:    26943.1 µs (  1.2)
              rbdict get:    21257.1 µs (  2.4), set:    36856.8 µs (  1.7)
                trie get:     8796.2 µs (  1.0), set:    35143.5 µs (  1.6)
N == 50000 (10 runs)
              aadict get:   178153.0 µs (  2.8), set:   328912.0 µs (  2.4)
                dict get:   108123.8 µs (  1.7), set:   641260.2 µs (  4.6)
           ets (set) get:    95263.1 µs (  1.5), set:   138832.9 µs (  1.0)
       ets x10 (set) get:   109103.1 µs (  1.7)
            gb_trees get:   195767.6 µs (  3.1), set:   372523.4 µs (  2.7)
             orddict get: 71658356.8 µs (*****), set: 66819443.2 µs (481.3)
  process dictionary get:    63528.8 µs (  1.0), set:   141625.8 µs (  1.0)
              rbdict get:   179620.9 µs (  2.8), set:   289155.1 µs (  2.1)
                trie get:    69662.6 µs (  1.1), set:   306391.0 µs (  2.2)
TEST integer_key
N == 1000 (10 runs)
              aadict get:      490.9 µs (  5.6), set:     2064.5 µs ( 14.4)
     array (dynamic) get:      379.1 µs (  4.3), set:    10328.3 µs ( 72.1)
       array (fixed) get:      387.9 µs (  4.4), set:      859.7 µs (  6.0)
                dict get:      473.7 µs (  5.4), set:     2312.0 µs ( 16.1)
           ets (set) get:      324.6 µs (  3.7), set:      572.9 µs (  4.0)
       ets x10 (set) get:      253.0 µs (  2.9)
            gb_trees get:     1501.1 µs ( 17.0), set:     5331.8 µs ( 37.2)
             orddict get:    16627.7 µs (188.1), set:      143.3 µs (  1.0)
  process dictionary get:      130.6 µs (  1.5), set:      165.4 µs (  1.2)
              rbdict get:      484.5 µs (  5.5), set:     1858.8 µs ( 13.0)
               tuple get:       88.4 µs (  1.0), set:    10606.2 µs ( 74.0)
N == 10000 (10 runs)
              aadict get:     7629.1 µs (  9.5), set:    22593.4 µs ( 19.4)
     array (dynamic) get:     5058.1 µs (  6.3), set:    13388.1 µs ( 11.5)
       array (fixed) get:     5442.5 µs (  6.8), set:    12669.2 µs ( 10.9)
                dict get:     5522.5 µs (  6.9), set:    34435.3 µs ( 29.5)
           ets (set) get:     3692.8 µs (  4.6), set:     7651.2 µs (  6.6)
       ets x10 (set) get:     2204.8 µs (  2.7)
            gb_trees get:     6485.1 µs (  8.1), set:    66586.3 µs ( 57.0)
             orddict get:  1635810.3 µs (*****), set:     1167.3 µs (  1.0)
  process dictionary get:     1301.4 µs (  1.6), set:     1923.7 µs (  1.6)
              rbdict get:     7530.5 µs (  9.4), set:    20872.8 µs ( 17.9)
               tuple get:      803.8 µs (  1.0), set:  1025256.6 µs (878.3)
N == 50000 (10 runs)
              aadict get:    38563.9 µs (  9.6), set:   147089.5 µs ( 24.1)
     array (dynamic) get:    34494.8 µs (  8.6), set:    84619.5 µs ( 13.9)
       array (fixed) get:    31674.2 µs (  7.9), set:    86212.3 µs ( 14.1)
                dict get:    31817.1 µs (  7.9), set:   549230.6 µs ( 90.0)
           ets (set) get:    26095.5 µs (  6.5), set:    56195.0 µs (  9.2)
       ets x10 (set) get:    14836.9 µs (  3.7)
            gb_trees get:    36555.7 µs (  9.1), set:   397905.1 µs ( 65.2)
             orddict get: 42921037.8 µs (*****), set:     6101.3 µs (  1.0)
  process dictionary get:     6665.2 µs (  1.7), set:    11555.6 µs (  1.9)
              rbdict get:    37324.8 µs (  9.3), set:   137461.0 µs ( 22.5)
               tuple get:     4025.4 µs (  1.0), set: 27603206.1 µs (*****)
I now have a more thorough Erlang data structure benchmark for storing data with either integer keys or string (list of integers) keys. The results show that the new trie data structure is a good choice for the storage of data with string keys (the typical use case for a trie). The performance of ets can change with the amount of concurrent access, since ets uses global state (as shown below with the 10 concurrent ets reads). Example (non-HiPE) results are below:

Integer Key:
N ==    10000
array (fixed):    get:     5065 µs, set:     9076 µs
array (dynamic):  get:     5040 µs, set:     9209 µs
tuple:            get:      820 µs, set:   249976 µs
gb_trees:         get:     6825 µs, set:    68632 µs
rbdict:           get:     6251 µs, set:    20912 µs
aadict:           get:     6397 µs, set:    24892 µs
orddict:          get:  1646868 µs, set:     2261 µs
dict:             get:     4541 µs, set:    26161 µs
ets (set):        get:     3972 µs, set:     7718 µs
process dict:     get:     1092 µs, set:     2904 µs
ets x10 (set):    get:     1943 µs

String Key:
N ==    10000
gb_trees:         get:    24420 µs, set:    47523 µs
rbdict:           get:    22337 µs, set:    39466 µs
aadict:           get:    22217 µs, set:    46358 µs
orddict:          get:  2694740 µs, set:  2470911 µs
dict:             get:    18034 µs, set:    63578 µs
trie:             get:    14826 µs, set:    47240 µs
ets (set):        get:    17645 µs, set:    23810 µs
process dict:     get:    13390 µs, set:    26796 µs
ets x10 (set):    get:    21102 µs


The benchmark results are from Linux version 2.6.32-21, Ubuntu 10.04 with an Intel(R) Core(TM)2 CPU T5600 @ 1.83GHz.
I created a trie implementation that can outperform the standard Erlang dict implementation (20% faster lookups) while providing the same dict module interface functions. A trie is an ordered tree data structure for string keys, whose name was coined by Edward Fredkin. My implementation is a PATRICIA trie (Practical Algorithm To Retrieve Information Coded In Alphanumeric, D. R. Morrison, 1968), since it stores the key suffix on a leaf node rather than creating extra deeply nested nodes with only single child nodes. The trie implementation could also provide a way of implementing Burstsort, one of the fastest known algorithms for sorting strings, though the performance might degrade due to Erlang being a functional language (since mutable state will not be cache-efficient as it would be in an imperative language).
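To illustrate the PATRICIA property (a simplified sketch, not the actual node layout of my trie implementation): once keys stop sharing a prefix, the remaining suffix can be stored whole on a leaf.

-module(patricia_sketch).
-export([demo/0]).

%% a leaf stores the remaining key suffix directly, instead of being
%% expanded into a chain of single-child nodes
find([C | Rest], {node, C, Children}) ->
    find_in(Rest, Children);
find(_, _) ->
    error.

find_in(Suffix, [{leaf, Suffix, Value} | _]) ->
    {ok, Value};                           % the leaf holds the whole suffix
find_in([C | _] = Suffix, [{node, C, _} = Node | _]) ->
    find(Suffix, Node);
find_in(Suffix, [_ | Rest]) ->
    find_in(Suffix, Rest);
find_in(_, []) ->
    error.

demo() ->
    %% "abc" and "abd" share the prefix "ab"; the suffixes "c" and "d"
    %% stay on the leaves
    T = {node, $a, [{node, $b, [{leaf, "c", 1}, {leaf, "d", 2}]}]},
    {ok, 1} = find("abc", T),
    {ok, 2} = find("abd", T),
    error = find("abx", T),
    ok.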

The trie implementation provides both the "foldl" function and the "foldr" function for alphabetical traversals of the trie; the "map" function currently traverses in reverse alphabetical order. Changes to the trie data structure are likely to be slow, since the structure relies on tuples (setting a tuple element is slow). Lookups are regarded as the most important operations on a trie data structure.
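A usage sketch (assuming the fold functions follow the argument order of dict:fold/3, i.e., fun(Key, Value, AccIn)):

T0 = trie:new(),
T1 = trie:store("abd", 2, trie:store("abc", 1, T0)),
%% foldr with list prepending yields the keys in alphabetical order
["abc", "abd"] = trie:foldr(fun(Key, _Value, Acc) -> [Key | Acc] end, [], T1).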

My test of the trie implementation's lookup speed relied on the locally installed wordlist (98569 words with an average length of 9.5 characters). The same operations were performed on the trie data structure and the dict data structure, 10 times each for both the HiPE installation and the non-HiPE installation. Unfortunately, the HiPE results were slower than the non-HiPE runs, possibly because my usage of integers is an uncommon case. For 10 HiPE runs, the trie took 541655.2 µs and the dict took 623698.5 µs, a 1.15 speedup factor and a 13.2% improvement. For 10 non-HiPE runs, the trie took 505989.0 µs and the dict took 635995.5 µs, a 1.26 speedup factor and a 20.4% improvement. The wordlist was fully stored and then fully accessed, using "is_key", "fetch", and "find". The benchmark results (from the test_speed function) are from Linux version 2.6.32-21, Ubuntu 10.04 with an Intel(R) Core(TM)2 CPU T5600 @ 1.83GHz.
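The shape of the lookup portion of the test is roughly the following (a sketch; the actual code is the test_speed function, and loading the wordlist into Words is omitted):

benchmark(Words) ->
    T = lists:foldl(fun(W, Acc) -> trie:store(W, true, Acc) end,
                    trie:new(), Words),
    {Microseconds, ok} = timer:tc(fun() ->
        lists:foreach(fun(W) ->
            true = trie:is_key(W, T),
            true = trie:fetch(W, T),
            {ok, true} = trie:find(W, T)
        end, Words)
    end),
    Microseconds.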

(The HiPE results were updated on 1/2/2011 after a proper Erlang compilation; note that when --enable-hipe fails during configure, no error is reported.)
I updated the Erlang index-based access benchmark to see how performance is with R14B01 (along with rbdict and aadict). The test was run 5 times for the HiPE and 5 times for the non-HiPE installation, with the following results:

HiPE:
array (fixed):    get:    5276.8 µs, set:    9489.4 µs
array (dynamic):  get:    5172.0 µs, set:    9616.6 µs
tuple:            get:     988.2 µs, set:  246523.8 µs
gb_trees:         get:    7245.0 µs, set:   67204.2 µs
rbdict:           get:    6379.8 µs, set:   21416.0 µs
aadict:           get:    6438.0 µs, set:   28434.4 µs
orddict:          get: 1632134.8 µs, set:    2214.6 µs
dict:             get:    4980.2 µs, set:   30754.8 µs

(normalized)
array (fixed):    get:       5.3, set:       4.3
array (dynamic):  get:       5.2, set:       4.3
tuple:            get:       1.0, set:     111.3
gb_trees:         get:       7.3, set:      30.3
rbdict:           get:       6.5, set:       9.7
aadict:           get:       6.5, set:      12.8
orddict:          get:    1651.6, set:       1.0
dict:             get:       5.0, set:      13.9

non-HiPE:
array (fixed):    get:    5188.4 µs, set:    9547.2 µs
array (dynamic):  get:    5164.6 µs, set:    9728.4 µs
tuple:            get:     993.4 µs, set:  256346.2 µs
gb_trees:         get:    6498.6 µs, set:   64902.6 µs
rbdict:           get:    6616.4 µs, set:   21696.6 µs
aadict:           get:    6548.4 µs, set:   27730.0 µs
orddict:          get: 1661093.2 µs, set:    2117.4 µs
dict:             get:    4781.4 µs, set:   30349.0 µs

(normalized)
array (fixed):    get:       5.2, set:       4.5
array (dynamic):  get:       5.2, set:       4.6
tuple:            get:       1.0, set:     121.1
gb_trees:         get:       6.5, set:      30.7
rbdict:           get:       6.7, set:      10.2
aadict:           get:       6.6, set:      13.1
orddict:          get:    1672.1, set:       1.0
dict:             get:       4.8, set:      14.3

The HiPE and non-HiPE results are very similar. The rbdict, aadict, and gb_trees data structures have very similar "get" times, but the gb_trees "set" time is still a bit slow (compared to rbdict and aadict). The dict implementation has the best "get" times among these dictionary data structures, but its "set" times are still a little slower than rbdict and aadict.

If "get" usage dominates, index-based access is always fastest on a tuple (so fixed size array usage might be questionable, unless it involves much "set" usage).

The benchmark results are from Linux version 2.6.32-21, Ubuntu 10.04 with an Intel(R) Core(TM)2 CPU T5600 @ 1.83GHz. My old results are here.

(The HiPE results were updated on 1/2/2011 after a proper Erlang compilation; note that when --enable-hipe fails during configure, no error is reported.)
The Erlang User Conference went rather quickly. I presented my work on Cloudi, a private cloud computing framework written mainly in Erlang. The application of a private cloud seems to be a bit intimidating and specialized, so the idea of Cloudi as a controller in a model-view-controller architecture, where the model exists in the database and the view is a thin layer accessing the database, provided some context for the talk. The Cloudi project is unique among the open-source cloud computing projects, so it might take a while before it finds a userbase.

I also found out about a new project at Uppsala University that is attempting a subset of the Cloudi functionality, but with a different aim. The Low Power Erlang-Based Cluster (LoPEC) project is focused on low power computation using OpenCL integration. So, it will be interesting to see how their end result might be influenced by Cloudi.

One important distinction is that the LoPEC project seems to be concerned with internode communication of work data (based on conference discussions). The Cloudi framework enforces an interface that excludes internode communication, i.e., inter-work-task communication ("node" in this context refers not to Erlang nodes but to computational nodes, which are worker threads in Cloudi), so that work parallelism is preserved (at the cost of only supporting "embarrassingly parallel" and "divide and conquer" algorithms). The Cloudi approach reflects a hope that divide and conquer algorithms can be created for work (using functional decomposition) that would otherwise rely on simplistic internode communication that prevents parallelism. Such divide and conquer algorithms are not trivial, but they should be a goal for efficiency. MPI or a similar internode communication method could still be used to communicate work data within Cloudi work, but that direction would be a conscious decision to bypass the framework.

Current Mood: happy

I decided to modify a pre-existing benchmark to include a few dictionary implementations. The newest dictionary implementation is rbdict, which is not yet included in the Erlang distribution. I ran the benchmark using the latest version (R12B-5, without HiPE) on an AMD Athlon(tm) 64 Processor 3000+ running Linux 2.6.20 as x86_64, with the results below:

array (fixed):    get:     8164 µs, set:    14293 µs
array (dynamic):  get:     8334 µs, set:    14421 µs
tuple:            get:     1472 µs, set:  1603101 µs
gb_trees:         get:   113114 µs, set:   212955 µs
rbdict:           get:     8523 µs, set:    36555 µs
orddict:          get:  4677552 µs, set:     3860 µs
dict:             get:    11223 µs, set:   153662 µs


The results show that rbdict is worth getting if you need an efficient dictionary implementation in Erlang.
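Since rbdict follows the standard dict interface, adopting it is mostly a module-name change (a sketch, assuming rbdict is compiled and on the code path):

D0 = rbdict:new(),
D1 = rbdict:store(some_key, some_value, D0),
some_value = rbdict:fetch(some_key, D1),
{ok, some_value} = rbdict:find(some_key, D1).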

I found a benchmark done by Robert Virding with random input. The input to the benchmark above was not random.
I took pictures in some of the museums and areas of Stockholm, Sweden while sightseeing.

The ABBA Museum has not been opened yet... maybe next time.

Current Mood: happy

Getting ready to go to Stockholm for the Erlang User Conference. There is a lot to see, and I don't think I will be going to The ABBA Museum... well... maybe.

The conference is at Ericsson's conference center. Ericsson is known for their AXD 301 system, which uses 1.7 million lines of Erlang to achieve 99.9999999% uptime (about 31 ms of downtime a year: nine nines leaves a one-in-a-billion downtime fraction of a roughly 31.5-million-second year). In addition to high availability, Erlang is known for its succinctness: Erlang code can do the same job as C++ or Java in 1/3rd to 1/18th the amount of code, depending on the task. Erlang is often faster as well, as long as the task is concurrent messaging rather than computationally intensive floating point, file-intensive, or regular expression work. If Erlang is not the correct choice for a task, the task can always be integrated as a separate node, to ensure fault tolerance in the system.

Source Lines Of Code (SLOC) is the simplest way to evaluate system complexity (not necessarily the "correct" way). It has been used with the Capability Maturity Model (CMM) to estimate the cost of software development under a typical waterfall development methodology, based on the COnstructive COst MOdel (COCOMO), which was developed in 1981.
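For reference, the basic COCOMO effort estimate can be written as a one-liner (a sketch; the coefficients are the published organic-mode values from the 1981 model):

%% Effort (person-months) = 2.4 * KLOC^1.05
cocomo_effort(SLOC) ->
    KLOC = SLOC / 1000,
    2.4 * math:pow(KLOC, 1.05).

For example, cocomo_effort(1700000) gives roughly 5900 person-months for a codebase the size of the AXD 301's 1.7 million lines.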

From a pragmatic point of view, less SLOC in a critical backend system reduces development time and makes the system easier to maintain. The resulting system is easier to understand and is more likely to have a longer lifespan.