?

Log in

Erlang Hash Table Layered Implementation - Michael Truog
okeuday
Erlang Hash Table Layered Implementation
After seeing the slow storage speed on my simple Erlang hash table implementation (hasht), I attempted to create a faster data structure. With smaller nested tuples that are smaller segments of a hash table, my new hashtl (Hash Table Layered) Erlang data structure does provide faster storage than hasht. However, hashtl has slightly slower fetch times that are similar to dict fetch times (dict uses a 2 layered hash table that is dynamically sized). The hashtl performance is roughly opposite the trie data structure, since the trie has quick "get" times on string keys, while the hashtl has quick "set" times on string keys (i.e., the ratios in the parentheses for the opposite operations are roughly equivalent between the two data structures, when only considering string keys). The benchmark results are from Linux version 2.6.32-21, Ubuntu 10.04 with an Intel(R) Core(TM)2 CPU T5600 @ 1.83GHz and non-HiPE Erlang R14B01:
TEST string_key
N == 50 (10 runs)
              aadict get:       19.0 µs (  1.1), set:       58.5 µs (  1.2)
                dict get:       32.5 µs (  1.9), set:      118.4 µs (  2.4)
           ets (set) get:       32.4 µs (  1.9), set:       54.7 µs (  1.1)
       ets x10 (set) get:       45.2 µs (  2.6)
            gb_trees get:       21.0 µs (  1.2), set:       92.4 µs (  1.9)
               hasht get:       21.6 µs (  1.3), set:      177.4 µs (  3.6)
              hashtl get:       31.5 µs (  1.8), set:       54.0 µs (  1.1)
  process dictionary get:       18.8 µs (  1.1), set:       53.1 µs (  1.1)
              rbdict get:       19.9 µs (  1.2), set:       49.4 µs (  1.0)
                trie get:       17.1 µs (  1.0), set:      105.6 µs (  2.1)
N == 100 (10 runs)
              aadict get:       44.3 µs (  1.3), set:      129.2 µs (  1.3)
                dict get:       75.2 µs (  2.1), set:      235.2 µs (  2.4)
           ets (set) get:       62.1 µs (  1.8), set:      103.8 µs (  1.1)
       ets x10 (set) get:      158.0 µs (  4.5)
            gb_trees get:       49.8 µs (  1.4), set:      188.3 µs (  1.9)
               hasht get:       42.7 µs (  1.2), set:      269.7 µs (  2.8)
              hashtl get:       56.2 µs (  1.6), set:       97.9 µs (  1.0)
  process dictionary get:       38.1 µs (  1.1), set:      112.4 µs (  1.1)
              rbdict get:       45.1 µs (  1.3), set:      107.4 µs (  1.1)
                trie get:       35.2 µs (  1.0), set:      198.5 µs (  2.0)
N == 250 (10 runs)
              aadict get:      138.8 µs (  1.5), set:      400.4 µs (  2.0)
                dict get:      175.6 µs (  1.9), set:      682.3 µs (  3.4)
           ets (set) get:      149.5 µs (  1.6), set:      255.2 µs (  1.3)
       ets x10 (set) get:      259.8 µs (  2.8)
            gb_trees get:      154.2 µs (  1.7), set:      530.6 µs (  2.6)
               hasht get:      131.7 µs (  1.4), set:      793.0 µs (  3.9)
              hashtl get:      167.7 µs (  1.8), set:      201.1 µs (  1.0)
  process dictionary get:       93.3 µs (  1.0), set:      340.0 µs (  1.7)
              rbdict get:      139.0 µs (  1.5), set:      338.8 µs (  1.7)
                trie get:       96.4 µs (  1.0), set:      510.3 µs (  2.5)
N == 500 (10 runs)
              aadict get:      340.8 µs (  1.8), set:      999.1 µs (  2.0)
                dict get:      372.0 µs (  2.0), set:     1555.1 µs (  3.1)
           ets (set) get:      307.4 µs (  1.6), set:      614.2 µs (  1.2)
       ets x10 (set) get:      461.6 µs (  2.5)
            gb_trees get:      359.9 µs (  1.9), set:     1358.1 µs (  2.7)
               hasht get:      472.9 µs (  2.5), set:     1968.7 µs (  3.9)
              hashtl get:      465.1 µs (  2.5), set:      506.7 µs (  1.0)
  process dictionary get:      187.6 µs (  1.0), set:      578.7 µs (  1.1)
              rbdict get:      327.2 µs (  1.7), set:      810.9 µs (  1.6)
                trie get:      214.7 µs (  1.1), set:     1148.8 µs (  2.3)
N == 1000 (10 runs)
              aadict get:      806.1 µs (  1.3), set:     2299.6 µs (  1.6)
                dict get:      931.0 µs (  1.5), set:     3238.2 µs (  2.2)
           ets (set) get:      804.5 µs (  1.3), set:     1534.6 µs (  1.0)
       ets x10 (set) get:      968.7 µs (  1.6)
            gb_trees get:      876.5 µs (  1.4), set:     2706.6 µs (  1.9)
               hasht get:     1324.3 µs (  2.2), set:    10091.8 µs (  6.9)
              hashtl get:     1045.4 µs (  1.7), set:     1462.6 µs (  1.0)
  process dictionary get:      604.8 µs (  1.0), set:     1465.2 µs (  1.0)
              rbdict get:      794.7 µs (  1.3), set:     1782.3 µs (  1.2)
                trie get:      636.7 µs (  1.1), set:     2463.2 µs (  1.7)
N == 2000 (10 runs)
              aadict get:     2024.5 µs (  1.1), set:     5195.9 µs (  1.6)
                dict get:     2688.9 µs (  1.5), set:     6884.1 µs (  2.1)
           ets (set) get:     2591.4 µs (  1.4), set:     5177.3 µs (  1.6)
       ets x10 (set) get:     2733.6 µs (  1.5)
            gb_trees get:     2226.3 µs (  1.2), set:     6122.8 µs (  1.9)
               hasht get:     2589.5 µs (  1.4), set:    30504.6 µs (  9.3)
              hashtl get:     2991.4 µs (  1.6), set:     3288.4 µs (  1.0)
  process dictionary get:     2114.5 µs (  1.2), set:     4030.5 µs (  1.2)
              rbdict get:     2025.2 µs (  1.1), set:     4224.6 µs (  1.3)
                trie get:     1837.8 µs (  1.0), set:     5675.4 µs (  1.7)
N == 4000 (10 runs)
              aadict get:     5926.7 µs (  1.4), set:    12774.6 µs (  1.8)
                dict get:     6796.7 µs (  1.5), set:    18243.7 µs (  2.6)
           ets (set) get:     6281.3 µs (  1.4), set:     7824.5 µs (  1.1)
       ets x10 (set) get:     8046.3 µs (  1.8)
            gb_trees get:     6301.9 µs (  1.4), set:    14376.7 µs (  2.1)
               hasht get:     5460.8 µs (  1.2), set:   103768.9 µs ( 14.9)
              hashtl get:     7137.7 µs (  1.6), set:     6957.0 µs (  1.0)
  process dictionary get:     5115.8 µs (  1.2), set:    10582.6 µs (  1.5)
              rbdict get:     5963.5 µs (  1.4), set:    10554.8 µs (  1.5)
                trie get:     4386.7 µs (  1.0), set:    13996.3 µs (  2.0)
N == 8000 (10 runs)
              aadict get:    16002.9 µs (  1.9), set:    31812.0 µs (  2.0)
                dict get:    14557.8 µs (  1.7), set:    45314.4 µs (  2.8)
           ets (set) get:     1.34e4 µs (  1.6), set:    17021.6 µs (  1.1)
       ets x10 (set) get:    17172.7 µs (  2.0)
            gb_trees get:    16952.9 µs (  2.0), set:    35821.8 µs (  2.2)
               hasht get:    11080.5 µs (  1.3), set:   408960.8 µs ( 25.5)
              hashtl get:    14790.3 µs (  1.7), set:    16040.3 µs (  1.0)
  process dictionary get:    10445.2 µs (  1.2), set:    21810.0 µs (  1.4)
              rbdict get:    15936.7 µs (  1.9), set:    28578.3 µs (  1.8)
                trie get:     8543.9 µs (  1.0), set:    27808.7 µs (  1.7)
N == 16000 (10 runs)
              aadict get:    40660.4 µs (  2.0), set:    76056.4 µs (  2.0)
                dict get:    30382.6 µs (  1.5), set:   115643.4 µs (  3.1)
           ets (set) get:    28500.2 µs (  1.4), set:    37204.1 µs (  1.0)
       ets x10 (set) get:    35423.6 µs (  1.7)
            gb_trees get:    43823.2 µs (  2.1), set:    88600.6 µs (  2.4)
               hasht get:    22523.7 µs (  1.1), set:  2302517.3 µs ( 61.9)
              hashtl get:    29762.8 µs (  1.4), set:    41664.0 µs (  1.1)
  process dictionary get:    20756.7 µs (  1.0), set:    45505.0 µs (  1.2)
              rbdict get:    40808.4 µs (  2.0), set:    67970.2 µs (  1.8)
                trie get:    21091.4 µs (  1.0), set:    83384.2 µs (  2.2)
N == 32000 (10 runs)
              aadict get:   100165.9 µs (  2.5), set:   182365.3 µs (  2.3)
                dict get:    65931.7 µs (  1.6), set:   281306.1 µs (  3.6)
           ets (set) get:    58365.9 µs (  1.4), set:    86377.8 µs (  1.1)
       ets x10 (set) get:    70832.4 µs (  1.7)
            gb_trees get:   108996.9 µs (  2.7), set:   201768.6 µs (  2.6)
               hasht get:    46683.5 µs (  1.1), set:  7559194.8 µs ( 96.3)
              hashtl get:    68377.4 µs (  1.7), set:    78516.3 µs (  1.0)
  process dictionary get:    40873.3 µs (  1.0), set:    90079.1 µs (  1.1)
              rbdict get:   101229.4 µs (  2.5), set:   160677.5 µs (  2.0)
                trie get:    52120.7 µs (  1.3), set:   205179.4 µs (  2.6)
N == 64000 (10 runs)
              aadict get:   236930.0 µs (  5.3), set:   434665.3 µs (  2.5)
                dict get:   140327.2 µs (  3.1), set:   955742.9 µs (  5.4)
           ets (set) get:   116628.2 µs (  2.6), set:   177744.3 µs (  1.0)
       ets x10 (set) get:   136503.0 µs (  3.0)
            gb_trees get:   258394.8 µs (  5.8), set:   475881.6 µs (  2.7)
               hasht get:    86721.3 µs (  1.9), set: 40883420.3 µs (232.6)
              hashtl get:   132184.1 µs (  2.9), set:   189342.2 µs (  1.1)
  process dictionary get:    80194.3 µs (  1.8), set:   175787.6 µs (  1.0)
              rbdict get:   236683.9 µs (  5.3), set:   393917.9 µs (  2.2)
                trie get:    44896.4 µs (  1.0), set:   406019.4 µs (  2.3)
TEST integer_key
N == 50 (10 runs)
              aadict get:       17.0 µs (  3.2), set:       70.9 µs (  7.2)
     array (dynamic) get:       15.5 µs (  2.9), set:       25.1 µs (  2.5)
       array (fixed) get:       15.5 µs (  2.9), set:       27.3 µs (  2.8)
                dict get:       22.9 µs (  4.3), set:       47.2 µs (  4.8)
           ets (set) get:       16.3 µs (  3.1), set:       28.9 µs (  2.9)
       ets x10 (set) get:       20.4 µs (  3.8)
            gb_trees get:       16.6 µs (  3.1), set:       99.9 µs ( 10.1)
               hasht get:       11.5 µs (  2.2), set:      117.8 µs ( 11.9)
              hashtl get:       16.0 µs (  3.0), set:       38.4 µs (  3.9)
  process dictionary get:        7.9 µs (  1.5), set:        9.9 µs (  1.0)
              rbdict get:       16.0 µs (  3.0), set:       43.2 µs (  4.4)
               tuple get:        5.3 µs (  1.0), set:       12.1 µs (  1.2)
N == 100 (10 runs)
              aadict get:       42.4 µs (  3.8), set:      127.8 µs (  6.7)
     array (dynamic) get:       35.7 µs (  3.2), set:       59.6 µs (  3.1)
       array (fixed) get:       36.5 µs (  3.3), set:       60.6 µs (  3.2)
                dict get:       58.0 µs (  5.2), set:      137.0 µs (  7.1)
           ets (set) get:       39.4 µs (  3.5), set:       70.6 µs (  3.7)
       ets x10 (set) get:       37.9 µs (  3.4)
            gb_trees get:       43.3 µs (  3.9), set:      544.8 µs ( 28.4)
               hasht get:       26.7 µs (  2.4), set:      257.0 µs ( 13.4)
              hashtl get:       40.5 µs (  3.6), set:      102.5 µs (  5.3)
  process dictionary get:       18.7 µs (  1.7), set:       19.2 µs (  1.0)
              rbdict get:       42.6 µs (  3.8), set:      120.3 µs (  6.3)
               tuple get:       11.1 µs (  1.0), set:       52.8 µs (  2.8)
N == 250 (10 runs)
              aadict get:      123.9 µs (  3.8), set:      392.2 µs (  8.6)
     array (dynamic) get:      121.6 µs (  3.7), set:      215.9 µs (  4.8)
       array (fixed) get:      121.9 µs (  3.7), set:      378.6 µs (  8.3)
                dict get:      143.7 µs (  4.4), set:      476.0 µs ( 10.5)
           ets (set) get:       98.0 µs (  3.0), set:      187.3 µs (  4.1)
       ets x10 (set) get:       73.5 µs (  2.3)
            gb_trees get:      124.0 µs (  3.8), set:      977.7 µs ( 21.5)
               hasht get:       68.0 µs (  2.1), set:      646.4 µs ( 14.2)
              hashtl get:      123.8 µs (  3.8), set:      329.9 µs (  7.3)
  process dictionary get:       40.8 µs (  1.3), set:       45.4 µs (  1.0)
              rbdict get:      124.2 µs (  3.8), set:      365.7 µs (  8.1)
               tuple get:       32.6 µs (  1.0), set:      273.5 µs (  6.0)
N == 500 (10 runs)
              aadict get:      272.6 µs (  5.8), set:      918.0 µs ( 10.3)
     array (dynamic) get:      208.8 µs (  4.5), set:      380.0 µs (  4.3)
       array (fixed) get:      209.1 µs (  4.5), set:      391.9 µs (  4.4)
                dict get:      267.8 µs (  5.7), set:     1089.2 µs ( 12.3)
           ets (set) get:      189.0 µs (  4.0), set:      339.7 µs (  3.8)
       ets x10 (set) get:      123.6 µs (  2.6)
            gb_trees get:      256.7 µs (  5.5), set:     2419.5 µs ( 27.2)
               hasht get:      152.6 µs (  3.3), set:     1541.5 µs ( 17.3)
              hashtl get:      253.2 µs (  5.4), set:      408.1 µs (  4.6)
  process dictionary get:       84.9 µs (  1.8), set:       88.9 µs (  1.0)
              rbdict get:      256.2 µs (  5.5), set:      840.5 µs (  9.5)
               tuple get:       46.9 µs (  1.0), set:     2125.3 µs ( 23.9)
N == 1000 (10 runs)
              aadict get:      478.9 µs (  5.4), set:     1683.9 µs ( 11.0)
     array (dynamic) get:      415.5 µs (  4.7), set:      857.5 µs (  5.6)
       array (fixed) get:      453.3 µs (  5.1), set:      888.8 µs (  5.8)
                dict get:      448.0 µs (  5.0), set:     2231.5 µs ( 14.5)
           ets (set) get:      334.7 µs (  3.8), set:     1006.4 µs (  6.6)
       ets x10 (set) get:      203.2 µs (  2.3)
            gb_trees get:      477.3 µs (  5.4), set:     4498.8 µs ( 29.3)
               hasht get:      321.7 µs (  3.6), set:     7766.1 µs ( 50.6)
              hashtl get:      561.4 µs (  6.3), set:      830.2 µs (  5.4)
  process dictionary get:      138.8 µs (  1.6), set:      153.6 µs (  1.0)
              rbdict get:      488.3 µs (  5.5), set:     1596.8 µs ( 10.4)
               tuple get:       89.0 µs (  1.0), set:     7330.6 µs ( 47.7)
N == 2000 (10 runs)
              aadict get:     1096.2 µs (  6.4), set:     3939.2 µs ( 13.9)
     array (dynamic) get:      996.5 µs (  5.8), set:     2026.3 µs (  7.2)
       array (fixed) get:      994.3 µs (  5.8), set:     1856.3 µs (  6.6)
                dict get:      934.0 µs (  5.5), set:     3885.4 µs ( 13.7)
           ets (set) get:      714.5 µs (  4.2), set:     2270.4 µs (  8.0)
       ets x10 (set) get:      390.3 µs (  2.3)
            gb_trees get:     1061.6 µs (  6.2), set:     9302.3 µs ( 32.9)
               hasht get:      559.6 µs (  3.3), set:    26775.2 µs ( 94.6)
              hashtl get:     1239.9 µs (  7.3), set:     1985.6 µs (  7.0)
  process dictionary get:      268.6 µs (  1.6), set:      283.1 µs (  1.0)
              rbdict get:     1083.6 µs (  6.3), set:     3509.0 µs ( 12.4)
               tuple get:      170.9 µs (  1.0), set:    22928.5 µs ( 81.0)
N == 4000 (10 runs)
              aadict get:     2598.7 µs (  7.5), set:     9143.9 µs ( 15.4)
     array (dynamic) get:     1991.0 µs (  5.7), set:     3780.1 µs (  6.3)
       array (fixed) get:     1994.8 µs (  5.8), set:     3619.8 µs (  6.1)
                dict get:     1959.2 µs (  5.7), set:    10221.3 µs ( 17.2)
           ets (set) get:     1450.3 µs (  4.2), set:     4560.7 µs (  7.7)
       ets x10 (set) get:      766.1 µs (  2.2)
            gb_trees get:     2220.8 µs (  6.4), set:    22137.9 µs ( 37.2)
               hasht get:     1032.0 µs (  3.0), set:   107334.7 µs (180.2)
              hashtl get:     2263.7 µs (  6.5), set:     3107.2 µs (  5.2)
  process dictionary get:      526.6 µs (  1.5), set:      595.6 µs (  1.0)
              rbdict get:     2581.0 µs (  7.4), set:     8322.6 µs ( 14.0)
               tuple get:      346.5 µs (  1.0), set:    91854.1 µs (154.2)
N == 8000 (10 runs)
              aadict get:     5625.3 µs (  8.6), set:    21650.5 µs ( 19.5)
     array (dynamic) get:     3998.4 µs (  6.1), set:     7457.9 µs (  6.7)
       array (fixed) get:     4014.6 µs (  6.2), set:     7380.2 µs (  6.7)
                dict get:     4200.5 µs (  6.5), set:    21589.1 µs ( 19.5)
           ets (set) get:     2958.9 µs (  4.5), set:     8535.7 µs (  7.7)
       ets x10 (set) get:     1509.6 µs (  2.3)
            gb_trees get:     4922.7 µs (  7.6), set:    55256.5 µs ( 49.9)
               hasht get:     1990.7 µs (  3.1), set:   394366.0 µs (356.0)
              hashtl get:     5715.1 µs (  8.8), set:     6898.9 µs (  6.2)
  process dictionary get:     1057.2 µs (  1.6), set:     1107.8 µs (  1.0)
              rbdict get:     5455.4 µs (  8.4), set:    24543.2 µs ( 22.2)
               tuple get:      650.9 µs (  1.0), set:   365080.7 µs (329.6)
N == 16000 (10 runs)
              aadict get:    11302.9 µs (  8.7), set:    38483.2 µs ( 15.1)
     array (dynamic) get:     9810.6 µs (  7.6), set:    21888.0 µs (  8.6)
       array (fixed) get:    10283.3 µs (  7.9), set:    21910.1 µs (  8.6)
                dict get:     8249.0 µs (  6.4), set:    60084.0 µs ( 23.6)
           ets (set) get:     5965.8 µs (  4.6), set:    14583.2 µs (  5.7)
       ets x10 (set) get:     3052.2 µs (  2.4)
            gb_trees get:    10263.4 µs (  7.9), set:   126358.0 µs ( 49.7)
               hasht get:     3966.8 µs (  3.1), set:  1570470.8 µs (617.4)
              hashtl get:    11766.2 µs (  9.1), set:    17786.2 µs (  7.0)
  process dictionary get:     2094.7 µs (  1.6), set:     2543.8 µs (  1.0)
              rbdict get:    11263.8 µs (  8.7), set:    36493.2 µs ( 14.3)
               tuple get:     1294.3 µs (  1.0), set:  1486397.5 µs (584.3)
N == 32000 (10 runs)
              aadict get:    22952.3 µs (  8.9), set:    85579.9 µs ( 14.6)
     array (dynamic) get:    19565.7 µs (  7.6), set:    40450.3 µs (  6.9)
       array (fixed) get:    19597.6 µs (  7.6), set:    39994.8 µs (  6.8)
                dict get:    17543.6 µs (  6.8), set:   185507.8 µs ( 31.7)
           ets (set) get:    13456.0 µs (  5.2), set:    29891.5 µs (  5.1)
       ets x10 (set) get:     6824.8 µs (  2.6)
            gb_trees get:    22242.6 µs (  8.6), set:   244581.5 µs ( 41.8)
               hasht get:     8159.9 µs (  3.2), set:  6891457.9 µs (*****)
              hashtl get:    24608.9 µs (  9.5), set:    43185.2 µs (  7.4)
  process dictionary get:     4210.2 µs (  1.6), set:     5853.7 µs (  1.0)
              rbdict get:    22806.8 µs (  8.8), set:    84932.4 µs ( 14.5)
               tuple get:     2581.2 µs (  1.0), set:  6422056.3 µs (*****)
N == 64000 (10 runs)
              aadict get:    47862.1 µs (  9.4), set:   191716.5 µs ( 15.6)
     array (dynamic) get:    39008.0 µs (  7.7), set:    77311.6 µs (  6.3)
       array (fixed) get:    38962.4 µs (  7.6), set:    78340.9 µs (  6.4)
                dict get:    38602.2 µs (  7.6), set:   625891.7 µs ( 50.9)
           ets (set) get:    35109.1 µs (  6.9), set:    69833.4 µs (  5.7)
       ets x10 (set) get:    16434.1 µs (  3.2)
            gb_trees get:    46772.8 µs (  9.2), set:   513581.4 µs ( 41.8)
               hasht get:    18970.6 µs (  3.7), set: 33729619.8 µs (*****)
              hashtl get:    59980.1 µs ( 11.8), set:   108446.2 µs (  8.8)
  process dictionary get:     8468.0 µs (  1.7), set:    12288.2 µs (  1.0)
              rbdict get:    48106.1 µs (  9.4), set:   178132.4 µs ( 14.5)
               tuple get:     5097.6 µs (  1.0), set: 35494023.9 µs (*****)
Leave a comment