You are viewing okeuday

Michael Truog - trie retest
okeuday
trie retest
The older trie (speed) testing didn't experiment with Erlang compilation options and used Erlang R14B01 (newest, at the time). To examine how performance has changed, I wanted to rerun the tests with Erlang R15B03 on newer hardware. I have tested various HiPE optimization levels (Erlang configuration: --enable-threads --enable-smp-support --enable-kernel-poll --enable-hipe --enable-native-libs) and running without optimization (Erlang configuration: --enable-threads --enable-smp-support --enable-kernel-poll --disable-hipe). The hardware for these tests is: Core i7 2670QM 2.2GHz 4 cores, 8 hyper-threads, L2:4×256KB L3:6MB RAM:8GB:DDR3-1333MHz, Sandy Bridge-HE-4 (Socket G2), Linux 3.2.0-29-generic x86_64, Ubuntu 12.04.1 LTS. To understand the Erlang compilation options for each optimization level, look at the makefile.

The older results and the summary showed similar performance between the trie and the process dictionary with larger tries (50000 elements), for the trie:find/2 operation (shown as "get" in the results). The new testing only deals with N as 1000, 10000, and 100000 instead of 1000, 10000, and 50000, to look at orders of magnitude difference. When doing the new testing, the smaller tries (1000 elements) showed similar performance between the trie and the process dictionary (i.e., the ideal usage changed). The difference became more dramatic with HiPE optimization level 1, such that the trie outperformed the process dictionary for N as 1000 and 10000. Outperforming the Erlang process dictionary is very surprising and atypical since the process dictionary is normally the fastest storage available, but it is normally avoided since it eliminates the source code's referential transparency making the code harder to test, maintain, and understand.

Using the trie data structure instead of the process dictionary allows you to provide a scope for your data, since an Erlang process only has a single shared process dictionary (i.e., one process dictionary per Erlang process). The trie data structure also allows the state to be saved as Erlang terms, so it can easily be put in ets, a database, or sent between processes.

While it is unclear if HiPE is used within production Erlang systems (by people that need stability), it is good to see that the trie can benefit from the simplest optimizations available to Erlang (HiPE optimization level 1). The exact source of the trie performance change between R14B01 and R15B03 is also unclear because of how many changes influence the Erlang runtime system, but the trend from larger tries being faster to smaller tries being faster helps to make the trie data structure more applicable to common tasks, as the best solution for storing associative lookup data with string keys (where a string is a list of integers, not a binary). Feel free to try the string_key test within your own software development environment.

make OPTIMIZE=0
erl -noshell -pz ebin -s run test -s init stop
TEST string_key
N == 1000 (10 runs)
              aadict get:      431.8 us (  1.4), set:     1116.5 us (  1.5)
                dict get:      443.7 us (  1.5), set:     1791.7 us (  2.4)
   ets (ordered_set) get:      791.9 us (  2.6), set:     1497.6 us (  2.0)
           ets (set) get:      364.4 us (  1.2), set:      890.3 us (  1.2)
ets x10 read (ordere get:      655.3 us (  2.2)
  ets x10 read (set) get:      593.7 us (  2.0)
            gb_trees get:      459.4 us (  1.5), set:     1518.7 us (  2.0)
              hashtl get:      915.7 us (  3.0), set:     1270.2 us (  1.7)
  process dictionary get:      316.6 us (  1.0), set:      741.5 us (  1.0)
              rbdict get:      420.0 us (  1.4), set:      940.5 us (  1.3)
                trie get:      302.9 us (  1.0), set:      978.1 us (  1.3)
N == 10000 (10 runs)
              aadict get:     8903.2 us (  1.4), set:    20652.3 us (  2.0)
                dict get:     8144.9 us (  1.3), set:    26353.0 us (  2.6)
   ets (ordered_set) get:     9488.3 us (  1.5), set:    12515.1 us (  1.2)
           ets (set) get:     8149.0 us (  1.3), set:    10223.1 us (  1.0)
ets x10 read (ordere get:     7549.7 us (  1.2)
  ets x10 read (set) get:     6636.2 us (  1.1)
            gb_trees get:     9630.4 us (  1.6), set:    24495.7 us (  2.4)
              hashtl get:     9724.2 us (  1.6), set:    16015.6 us (  1.6)
  process dictionary get:     6178.1 us (  1.0), set:    14000.4 us (  1.4)
              rbdict get:     8522.1 us (  1.4), set:    18301.5 us (  1.8)
                trie get:     7215.9 us (  1.2), set:    18760.3 us (  1.8)
N == 100000 (10 runs)
              aadict get:   217818.2 us (  2.9), set:   363481.8 us (  3.1)
                dict get:   120559.7 us (  1.6), set:   702588.1 us (  6.0)
   ets (ordered_set) get:   140960.1 us (  1.9), set:   160380.2 us (  1.4)
           ets (set) get:    92217.0 us (  1.2), set:   117812.8 us (  1.0)
ets x10 read (ordere get:    94970.5 us (  1.3)
  ets x10 read (set) get:    77114.3 us (  1.0)
            gb_trees get:   240592.4 us (  3.2), set:   413380.8 us (  3.5)
              hashtl get:   123689.9 us (  1.7), set:   162366.1 us (  1.4)
  process dictionary get:    74274.1 us (  1.0), set:   145834.0 us (  1.2)
              rbdict get:   216569.4 us (  2.9), set:   329517.9 us (  2.8)
                trie get:   128545.4 us (  1.7), set:   469294.1 us (  4.0)

make OPTIMIZE=1
erl -noshell -pz ebin -s run test -s init stop
TEST string_key
N == 1000 (10 runs)
              aadict get:      364.6 us (  2.5), set:      682.8 us (  1.2)
                dict get:      338.6 us (  2.3), set:     1243.5 us (  2.3)
   ets (ordered_set) get:      575.0 us (  3.9), set:      795.2 us (  1.4)
           ets (set) get:      344.1 us (  2.3), set:      572.9 us (  1.0)
ets x10 read (ordere get:      569.3 us (  3.8)
  ets x10 read (set) get:      675.6 us (  4.6)
            gb_trees get:      404.0 us (  2.7), set:     1024.3 us (  1.9)
              hashtl get:      488.2 us (  3.3), set:      571.5 us (  1.0)
  process dictionary get:      228.3 us (  1.5), set:      783.6 us (  1.4)
              rbdict get:      350.9 us (  2.4), set:      551.9 us (  1.0)
                trie get:      148.2 us (  1.0), set:      637.4 us (  1.2)
N == 10000 (10 runs)
              aadict get:     8243.8 us (  1.5), set:    14899.0 us (  1.3)
                dict get:     7905.4 us (  1.5), set:    22639.6 us (  2.0)
   ets (ordered_set) get:     9647.8 us (  1.8), set:    12273.3 us (  1.1)
           ets (set) get:     8741.0 us (  1.6), set:    11131.5 us (  1.0)
ets x10 read (ordere get:     8152.5 us (  1.5)
  ets x10 read (set) get:     7511.2 us (  1.4)
            gb_trees get:     8854.4 us (  1.6), set:    17573.0 us (  1.6)
              hashtl get:     8872.8 us (  1.6), set:    14108.0 us (  1.3)
  process dictionary get:     6474.1 us (  1.2), set:    15250.0 us (  1.4)
              rbdict get:     7876.7 us (  1.5), set:    12443.8 us (  1.1)
                trie get:     5395.0 us (  1.0), set:    13855.8 us (  1.2)
N == 100000 (10 runs)
              aadict get:   203331.7 us (  3.2), set:   290739.7 us (  2.8)
                dict get:   115243.7 us (  1.8), set:   636958.5 us (  6.0)
   ets (ordered_set) get:   138594.7 us (  2.1), set:   156941.5 us (  1.5)
           ets (set) get:    83570.1 us (  1.3), set:   105417.7 us (  1.0)
ets x10 read (ordere get:    95338.7 us (  1.5)
  ets x10 read (set) get:    68988.9 us (  1.1)
            gb_trees get:   228740.4 us (  3.5), set:   334276.2 us (  3.2)
              hashtl get:   106109.2 us (  1.6), set:   119801.5 us (  1.1)
  process dictionary get:    64525.8 us (  1.0), set:   137005.9 us (  1.3)
              rbdict get:   199559.6 us (  3.1), set:   257816.9 us (  2.4)
                trie get:    95711.2 us (  1.5), set:   426041.9 us (  4.0)

make OPTIMIZE=2
erl -noshell -pz ebin -s run test -s init stop
TEST string_key
N == 1000 (10 runs)
              aadict get:      473.4 us (  3.3), set:      763.9 us (  1.4)
                dict get:      435.0 us (  3.0), set:     1547.7 us (  2.9)
   ets (ordered_set) get:      574.4 us (  4.0), set:      805.5 us (  1.5)
           ets (set) get:      339.6 us (  2.3), set:      589.4 us (  1.1)
ets x10 read (ordere get:      623.7 us (  4.3)
  ets x10 read (set) get:      585.3 us (  4.0)
            gb_trees get:      408.5 us (  2.8), set:     1014.8 us (  1.9)
              hashtl get:      492.9 us (  3.4), set:      597.2 us (  1.1)
  process dictionary get:      222.7 us (  1.5), set:      791.6 us (  1.5)
              rbdict get:      352.2 us (  2.4), set:      540.2 us (  1.0)
                trie get:      144.9 us (  1.0), set:      775.0 us (  1.4)
N == 10000 (10 runs)
              aadict get:     8240.3 us (  1.6), set:    14352.5 us (  1.4)
                dict get:     7820.1 us (  1.5), set:    22096.4 us (  2.1)
   ets (ordered_set) get:     9841.4 us (  1.9), set:    12463.4 us (  1.2)
           ets (set) get:     8539.8 us (  1.6), set:    10536.7 us (  1.0)
ets x10 read (ordere get:     8252.2 us (  1.6)
  ets x10 read (set) get:     7503.8 us (  1.4)
            gb_trees get:     8956.5 us (  1.7), set:    17869.8 us (  1.7)
              hashtl get:     8733.6 us (  1.7), set:    15034.7 us (  1.4)
  process dictionary get:     6167.1 us (  1.2), set:    14606.7 us (  1.4)
              rbdict get:     7936.7 us (  1.5), set:    12769.4 us (  1.2)
                trie get:     5220.3 us (  1.0), set:    14063.3 us (  1.3)
N == 100000 (10 runs)
              aadict get:   213701.7 us (  3.1), set:   299393.6 us (  2.7)
                dict get:   118862.2 us (  1.7), set:   637653.9 us (  5.7)
   ets (ordered_set) get:   131884.9 us (  1.9), set:   151626.0 us (  1.4)
           ets (set) get:    90851.1 us (  1.3), set:   112281.9 us (  1.0)
ets x10 read (ordere get:    90603.8 us (  1.3)
  ets x10 read (set) get:    74600.5 us (  1.1)
            gb_trees get:   234143.5 us (  3.4), set:   337062.8 us (  3.0)
              hashtl get:    99279.3 us (  1.4), set:   118642.3 us (  1.1)
  process dictionary get:    69277.4 us (  1.0), set:   141672.6 us (  1.3)
              rbdict get:   206904.5 us (  3.0), set:   265790.1 us (  2.4)
                trie get:   100507.5 us (  1.5), set:   437189.1 us (  3.9)

make OPTIMIZE=3
erl -noshell -pz ebin -s run test -s init stop
TEST string_key
N == 1000 (10 runs)
              aadict get:      465.3 us (  2.6), set:      790.3 us (  1.4)
                dict get:      424.2 us (  2.3), set:     1523.1 us (  2.7)
   ets (ordered_set) get:      569.2 us (  3.1), set:      798.2 us (  1.4)
           ets (set) get:      333.5 us (  1.8), set:      567.9 us (  1.0)
ets x10 read (ordere get:      598.3 us (  3.3)
  ets x10 read (set) get:      615.6 us (  3.4)
            gb_trees get:      509.1 us (  2.8), set:     1186.2 us (  2.1)
              hashtl get:      464.7 us (  2.6), set:      585.9 us (  1.0)
  process dictionary get:      224.2 us (  1.2), set:      817.1 us (  1.4)
              rbdict get:      449.9 us (  2.5), set:      639.7 us (  1.1)
                trie get:      181.9 us (  1.0), set:      839.4 us (  1.5)
N == 10000 (10 runs)
              aadict get:     8498.7 us (  1.6), set:    15190.0 us (  1.4)
                dict get:     7812.6 us (  1.5), set:    21764.6 us (  2.1)
   ets (ordered_set) get:     9503.9 us (  1.8), set:    12374.8 us (  1.2)
           ets (set) get:     8636.0 us (  1.7), set:    10541.9 us (  1.0)
ets x10 read (ordere get:     8148.1 us (  1.6)
  ets x10 read (set) get:     7765.6 us (  1.5)
            gb_trees get:     8986.5 us (  1.7), set:    17331.9 us (  1.6)
              hashtl get:     9372.1 us (  1.8), set:    15798.2 us (  1.5)
  process dictionary get:     6215.1 us (  1.2), set:    14557.3 us (  1.4)
              rbdict get:     7963.2 us (  1.5), set:    12940.2 us (  1.2)
                trie get:     5199.3 us (  1.0), set:    14114.5 us (  1.3)
N == 100000 (10 runs)
              aadict get:   203343.0 us (  3.0), set:   302129.4 us (  2.7)
                dict get:   109768.0 us (  1.6), set:   610699.3 us (  5.4)
   ets (ordered_set) get:   126994.1 us (  1.9), set:   144943.7 us (  1.3)
           ets (set) get:    87807.5 us (  1.3), set:   113148.6 us (  1.0)
ets x10 read (ordere get:    81821.7 us (  1.2)
  ets x10 read (set) get:    72250.2 us (  1.1)
            gb_trees get:   228222.0 us (  3.3), set:   336378.5 us (  3.0)
              hashtl get:    90128.9 us (  1.3), set:   128015.4 us (  1.1)
  process dictionary get:    68322.9 us (  1.0), set:   141096.2 us (  1.2)
              rbdict get:   196716.5 us (  2.9), set:   266625.2 us (  2.4)
                trie get:    99331.1 us (  1.5), set:   422425.9 us (  3.7)

make OPTIMIZE=4
erl -noshell -pz ebin -s run test -s init stop
TEST string_key
N == 1000 (10 runs)
              aadict get:      364.5 us (  2.4), set:      681.8 us (  1.1)
                dict get:      341.7 us (  2.3), set:     1184.9 us (  1.8)
   ets (ordered_set) get:      586.1 us (  3.9), set:      867.0 us (  1.3)
           ets (set) get:      336.7 us (  2.2), set:      648.8 us (  1.0)
ets x10 read (ordere get:      567.0 us (  3.8)
  ets x10 read (set) get:      559.8 us (  3.7)
            gb_trees get:      507.6 us (  3.4), set:     1226.4 us (  1.9)
              hashtl get:     1015.4 us (  6.7), set:     2740.5 us (  4.2)
  process dictionary get:      226.9 us (  1.5), set:      731.0 us (  1.1)
              rbdict get:      461.8 us (  3.1), set:      658.7 us (  1.0)
                trie get:      150.8 us (  1.0), set:      757.3 us (  1.2)
N == 10000 (10 runs)
              aadict get:     7712.4 us (  1.5), set:    15373.6 us (  1.5)
                dict get:     7464.2 us (  1.4), set:    21264.0 us (  2.1)
   ets (ordered_set) get:     8903.2 us (  1.7), set:    11940.7 us (  1.2)
           ets (set) get:     8213.1 us (  1.6), set:    10315.6 us (  1.0)
ets x10 read (ordere get:     7535.7 us (  1.4)
  ets x10 read (set) get:     6949.5 us (  1.3)
            gb_trees get:     8625.9 us (  1.6), set:    17152.0 us (  1.7)
              hashtl get:    15289.7 us (  2.9), set:    30531.0 us (  3.0)
  process dictionary get:     5988.8 us (  1.1), set:    14283.2 us (  1.4)
              rbdict get:     7701.0 us (  1.5), set:    14704.0 us (  1.4)
                trie get:     5275.3 us (  1.0), set:    14217.7 us (  1.4)
N == 100000 (10 runs)
              aadict get:   198179.8 us (  2.9), set:   292331.5 us (  2.7)
                dict get:   109865.6 us (  1.6), set:   552816.3 us (  5.0)
   ets (ordered_set) get:   135377.6 us (  2.0), set:   155204.8 us (  1.4)
           ets (set) get:    88240.8 us (  1.3), set:   110088.4 us (  1.0)
ets x10 read (ordere get:    97416.2 us (  1.4)
  ets x10 read (set) get:    74857.3 us (  1.1)
            gb_trees get:   221469.0 us (  3.2), set:   338019.2 us (  3.1)
              hashtl get:   101314.6 us (  1.5), set:   212521.6 us (  1.9)
  process dictionary get:    68286.0 us (  1.0), set:   139829.3 us (  1.3)
              rbdict get:   194049.9 us (  2.8), set:   263348.2 us (  2.4)
                trie get:    94791.4 us (  1.4), set:   399552.5 us (  3.6)
profile
User: okeuday
Name: Michael Truog
Website: My Website
calendar
Back December 2012
1
2345678
9101112131415
16171819202122
23242526272829
3031
links
License
tags