ELF: better symbol lookup via DT_GNU_HASH

2017-05-10 |

DT_GNU_HASH is a better hash table for the ELF used by GNU systems in GNU-compatible software, i.e. in almost every program compiled with gcc or clang for almost any Linux distribution.

The problem with it is that DT_GNU_HASH is not documented anywhere other than in GNU binutils and glibc source code. You can either read source code to get some intel, or read emails with patches in mail list archives (Re: GNU_HASH section format is a pretty good one). Those are the only places you can try to find the truth on the matter.

Of course, there're some articles on the web where people try to break it down. Like this one.

This article does not aspire to be the ultimate truth either. But I'll try to cover everything about GNU Hash Table and explain all aspects of its work.

Before reading any further please ensure that you understand what Symbol Table and String Table are in the ELF. Also you may want to read my previous article ELF: symbol lookup via DT_HASH to know the standard (90s-ish) way of doing symbol lookup.

DT_GNU_HASH has nothing in common with standard DT_HASH, apart from serving the same purpose. It has its own hashing function, its own layout, it adds restrictions for the symbol table and contains an additional bloom filter to stop lookup for missing symbols early.

Hashing function

Let's start with the hashing function. It can be found in bfd_elf_gnu_hash or in dl_new_hash.

#include <stdint.h>

uint32_t gnu_hash(const uint8_t* name) {
    uint32_t h = 5381;

    for (; *name; name++) {
        h = (h << 5) + h + *name;
    }

    return h;
}

gnu_hash("")                == 0x00001505
gnu_hash("printf")          == 0x156b2bb8
gnu_hash("exit")            == 0x7c967e3f
gnu_hash("syscall")         == 0xbac212a0
gnu_hash("flapenguin.me")   == 0x8ae9f18e

Layout

Not a valid C code, but gives an idea:

struct gnu_hash_table {
    uint32_t nbuckets;
    uint32_t symoffset;
    uint32_t bloom_size;
    uint32_t bloom_shift;
    uint64_t bloom[bloom_size]; /* uint32_t for 32-bit binaries */
    uint32_t buckets[nbuckets];
    uint32_t chain[];
};

Bloom filter

Bloom filter is used to stop the lookup for missing symbols early. bloom_size, bloom_shift, and bloom are parts of the structure, as their names suggest.

Bloom filter behaves slightly differently for various ELFCLASS binaries (defined by EI_CLASS field in ELF Identification). Let's define ELFCLASS_BITS to be 64 for 64-bit binaries (ELFCLASS64) and 32 for 32-bit binaries (ELFCLASS32).

Before doing symbol lookup, take bloom[(hash / ELFCLASS_BITS) % bloom_size]. If bits hash % ELFCLASS_BITS and (hash >> bloom_shift) % ELFCLASS_BITS are set then a symbol may or may not be in the hash table, and you should proceed with a regular lookup through buckets and chains. But if at least one bit is not set then a symbol is certainly absent from the hash table.

Buckets and chains

DT_HASH contains an element per symbol table's element. This leads to a waste of space because STN_UNDEF and some other symbols are in the hash table but are never looked up. GNU hash table allows to skip first symoffset symbols at the beginning of the symbol table.

Same as in DT_HASH, symbols are put in one of nbuckets buckets depending on their hashes. To be specific, each symbol should be placed into hash % nbuckets bucket.

Chains in the GNU hash table are nothing like strange linked lists in DT_HASH, they are contiguous sequences of hashes for symbols with the same index (remember that chains' indexes are shifted by symoffset relatively to the symbol table). The last bit in chains' element is discarded and instead used for indicating the chain's end. If it is set then the element is the last one in the chain.

bucket array holds indexes of the first symbols in the chains. Note that those are not indexes for the chain array. Indexes for it will be bucket[foobar] - symoffset.

Chains being contiguous sequences imply that symbols within the same bucket must be stored contiguously. Order of buckets in the symbol table does not really matter but usually they're stored in an ascending order.

While looking extraneous, creating such restriction over the symbol table gives great advantage: a hash table now can store almost full hash (without the lowest bit) of a symbol within the same 32 bits. This allows linkers to compare hashes before comparing strings. Also, because DT_GNU_HASH requires symbol table ordering and DT_HASH doesn't, you can fit both into a single binary. This way both standard and GNU linkers can look up symbols in it.

Example

Nothing is better than a visual representation of the rules. So, let's create one.

I took the same symbols as in ELF: symbol lookup via DT_HASH and created DT_GNU_HASH table from them. The example is for 64-bit ELF binaries, for 32-bit you'll need to recalculate bloom word and bits.

nbuckets = 4      (because I decided that there will be four buckets)
symoffset = 1    (STN_UNDEF is not a part of the hash table)
bloom_size = 2   (because I decided that 16 byte bloom filter is sufficient)
bloom_shift = 5  (again, just because I can)

ix  bucket[ix]  name of first symbol in chain
--  ----------  -----------------------------
 0  1           cfsetispeed
 1  5           uselib
 2  8           freelocal
 3  13          getspen

Note that:
- symbol table is sorted by bucket
- chain[ix] is the same as hash but with set/cleared lowest bit

       SYMBOL TABLE              |              GNU HASH TABLE
                                 |
    name =                       |    hash %              bloom  bloom bits
ix  symtab[ix].st_name    hash   | ix nbuckets chain[ix]  word   #0    #1
--  ------------------  -------- | -- -------  ---------- -----  ---   ---
 0  <STN_UNDEF>                  |
 1  cfsetispeed         830acc54 |  0    0      830acc54    1    20    34
 2  strsigna            90f1e4b0 |  1    0      90f1e4b0    0    48    37
 3  hcreate_            4c7e3240 |  2    0      4c7e3240    1     0    18
 4  endrpcen            b6c44714 |  3    0      b6c44715    0    20    56
 5  uselib              2124d3e9 |  4    1      2124d3e8    1    41    31
 6  getttyen            fff51839 |  5    1      fff51838    0    57     1
 7  umoun               1081e019 |  6    1      1081e019    0    25     0
 8  freelocal           e3364372 |  7    2      e3364372    1    50    27
 9  listxatt            ced3d862 |  8    2      ced3d862    1    34     3
10  isnan               0fabfd7e |  9    2      0fabfd7e    1    62    43
11  isinf               0fabe9de | 10    2      0fabe9de    1    30    14
12  setrlimi            12e23bae | 11    2      12e23baf    0    46    29
13  getspen             f07b2a7b | 12    3      f07b2a7a    1    59    19
14  pthread_mutex_lock  4f152227 | 13    3      4f152226    0    39    17
15  getopt_long_onl     57b1584f | 14    3      57b1584f    1    15     2

Bloom filter:
   bit #      56       48       40       32       24       16        8        0
        xx..x.xx ...xxx.x xx...... x.x.xx.x ..x...x. ...x..x. ........ ......xx
        .x..x... .....x.. ....x.x. .....x.. xx..x.xx ...xxx.x xx...... x.x.xx.x

Or as two `uint64_t` values:
    cb1dc0ad22120003
    48040a04cb1dc0ad

Knowing the rules and having a built table, let's try to find some symbols by hand.

Note that when comparing hashes the lowest bit is set on both left and right hand sides.

Existing symbol. strsigna:

looking for "strsigna" (hash = 0x90f1e4b0)
checking word 0 in bloom filter for bits 48 and 37
        hash table may contain symbol
starting at ix = 1

compare hashes: (chain) 0x830acc55 == 0x90f1e4b1
wrong hash definitely not "strsigna"
moving to the next symbol

compare hashes: (chain) 0x90f1e4b1 == 0x90f1e4b1
hash matches. compare strings: "strsigna" == "strsigna"
found at index 2

Missing symbol. foobar:

looking for "foobar" (hash = 0xfde460be)
checking word 0 in bloom filter for bits 62 and 5
        not in bloom filter
not found

Missing symbol with hash collision. vLoun:

looking for "vLoun" (hash = 0x1081e019)
checking word 0 in bloom filter for bits 25 and 0
        hash table may contain symbol
starting at ix = 5

compare hashes: (chain) 0x2124d3e9 == 0x1081e019
wrong hash definitely not "vLoun"
moving to the next symbol

compare hashes: (chain) 0xfff51839 == 0x1081e019
wrong hash definitely not "vLoun"
moving to the next symbol

compare hashes: (chain) 0x1081e019 == 0x1081e019
hash matches. compare strings: "umoun" == "vLoun"
just hash collision
that was last symbol in this bucket
not found

Code

Original algorithm is implemented in do_lookup_x in ld.so source code.

Implementation is a little trickier than DT_HASH's one, but with the example above it should be self-explanatory.

/* Different architectures have different symbol structure size.
 * Those actually should be selected depending on input binary's ELFCLASS,
 * but for simplicity I've left them as typedefs and defines.
 */
typedef Elf64_Sym Elf_Sym;
typedef bloom_el_t uint64_t;
#define ELFCLASS_BITS 64
/* 32-bit binary:
    typedef Elf32_Sym Elf_Sym;
    typedef bloom_el_t uint32_t;
    #define ELFCLASS_BITS 32
*/

const Elf_Sym* gnu_lookup(
    const char* strtab,      /* string table */
    const Elf_Sym* symtab,   /* symbol table */
    const uint32_t* hashtab, /* hash table */
    const char* name         /* symbol to look up */
) {
    const uint32_t namehash = gnu_hash(name);

    const uint32_t nbuckets = hashtab[0];
    const uint32_t symoffset = hashtab[1];
    const uint32_t bloom_size = hashtab[2];
    const uint32_t bloom_shift = hashtab[3];
    const bloom_el_t* bloom = (void*)&hashtab[4];
    const uint32_t* buckets = (void*)&bloom[bloom_size];
    const uint32_t* chain = &buckets[nbuckets];

    bloom_el_t word = bloom[(namehash / ELFCLASS_BITS) % bloom_size];
    bloom_el_t mask = 0
        | (bloom_el_t)1 << (namehash % ELFCLASS_BITS)
        | (bloom_el_t)1 << ((namehash >> bloom_shift) % ELFCLASS_BITS);

    /* If at least one bit is not set, a symbol is surely missing. */
    if ((word & mask) != mask) {
        return NULL;
    }

    uint32_t symix = buckets[namehash % nbuckets];
    if (symix < symoffset) {
        return NULL;
    }

    /* Loop through the chain. */
    while (true) {
        const char* symname = strtab + symtab[symix].st_name;
        const uint32_t hash = chain[symix - symoffset];

        if ((namehash|1) == (hash|1) && strcmp(name, symname) == 0) {
            return &symtab[symix];
        }

        /* Chain ends with an element with the lowest bit set to 1. */
        if (hash & 1) {
            break;
        }

        symix++;
    }

    return NULL;
}

Total number of symbols

Total number of symbols is missing from DT_GNU_HASH for reasons I do not know. To get a total number of symbols you'll have to find a chain element with the highest index. To do so find a chain that starts at the highest index (max(bucket)) first and then walk the chain to its end (while ((chain[ix - symoffset] & 1) == 0) ix++;).

This may be useful if you only have access to dynamic program information, e.g. if your program is a dynamic linker or an ELF loader. However, if you have access to section headers, you can simply calculate the total number of symbols from section's header: just divide the section size by an entry size.

Conclusion

DT_GNU_HASH is a way better structure that DT_HASH. It is a shame it is not standardized, because every system that uses ELF uses DT_GNU_HASH to its advantage. As noted in the original GNU binutils patch, it improves linking time by up to 50%. This is twice as fast loading time! Considering how much symbols are exported and imported by an average C++ shared library and size of those symbols' names (mangling fits all namespaces, types, and template arguments into the final symbol name), obscure DT_GNU_HASH is the thing that makes your applications start as fast as they can.