Documentation / technical / api-hashmap.txton commit hashmap: add simplified hashmap_get_from_hash() API (ab73a9d)
   1hashmap API
   2===========
   3
   4The hashmap API is a generic implementation of hash-based key-value mappings.
   5
   6Data Structures
   7---------------
   8
   9`struct hashmap`::
  10
  11        The hash table structure. Members can be used as follows, but should
  12        not be modified directly:
  13+
  14The `size` member keeps track of the total number of entries (0 means the
  15hashmap is empty).
  16+
  17`tablesize` is the allocated size of the hash table. A non-0 value indicates
  18that the hashmap is initialized. It may also be useful for statistical purposes
  19(i.e. `size / tablesize` is the current load factor).
  20+
  21`cmpfn` stores the comparison function specified in `hashmap_init()`. In
  22advanced scenarios, it may be useful to change this, e.g. to switch between
  23case-sensitive and case-insensitive lookup.
  24
  25`struct hashmap_entry`::
  26
  27        An opaque structure representing an entry in the hash table, which must
  28        be used as first member of user data structures. Ideally it should be
  29        followed by an int-sized member to prevent unused memory on 64-bit
  30        systems due to alignment.
  31+
  32The `hash` member is the entry's hash code and the `next` member points to the
  33next entry in case of collisions (i.e. if multiple entries map to the same
  34bucket).
  35
  36`struct hashmap_iter`::
  37
  38        An iterator structure, to be used with hashmap_iter_* functions.
  39
  40Types
  41-----
  42
  43`int (*hashmap_cmp_fn)(const void *entry, const void *entry_or_key, const void *keydata)`::
  44
  45        User-supplied function to test two hashmap entries for equality. Shall
  46        return 0 if the entries are equal.
  47+
  48This function is always called with non-NULL `entry` / `entry_or_key`
  49parameters that have the same hash code. When looking up an entry, the `key`
  50and `keydata` parameters to hashmap_get and hashmap_remove are always passed
  51as second and third argument, respectively. Otherwise, `keydata` is NULL.
  52
  53Functions
  54---------
  55
  56`unsigned int strhash(const char *buf)`::
  57`unsigned int strihash(const char *buf)`::
  58`unsigned int memhash(const void *buf, size_t len)`::
  59`unsigned int memihash(const void *buf, size_t len)`::
  60
  61        Ready-to-use hash functions for strings, using the FNV-1 algorithm (see
  62        http://www.isthe.com/chongo/tech/comp/fnv).
  63+
  64`strhash` and `strihash` take 0-terminated strings, while `memhash` and
  65`memihash` operate on arbitrary-length memory.
  66+
  67`strihash` and `memihash` are case insensitive versions.
  68
  69`unsigned int sha1hash(const unsigned char *sha1)`::
  70
  71        Converts a cryptographic hash (e.g. SHA-1) into an int-sized hash code
  72        for use in hash tables. Cryptographic hashes are supposed to have
  73        uniform distribution, so in contrast to `memhash()`, this just copies
  74        the first `sizeof(int)` bytes without shuffling any bits. Note that
  75        the results will be different on big-endian and little-endian
  76        platforms, so they should not be stored or transferred over the net.
  77
  78`void hashmap_init(struct hashmap *map, hashmap_cmp_fn equals_function, size_t initial_size)`::
  79
  80        Initializes a hashmap structure.
  81+
  82`map` is the hashmap to initialize.
  83+
  84The `equals_function` can be specified to compare two entries for equality.
  85If NULL, entries are considered equal if their hash codes are equal.
  86+
  87If the total number of entries is known in advance, the `initial_size`
  88parameter may be used to preallocate a sufficiently large table and thus
  89prevent expensive resizing. If 0, the table is dynamically resized.
  90
  91`void hashmap_free(struct hashmap *map, int free_entries)`::
  92
  93        Frees a hashmap structure and allocated memory.
  94+
  95`map` is the hashmap to free.
  96+
  97If `free_entries` is true, each hashmap_entry in the map is freed as well
  98(using stdlib's free()).
  99
 100`void hashmap_entry_init(void *entry, unsigned int hash)`::
 101
 102        Initializes a hashmap_entry structure.
 103+
 104`entry` points to the entry to initialize.
 105+
 106`hash` is the hash code of the entry.
 107
 108`void *hashmap_get(const struct hashmap *map, const void *key, const void *keydata)`::
 109
 110        Returns the hashmap entry for the specified key, or NULL if not found.
 111+
 112`map` is the hashmap structure.
 113+
 114`key` is a hashmap_entry structure (or user data structure that starts with
 115hashmap_entry) that has at least been initialized with the proper hash code
 116(via `hashmap_entry_init`).
 117+
 118If an entry with matching hash code is found, `key` and `keydata` are passed
 119to `hashmap_cmp_fn` to decide whether the entry matches the key.
 120
 121`void *hashmap_get_from_hash(const struct hashmap *map, unsigned int hash, const void *keydata)`::
 122
 123        Returns the hashmap entry for the specified hash code and key data,
 124        or NULL if not found.
 125+
 126`map` is the hashmap structure.
 127+
 128`hash` is the hash code of the entry to look up.
 129+
 130If an entry with matching hash code is found, `keydata` is passed to
 131`hashmap_cmp_fn` to decide whether the entry matches the key. The
 132`entry_or_key` parameter points to a bogus hashmap_entry structure that
 133should not be used in the comparison.
 134
 135`void *hashmap_get_next(const struct hashmap *map, const void *entry)`::
 136
 137        Returns the next equal hashmap entry, or NULL if not found. This can be
 138        used to iterate over duplicate entries (see `hashmap_add`).
 139+
 140`map` is the hashmap structure.
 141+
 142`entry` is the hashmap_entry to start the search from, obtained via a previous
 143call to `hashmap_get` or `hashmap_get_next`.
 144
 145`void hashmap_add(struct hashmap *map, void *entry)`::
 146
 147        Adds a hashmap entry. This allows to add duplicate entries (i.e.
 148        separate values with the same key according to hashmap_cmp_fn).
 149+
 150`map` is the hashmap structure.
 151+
 152`entry` is the entry to add.
 153
 154`void *hashmap_put(struct hashmap *map, void *entry)`::
 155
 156        Adds or replaces a hashmap entry. If the hashmap contains duplicate
 157        entries equal to the specified entry, only one of them will be replaced.
 158+
 159`map` is the hashmap structure.
 160+
 161`entry` is the entry to add or replace.
 162+
 163Returns the replaced entry, or NULL if not found (i.e. the entry was added).
 164
 165`void *hashmap_remove(struct hashmap *map, const void *key, const void *keydata)`::
 166
 167        Removes a hashmap entry matching the specified key. If the hashmap
 168        contains duplicate entries equal to the specified key, only one of
 169        them will be removed.
 170+
 171`map` is the hashmap structure.
 172+
 173`key` is a hashmap_entry structure (or user data structure that starts with
 174hashmap_entry) that has at least been initialized with the proper hash code
 175(via `hashmap_entry_init`).
 176+
 177If an entry with matching hash code is found, `key` and `keydata` are
 178passed to `hashmap_cmp_fn` to decide whether the entry matches the key.
 179+
 180Returns the removed entry, or NULL if not found.
 181
 182`void hashmap_iter_init(struct hashmap *map, struct hashmap_iter *iter)`::
 183`void *hashmap_iter_next(struct hashmap_iter *iter)`::
 184`void *hashmap_iter_first(struct hashmap *map, struct hashmap_iter *iter)`::
 185
 186        Used to iterate over all entries of a hashmap.
 187+
 188`hashmap_iter_init` initializes a `hashmap_iter` structure.
 189+
 190`hashmap_iter_next` returns the next hashmap_entry, or NULL if there are no
 191more entries.
 192+
 193`hashmap_iter_first` is a combination of both (i.e. initializes the iterator
 194and returns the first entry, if any).
 195
 196Usage example
 197-------------
 198
 199Here's a simple usage example that maps long keys to double values.
 200------------
 201struct hashmap map;
 202
 203struct long2double {
 204        struct hashmap_entry ent; /* must be the first member! */
 205        long key;
 206        double value;
 207};
 208
 209static int long2double_cmp(const struct long2double *e1, const struct long2double *e2, const void *unused)
 210{
 211        return !(e1->key == e2->key);
 212}
 213
 214void long2double_init(void)
 215{
 216        hashmap_init(&map, (hashmap_cmp_fn) long2double_cmp, 0);
 217}
 218
 219void long2double_free(void)
 220{
 221        hashmap_free(&map, 1);
 222}
 223
 224static struct long2double *find_entry(long key)
 225{
 226        struct long2double k;
 227        hashmap_entry_init(&k, memhash(&key, sizeof(long)));
 228        k.key = key;
 229        return hashmap_get(&map, &k, NULL);
 230}
 231
 232double get_value(long key)
 233{
 234        struct long2double *e = find_entry(key);
 235        return e ? e->value : 0;
 236}
 237
 238void set_value(long key, double value)
 239{
 240        struct long2double *e = find_entry(key);
 241        if (!e) {
 242                e = malloc(sizeof(struct long2double));
 243                hashmap_entry_init(e, memhash(&key, sizeof(long)));
 244                e->key = key;
 245                hashmap_add(&map, e);
 246        }
 247        e->value = value;
 248}
 249------------
 250
 251Using variable-sized keys
 252-------------------------
 253
 254The `hashmap_entry_get` and `hashmap_entry_remove` functions expect an ordinary
 255`hashmap_entry` structure as key to find the correct entry. If the key data is
 256variable-sized (e.g. a FLEX_ARRAY string) or quite large, it is undesirable
 257to create a full-fledged entry structure on the heap and copy all the key data
 258into the structure.
 259
 260In this case, the `keydata` parameter can be used to pass
 261variable-sized key data directly to the comparison function, and the `key`
 262parameter can be a stripped-down, fixed size entry structure allocated on the
 263stack.
 264
 265See test-hashmap.c for an example using arbitrary-length strings as keys.