hashmap API
===========

The hashmap API is a generic implementation of hash-based key-value mappings.

Data Structures
---------------

`struct hashmap`::

        The hash table structure.
+
The `size` member keeps track of the total number of entries. The `cmpfn`
member is a function used to compare two entries for equality. The `table` and
`tablesize` members store the hash table and its size, respectively.

`struct hashmap_entry`::

        An opaque structure representing an entry in the hash table, which must
        be used as first member of user data structures. Ideally it should be
        followed by an int-sized member to prevent unused memory on 64-bit
        systems due to alignment.
+
The `hash` member is the entry's hash code and the `next` member points to the
next entry in case of collisions (i.e. if multiple entries map to the same
bucket).
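
The first-member requirement can be illustrated with a minimal sketch. Note
that `struct entry_sketch` is only a stand-in that mirrors the `hash` and
`next` members described above, and that `struct name_entry` and `as_entry`
are hypothetical names that are not part of the hashmap API.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Stand-in for struct hashmap_entry (assumption: it mirrors only the two
 * members described above; the real layout may differ).
 */
struct entry_sketch {
        struct entry_sketch *next; /* next entry in the same bucket */
        unsigned int hash;         /* hash code of the entry */
};

/* Hypothetical user structure: the entry header comes first. */
struct name_entry {
        struct entry_sketch ent; /* must be the first member */
        int refcount;
        const char *name;
};

/*
 * Because `ent` sits at offset 0, a pointer to the user structure can be
 * converted to a pointer to its entry header and back.
 */
static struct entry_sketch *as_entry(struct name_entry *e)
{
        assert(e != NULL);
        return (struct entry_sketch *)e;
}
```

This is why the API functions can take and return plain `void *` pointers:
the same address refers to both the user structure and its entry header.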

`struct hashmap_iter`::

        An iterator structure, to be used with hashmap_iter_* functions.

Types
-----

`int (*hashmap_cmp_fn)(const void *entry, const void *entry_or_key, const void *keydata)`::

        User-supplied function to test two hashmap entries for equality. Shall
        return 0 if the entries are equal.
+
This function is always called with non-NULL `entry` / `entry_or_key`
parameters that have the same hash code. When looking up an entry, the `key`
and `keydata` parameters to hashmap_get and hashmap_remove are always passed
as second and third argument, respectively. Otherwise, `keydata` is NULL.

Functions
---------

`unsigned int strhash(const char *buf)`::
`unsigned int strihash(const char *buf)`::
`unsigned int memhash(const void *buf, size_t len)`::
`unsigned int memihash(const void *buf, size_t len)`::

        Ready-to-use hash functions for strings, using the FNV-1 algorithm (see
        http://www.isthe.com/chongo/tech/comp/fnv).
+
`strhash` and `strihash` take NUL-terminated strings, while `memhash` and
`memihash` operate on arbitrary-length memory.
+
`strihash` and `memihash` are case-insensitive versions.
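
For readers unfamiliar with FNV-1, the following is a minimal sketch of the
32-bit variant of the algorithm. It is not the implementation behind
`memhash` and `memihash`; the names `fnv1_mem_sketch`, `fnv1_imem_sketch` and
the two constants' macro names are hypothetical, and folding ASCII letters to
upper case is an assumption about how a case-insensitive variant could work.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FNV32_BASE_SKETCH  0x811c9dc5u /* 32-bit FNV offset basis */
#define FNV32_PRIME_SKETCH 0x01000193u /* 32-bit FNV prime */

/* FNV-1: multiply by the prime, then XOR in the next byte. */
static unsigned int fnv1_mem_sketch(const void *buf, size_t len)
{
        const unsigned char *p = buf;
        uint32_t hash = FNV32_BASE_SKETCH;

        assert(buf != NULL || len == 0);
        while (len--)
                hash = (hash * FNV32_PRIME_SKETCH) ^ *p++;
        return hash;
}

/* Case-insensitive variant: fold ASCII letters to one case first. */
static unsigned int fnv1_imem_sketch(const void *buf, size_t len)
{
        const unsigned char *p = buf;
        uint32_t hash = FNV32_BASE_SKETCH;

        assert(buf != NULL || len == 0);
        while (len--) {
                unsigned int c = *p++;
                if (c >= 'a' && c <= 'z')
                        c -= 'a' - 'A'; /* fold to upper case */
                hash = (hash * FNV32_PRIME_SKETCH) ^ c;
        }
        return hash;
}
```

Hashing an empty buffer returns the offset basis unchanged, and the
case-insensitive sketch yields the same value for `"AbC"` and `"aBc"`.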

`void hashmap_init(struct hashmap *map, hashmap_cmp_fn equals_function, size_t initial_size)`::

        Initializes a hashmap structure.
+
`map` is the hashmap to initialize.
+
The `equals_function` can be specified to compare two entries for equality.
If NULL, entries are considered equal if their hash codes are equal.
+
If the total number of entries is known in advance, the `initial_size`
parameter may be used to preallocate a sufficiently large table and thus
prevent expensive resizing. If 0, the table is dynamically resized.

`void hashmap_free(struct hashmap *map, int free_entries)`::

        Frees a hashmap structure and allocated memory.
+
`map` is the hashmap to free.
+
If `free_entries` is true, each hashmap_entry in the map is freed as well
(using stdlib's free()).

`void hashmap_entry_init(void *entry, unsigned int hash)`::

        Initializes a hashmap_entry structure.
+
`entry` points to the entry to initialize.
+
`hash` is the hash code of the entry.

`void *hashmap_get(const struct hashmap *map, const void *key, const void *keydata)`::

        Returns the hashmap entry for the specified key, or NULL if not found.
+
`map` is the hashmap structure.
+
`key` is a hashmap_entry structure (or user data structure that starts with
hashmap_entry) that has at least been initialized with the proper hash code
(via `hashmap_entry_init`).
+
If an entry with matching hash code is found, `key` and `keydata` are passed
to `hashmap_cmp_fn` to decide whether the entry matches the key.

`void *hashmap_get_next(const struct hashmap *map, const void *entry)`::

        Returns the next equal hashmap entry, or NULL if not found. This can be
        used to iterate over duplicate entries (see `hashmap_add`).
+
`map` is the hashmap structure.
+
`entry` is the hashmap_entry to start the search from, obtained via a previous
call to `hashmap_get` or `hashmap_get_next`.
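
The lookup rules above (compare hash codes first, then call the comparison
function, then continue from a previous match to find duplicates) can be
modelled on a single bucket chain. This is a simplified sketch, not the
actual implementation: `struct entry_sketch`, `struct long_entry`,
`find_in_chain` and `find_next_equal` are hypothetical names, and the bucket
is assumed to be a singly-linked list.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for struct hashmap_entry (assumption, see Data Structures). */
struct entry_sketch {
        struct entry_sketch *next;
        unsigned int hash;
};

/* Hypothetical user structure; several entries may share one key. */
struct long_entry {
        struct entry_sketch ent; /* must be the first member */
        long key;
        int value;
};

/* Comparison in the style of hashmap_cmp_fn: 0 means equal. */
static int long_entry_cmp(const void *entry, const void *entry_or_key,
                          const void *keydata)
{
        const struct long_entry *e1 = entry, *e2 = entry_or_key;

        (void)keydata; /* unused for fixed-size keys */
        return e1->key != e2->key;
}

/* Scan a bucket chain, starting at `from`, for an entry equal to `key`. */
static struct long_entry *find_in_chain(struct entry_sketch *from,
                                        const struct long_entry *key)
{
        assert(key != NULL);
        for (; from; from = from->next)
                if (from->hash == key->ent.hash &&
                    !long_entry_cmp(from, key, NULL))
                        return (struct long_entry *)from;
        return NULL;
}

/* Model of a "get next": resume the scan right after a previous match. */
static struct long_entry *find_next_equal(const struct long_entry *prev)
{
        assert(prev != NULL);
        return find_in_chain(prev->ent.next, prev);
}
```

With the real API, the corresponding loop over all duplicates of a key `k`
would start with `hashmap_get(&map, &k, NULL)` and advance with
`hashmap_get_next(&map, e)` until NULL is returned.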

`void hashmap_add(struct hashmap *map, void *entry)`::

        Adds a hashmap entry. This allows adding duplicate entries (i.e.
        separate values with the same key according to hashmap_cmp_fn).
+
`map` is the hashmap structure.
+
`entry` is the entry to add.

`void *hashmap_put(struct hashmap *map, void *entry)`::

        Adds or replaces a hashmap entry. If the hashmap contains duplicate
        entries equal to the specified entry, only one of them will be replaced.
+
`map` is the hashmap structure.
+
`entry` is the entry to add or replace.
+
Returns the replaced entry, or NULL if not found (i.e. the entry was added).

`void *hashmap_remove(struct hashmap *map, const void *key, const void *keydata)`::

        Removes a hashmap entry matching the specified key. If the hashmap
        contains duplicate entries equal to the specified key, only one of
        them will be removed.
+
`map` is the hashmap structure.
+
`key` is a hashmap_entry structure (or user data structure that starts with
hashmap_entry) that has at least been initialized with the proper hash code
(via `hashmap_entry_init`).
+
If an entry with matching hash code is found, `key` and `keydata` are
passed to `hashmap_cmp_fn` to decide whether the entry matches the key.
+
Returns the removed entry, or NULL if not found.

`void hashmap_iter_init(struct hashmap *map, struct hashmap_iter *iter)`::
`void *hashmap_iter_next(struct hashmap_iter *iter)`::
`void *hashmap_iter_first(struct hashmap *map, struct hashmap_iter *iter)`::

        Used to iterate over all entries of a hashmap.
+
`hashmap_iter_init` initializes a `hashmap_iter` structure.
+
`hashmap_iter_next` returns the next hashmap_entry, or NULL if there are no
more entries.
+
`hashmap_iter_first` is a combination of both (i.e. initializes the iterator
and returns the first entry, if any).
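
To make the init/next protocol concrete, here is a simplified model of an
iterator that walks every bucket of a table and every entry of each bucket
chain. It is a sketch under stated assumptions, not the actual
implementation: `struct map_sketch` only mirrors the `table` and `tablesize`
members mentioned above, and `struct iter_sketch`, `iter_init_sketch` and
`iter_next_sketch` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for struct hashmap_entry (assumption, see Data Structures). */
struct entry_sketch {
        struct entry_sketch *next;
        unsigned int hash;
};

/* Stand-in for the `table` / `tablesize` members of struct hashmap. */
struct map_sketch {
        struct entry_sketch **table;
        size_t tablesize;
};

/* Hypothetical iterator state: next entry to return and next bucket. */
struct iter_sketch {
        const struct map_sketch *map;
        struct entry_sketch *next;
        size_t tablepos;
};

static void iter_init_sketch(const struct map_sketch *map,
                             struct iter_sketch *iter)
{
        assert(map != NULL && iter != NULL);
        iter->map = map;
        iter->next = NULL;
        iter->tablepos = 0;
}

/* Returns the next entry, or NULL once every bucket has been visited. */
static struct entry_sketch *iter_next_sketch(struct iter_sketch *iter)
{
        struct entry_sketch *current = iter->next;

        for (;;) {
                if (current) {
                        /* remember where to continue in this chain */
                        iter->next = current->next;
                        return current;
                }
                if (iter->tablepos >= iter->map->tablesize)
                        return NULL;
                /* chain exhausted: move on to the next bucket */
                current = iter->map->table[iter->tablepos++];
        }
}
```

With the real API, the equivalent loop would call `hashmap_iter_init(&map,
&iter)` once and then `hashmap_iter_next(&iter)` until it returns NULL.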

Usage example
-------------

Here's a simple usage example that maps long keys to double values.
------------
struct hashmap map;

struct long2double {
        struct hashmap_entry ent; /* must be the first member! */
        long key;
        double value;
};

/* Returns 0 if both entries have the same key, as hashmap_cmp_fn requires. */
static int long2double_cmp(const struct long2double *e1, const struct long2double *e2, const void *unused)
{
        return e1->key != e2->key;
}

void long2double_init(void)
{
        hashmap_init(&map, (hashmap_cmp_fn) long2double_cmp, 0);
}

void long2double_free(void)
{
        /* free_entries = 1: the malloc()ed entries are freed as well */
        hashmap_free(&map, 1);
}

static struct long2double *find_entry(long key)
{
        /* lookup key on the stack; only the hash code and key are set */
        struct long2double k;
        hashmap_entry_init(&k, memhash(&key, sizeof(long)));
        k.key = key;
        return hashmap_get(&map, &k, NULL);
}

double get_value(long key)
{
        struct long2double *e = find_entry(key);
        return e ? e->value : 0;
}

void set_value(long key, double value)
{
        struct long2double *e = find_entry(key);
        if (!e) {
                /* allocation failure handling is omitted for brevity */
                e = malloc(sizeof(struct long2double));
                hashmap_entry_init(e, memhash(&key, sizeof(long)));
                e->key = key;
                hashmap_add(&map, e);
        }
        e->value = value;
}
------------

Using variable-sized keys
-------------------------

The `hashmap_get` and `hashmap_remove` functions expect an ordinary
`hashmap_entry` structure as key to find the correct entry. If the key data is
variable-sized (e.g. a FLEX_ARRAY string) or quite large, it is undesirable
to create a full-fledged entry structure on the heap and copy all the key data
into the structure.

In this case, the `keydata` parameter can be used to pass
variable-sized key data directly to the comparison function, and the `key`
parameter can be a stripped-down, fixed-size entry structure allocated on the
stack.
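
A minimal sketch of this pattern follows. It assumes string keys stored in a
C99 flexible array member; `struct entry_sketch` is a stand-in for
`hashmap_entry`, and `struct name_entry`, `name_entry_alloc` and
`name_entry_cmp` are hypothetical names that do not come from the hashmap
API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for struct hashmap_entry (assumption, see Data Structures). */
struct entry_sketch {
        struct entry_sketch *next;
        unsigned int hash;
};

/* Full entry: header plus a variable-length name stored inline. */
struct name_entry {
        struct entry_sketch ent; /* must be the first member */
        char name[];             /* C99 flexible array member */
};

/* Allocates a full entry on the heap; returns NULL on failure. */
static struct name_entry *name_entry_alloc(const char *name, unsigned int hash)
{
        struct name_entry *e;
        size_t len;

        assert(name != NULL);
        len = strlen(name);
        e = malloc(sizeof(*e) + len + 1);
        if (!e)
                return NULL;
        e->ent.next = NULL;
        e->ent.hash = hash;
        memcpy(e->name, name, len + 1);
        return e;
}

/*
 * Comparison in the style of hashmap_cmp_fn: 0 means equal. When keydata
 * is given, entry_or_key may be a bare header, so its name is not read.
 */
static int name_entry_cmp(const void *entry, const void *entry_or_key,
                          const void *keydata)
{
        const struct name_entry *e1 = entry;
        const struct name_entry *e2 = entry_or_key;

        return strcmp(e1->name, keydata ? (const char *)keydata : e2->name);
}
```

A lookup then only needs a header-sized key on the stack. With the real API
this would amount to initializing a `struct hashmap_entry` with
`hashmap_entry_init(&key, strhash(name))` and calling
`hashmap_get(&map, &key, name)`, so that no copy of the string is made.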

See test-hashmap.c for an example using arbitrary-length strings as keys.