hashmap API
===========

The hashmap API is a generic implementation of hash-based key-value mappings.

Data Structures
---------------

`struct hashmap`::

	The hash table structure. Members can be used as follows, but should
	not be modified directly:
+
The `size` member keeps track of the total number of entries (0 means the
hashmap is empty).
+
`tablesize` is the allocated size of the hash table. A non-0 value indicates
that the hashmap is initialized. It may also be useful for statistical purposes
(i.e. `size / tablesize` is the current load factor).
+
`cmpfn` stores the comparison function specified in `hashmap_init()`. In
advanced scenarios, it may be useful to change this, e.g. to switch between
case-sensitive and case-insensitive lookup.

`struct hashmap_entry`::

	An opaque structure representing an entry in the hash table, which must
	be used as first member of user data structures. Ideally it should be
	followed by an int-sized member to prevent unused memory on 64-bit
	systems due to alignment.
+
The `hash` member is the entry's hash code and the `next` member points to the
next entry in case of collisions (i.e. if multiple entries map to the same
bucket).

`struct hashmap_iter`::

	An iterator structure, to be used with hashmap_iter_* functions.

Types
-----

`int (*hashmap_cmp_fn)(const void *entry, const void *entry_or_key, const void *keydata)`::

	User-supplied function to test two hashmap entries for equality. Shall
	return 0 if the entries are equal.
+
This function is always called with non-NULL `entry` / `entry_or_key`
parameters that have the same hash code. When looking up an entry, the `key`
and `keydata` parameters to hashmap_get and hashmap_remove are always passed
as second and third argument, respectively. Otherwise, `keydata` is NULL.

Functions
---------

`unsigned int strhash(const char *buf)`::
`unsigned int strihash(const char *buf)`::
`unsigned int memhash(const void *buf, size_t len)`::
`unsigned int memihash(const void *buf, size_t len)`::

	Ready-to-use hash functions for strings, using the FNV-1 algorithm (see
	http://www.isthe.com/chongo/tech/comp/fnv).
+
`strhash` and `strihash` take 0-terminated strings, while `memhash` and
`memihash` operate on arbitrary-length memory.
+
`strihash` and `memihash` are case-insensitive versions.

`unsigned int sha1hash(const unsigned char *sha1)`::

	Converts a cryptographic hash (e.g. SHA-1) into an int-sized hash code
	for use in hash tables. Cryptographic hashes are supposed to have
	uniform distribution, so in contrast to `memhash()`, this just copies
	the first `sizeof(int)` bytes without shuffling any bits. Note that
	the results will be different on big-endian and little-endian
	platforms, so they should not be stored or transferred over the net.

`void hashmap_init(struct hashmap *map, hashmap_cmp_fn equals_function, size_t initial_size)`::

	Initializes a hashmap structure.
+
`map` is the hashmap to initialize.
+
The `equals_function` can be specified to compare two entries for equality.
If NULL, entries are considered equal if their hash codes are equal.
+
If the total number of entries is known in advance, the `initial_size`
parameter may be used to preallocate a sufficiently large table and thus
prevent expensive resizing. If 0, the table is dynamically resized.

`void hashmap_free(struct hashmap *map, int free_entries)`::

	Frees a hashmap structure and allocated memory.
+
`map` is the hashmap to free.
+
If `free_entries` is true, each hashmap_entry in the map is freed as well
(using stdlib's free()).

`void hashmap_entry_init(void *entry, unsigned int hash)`::

	Initializes a hashmap_entry structure.
+
`entry` points to the entry to initialize.
+
`hash` is the hash code of the entry.
+
The hashmap_entry structure does not hold references to external resources,
and it is safe to just discard it once you are done with it (i.e. if
your structure was allocated with xmalloc(), you can just free(3) it,
and if it is on stack, you can just let it go out of scope).

`void *hashmap_get(const struct hashmap *map, const void *key, const void *keydata)`::

	Returns the hashmap entry for the specified key, or NULL if not found.
+
`map` is the hashmap structure.
+
`key` is a hashmap_entry structure (or user data structure that starts with
hashmap_entry) that has at least been initialized with the proper hash code
(via `hashmap_entry_init`).
+
If an entry with matching hash code is found, `key` and `keydata` are passed
to `hashmap_cmp_fn` to decide whether the entry matches the key.

`void *hashmap_get_from_hash(const struct hashmap *map, unsigned int hash, const void *keydata)`::

	Returns the hashmap entry for the specified hash code and key data,
	or NULL if not found.
+
`map` is the hashmap structure.
+
`hash` is the hash code of the entry to look up.
+
If an entry with matching hash code is found, `keydata` is passed to
`hashmap_cmp_fn` to decide whether the entry matches the key. The
`entry_or_key` parameter points to a bogus hashmap_entry structure that
should not be used in the comparison.

`void *hashmap_get_next(const struct hashmap *map, const void *entry)`::

	Returns the next equal hashmap entry, or NULL if not found. This can be
	used to iterate over duplicate entries (see `hashmap_add`).
+
`map` is the hashmap structure.
+
`entry` is the hashmap_entry to start the search from, obtained via a previous
call to `hashmap_get` or `hashmap_get_next`.

`void hashmap_add(struct hashmap *map, void *entry)`::

	Adds a hashmap entry. This allows adding duplicate entries (i.e.
	separate values with the same key according to hashmap_cmp_fn).
+
`map` is the hashmap structure.
+
`entry` is the entry to add.

`void *hashmap_put(struct hashmap *map, void *entry)`::

	Adds or replaces a hashmap entry. If the hashmap contains duplicate
	entries equal to the specified entry, only one of them will be replaced.
+
`map` is the hashmap structure.
+
`entry` is the entry to add or replace.
+
Returns the replaced entry, or NULL if not found (i.e. the entry was added).

`void *hashmap_remove(struct hashmap *map, const void *key, const void *keydata)`::

	Removes a hashmap entry matching the specified key. If the hashmap
	contains duplicate entries equal to the specified key, only one of
	them will be removed.
+
`map` is the hashmap structure.
+
`key` is a hashmap_entry structure (or user data structure that starts with
hashmap_entry) that has at least been initialized with the proper hash code
(via `hashmap_entry_init`).
+
If an entry with matching hash code is found, `key` and `keydata` are
passed to `hashmap_cmp_fn` to decide whether the entry matches the key.
+
Returns the removed entry, or NULL if not found.

`void hashmap_iter_init(struct hashmap *map, struct hashmap_iter *iter)`::
`void *hashmap_iter_next(struct hashmap_iter *iter)`::
`void *hashmap_iter_first(struct hashmap *map, struct hashmap_iter *iter)`::

	Used to iterate over all entries of a hashmap.
+
`hashmap_iter_init` initializes a `hashmap_iter` structure.
+
`hashmap_iter_next` returns the next hashmap_entry, or NULL if there are no
more entries.
+
`hashmap_iter_first` is a combination of both (i.e. initializes the iterator
and returns the first entry, if any).

`const char *strintern(const char *string)`::
`const void *memintern(const void *data, size_t len)`::

	Returns the unique, interned version of the specified string or data,
	similar to the `String.intern` API in Java and .NET.
	Interned strings remain valid for the entire lifetime of the process.
+
Can be used as `[x]strdup()` or `xmemdupz` replacement, except that interned
strings / data must not be modified or freed.
+
Interned strings are best used for short strings with high probability of
duplicates.
+
Uses a hashmap to store the pool of interned strings.

Usage example
-------------

Here's a simple usage example that maps long keys to double values.
------------
struct hashmap map;

struct long2double {
	struct hashmap_entry ent; /* must be the first member! */
	long key;
	double value;
};

static int long2double_cmp(const struct long2double *e1,
			   const struct long2double *e2,
			   const void *unused)
{
	return !(e1->key == e2->key);
}

void long2double_init(void)
{
	hashmap_init(&map, (hashmap_cmp_fn) long2double_cmp, 0);
}

void long2double_free(void)
{
	hashmap_free(&map, 1);
}

static struct long2double *find_entry(long key)
{
	struct long2double k;
	hashmap_entry_init(&k, memhash(&key, sizeof(long)));
	k.key = key;
	return hashmap_get(&map, &k, NULL);
}

double get_value(long key)
{
	struct long2double *e = find_entry(key);
	return e ? e->value : 0;
}

void set_value(long key, double value)
{
	struct long2double *e = find_entry(key);
	if (!e) {
		e = malloc(sizeof(struct long2double));
		hashmap_entry_init(e, memhash(&key, sizeof(long)));
		e->key = key;
		hashmap_add(&map, e);
	}
	e->value = value;
}
------------

Using variable-sized keys
-------------------------

The `hashmap_get` and `hashmap_remove` functions expect an ordinary
`hashmap_entry` structure as key to find the correct entry. If the key data is
variable-sized (e.g. a FLEX_ARRAY string) or quite large, it is undesirable
to create a full-fledged entry structure on the heap and copy all the key data
into the structure.

In this case, the `keydata` parameter can be used to pass
variable-sized key data directly to the comparison function, and the `key`
parameter can be a stripped-down, fixed size entry structure allocated on the
stack.

See test-hashmap.c for an example using arbitrary-length strings as keys.
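
The `keydata` convention described above can be sketched as a comparison
function. This is a minimal illustration, not the code from test-hashmap.c:
`struct str_entry` and `str_entry_cmp` are hypothetical names, the key is a
fixed-size array only to keep the sketch short, and `struct hashmap_entry` is
redefined as a stand-in so that the snippet compiles without the real header.

```c
#include <stddef.h>
#include <string.h>

/*
 * Stand-in for the real struct hashmap_entry so that this sketch compiles
 * on its own. Real code would use the definition from the hashmap API.
 */
struct hashmap_entry {
	struct hashmap_entry *next;
	unsigned int hash;
};

/* Hypothetical entry type that stores its string key inline. */
struct str_entry {
	struct hashmap_entry ent; /* must be the first member! */
	int value;
	char key[32]; /* assumption: would be a FLEX_ARRAY member in real code */
};

/*
 * Follows the hashmap_cmp_fn contract: returns 0 if the entries are equal.
 * If keydata is non-NULL it is the string being looked up, and entry_or_key
 * may be a stripped-down entry whose key member must not be read.
 * If keydata is NULL, both arguments are full entries.
 */
static int str_entry_cmp(const void *entry, const void *entry_or_key,
			 const void *keydata)
{
	const struct str_entry *e1 = entry;
	const struct str_entry *e2 = entry_or_key;
	const char *key = keydata ? (const char *)keydata : e2->key;

	return strcmp(e1->key, key);
}
```

With such a function, a lookup only needs a bare `struct hashmap_entry` on the
stack, initialized via `hashmap_entry_init` with the string's hash code, and
the string itself is passed as the `keydata` argument of `hashmap_get` or
`hashmap_remove`.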