string-list API
===============

The string_list API offers a data structure and functions to handle
sorted and unsorted string lists.  A "sorted" list is one whose
entries are sorted by string value in `strcmp()` order.

The 'string_list' struct used to be called 'path_list', but was renamed
because it is not specific to paths.

The caller:

. Allocates and clears a `struct string_list` variable.

. Initializes the members. You might want to set the flag `strdup_strings`
  if the strings should be strdup()ed. For example, this is necessary
  when you add something like git_path("..."), since that function returns
  a static buffer that will change with the next call to git_path().
+
If you need something advanced, you can manually malloc() the `items`
member (you need this if you add things later) and you should set the
`nr` and `alloc` members in that case, too.

. Adds new items to the list, using `string_list_append`,
  `string_list_append_nodup`, `string_list_insert`,
  `string_list_split`, and/or `string_list_split_in_place`.

. Can check if a string is in the list using `string_list_has_string` or
  `unsorted_string_list_has_string` and get it from the list using
  `string_list_lookup` for sorted lists.

. Can sort an unsorted list using `string_list_sort`.

. Can remove duplicate items from a sorted list using
  `string_list_remove_duplicates`.

. Can remove individual items of an unsorted list using
  `unsorted_string_list_delete_item`.

. Can remove items not matching a criterion from a sorted or unsorted
  list using `filter_string_list`, or remove empty strings using
  `string_list_remove_empty_items`.

. Finally, frees the list using `string_list_clear`.

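Example:

----
struct string_list list = STRING_LIST_INIT_NODUP;
int i;

/* String literals outlive the list, so no strdup() is needed. */
string_list_append(&list, "foo");
string_list_append(&list, "bar");
for (i = 0; i < list.nr; i++)
        printf("%s\n", list.items[i].string);

/* Release the items array; 0 means do not free the util pointers. */
string_list_clear(&list, 0);
----
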
NOTE: It is more efficient to build an unsorted list and sort it
afterwards, instead of building a sorted list (`O(n log n)` instead of
`O(n^2)`).
+
However, if you need to check whether a certain string was already
added while building the list, do not build it unsorted and check with
unsorted_string_list_has_string(): each check scans the whole list, so
the complexity would be quadratic again (but with a worse factor).
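+
The trade-off can be sketched in plain C, independent of git's own
code. The helper names `cmp_str_ptr`, `sort_strings` and
`sorted_has_string` are hypothetical and only stand in for the idea of
"append everything, sort once, then look up by binary search"; they
are not part of the string-list API.
+
```c
#include <stdlib.h>
#include <string.h>

/* qsort()/bsearch() comparator: elements are string pointers, strcmp() order. */
static int cmp_str_ptr(const void *a, const void *b)
{
	return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Sort all strings in one pass after they have been collected: O(n log n). */
static void sort_strings(const char **items, size_t nr)
{
	qsort(items, nr, sizeof(*items), cmp_str_ptr);
}

/* Binary search in the sorted array: O(log n) per lookup. Returns 1 if found. */
static int sorted_has_string(const char **items, size_t nr, const char *key)
{
	return bsearch(&key, items, nr, sizeof(*items), cmp_str_ptr) != NULL;
}
```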

Functions
---------

* General ones (work with sorted and unsorted lists alike)

`string_list_init`::

        Initialize the members of the string_list, set `strdup_strings`
        member according to the value of the second parameter.

`filter_string_list`::

        Apply a function to each item in a list, retaining only the
        items for which the function returns true.  If free_util is
        true, call free() on the util members of any items that have
        to be deleted.  Preserve the order of the items that are
        retained.
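+
An order-preserving filter of this kind can be sketched as an in-place
compaction. This is a simplified illustration on a bare array of
string pointers, not git's implementation: the names `keep_fn`,
`filter_strings` and `is_non_empty` are hypothetical, and the freeing
of `util` members is left out.
+
```c
#include <stddef.h>

/* Hypothetical predicate type: return nonzero to keep the string. */
typedef int (*keep_fn)(const char *string, void *cb_data);

/*
 * Compact `items` in place, keeping only the strings accepted by `want`.
 * Retained strings keep their relative order. Returns the new count.
 */
static size_t filter_strings(const char **items, size_t nr,
			     keep_fn want, void *cb_data)
{
	size_t src, dst = 0;

	for (src = 0; src < nr; src++)
		if (want(items[src], cb_data))
			items[dst++] = items[src];
	return dst;
}

/* Example predicate: keep non-empty strings only. */
static int is_non_empty(const char *string, void *cb_data)
{
	(void)cb_data; /* unused */
	return string[0] != '\0';
}
```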

`string_list_remove_empty_items`::

        Remove any empty strings from the list.  If free_util is true,
        call free() on the util members of any items that have to be
        deleted.  Preserve the order of the items that are retained.

`print_string_list`::

        Dump a string_list to stdout, useful mainly for debugging purposes. It
        can take an optional header argument and it writes out the
        string-pointer pairs of the string_list, each one in its own line.

`string_list_clear`::

        Free a string_list. The `string` pointer of the items will be freed in
        case the `strdup_strings` member of the string_list is set. The second
        parameter controls if the `util` pointer of the items should be freed
        or not.

* Functions for sorted lists only

`string_list_has_string`::

        Determine if the string_list has a given string or not.

`string_list_insert`::

        Insert a new element into the string_list. The returned pointer can
        be handy if you want to write something to the `util` pointer of the
        string_list_item containing the just added string. If the given
        string already exists, the insertion is skipped and the
        pointer to the existing item is returned.
+
Since this function uses xrealloc() (which die()s if it fails) if the
list needs to grow, it is safe not to check the pointer. I.e. you may
write `string_list_insert(...)->util = ...;`.
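+
The behaviour described here (find the sorted position, return the
existing item on a duplicate, never return NULL) can be sketched in
self-contained C. The `demo_item`, `demo_list` and `demo_insert` names
are hypothetical stand-ins, not git's code, and `abort()` takes the
place of the die() that xrealloc() performs.
+
```c
#include <stdlib.h>
#include <string.h>

struct demo_item { const char *string; void *util; };
struct demo_list { struct demo_item *items; size_t nr, alloc; };

/* Insert `string` at its sorted position; return the existing item on a duplicate. */
static struct demo_item *demo_insert(struct demo_list *list, const char *string)
{
	size_t lo = 0, hi = list->nr;

	while (lo < hi) { /* binary search for the insertion point */
		size_t mid = lo + (hi - lo) / 2;
		int cmp = strcmp(string, list->items[mid].string);

		if (!cmp)
			return &list->items[mid];
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	if (list->nr == list->alloc) { /* grow the array when it is full */
		list->alloc = list->alloc ? 2 * list->alloc : 8;
		list->items = realloc(list->items,
				      list->alloc * sizeof(*list->items));
		if (!list->items)
			abort(); /* stand-in for xrealloc()'s die() */
	}
	/* Shift the tail up by one slot to make room at index `lo`. */
	memmove(&list->items[lo + 1], &list->items[lo],
		(list->nr - lo) * sizeof(*list->items));
	list->items[lo].string = string;
	list->items[lo].util = NULL;
	list->nr++;
	return &list->items[lo];
}
```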

`string_list_lookup`::

        Look up a given string in the string_list, returning the containing
        string_list_item. If the string is not found, NULL is returned.

`string_list_remove_duplicates`::

        Remove all but the first of consecutive entries that have the
        same string value.  If free_util is true, call free() on the
        util members of any items that have to be deleted.
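+
Because equal strings are adjacent in a sorted list, one linear pass is
enough. A minimal sketch on a bare array of string pointers follows;
the function name `remove_adjacent_duplicates` is hypothetical and the
freeing of `util` members is omitted.
+
```c
#include <stddef.h>
#include <string.h>

/*
 * Keep the first entry of each run of equal strings in a sorted array.
 * Returns the new number of entries.
 */
static size_t remove_adjacent_duplicates(const char **items, size_t nr)
{
	size_t src, dst;

	if (nr < 2)
		return nr;
	for (src = dst = 1; src < nr; src++)
		if (strcmp(items[src], items[dst - 1]))
			items[dst++] = items[src]; /* differs from last kept entry */
	return dst;
}
```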

* Functions for unsorted lists only

`string_list_append`::

        Append a new string to the end of the string_list.  If
        `strdup_strings` is set, then the string argument is copied;
        otherwise the new `string_list_item` refers to the input
        string.

`string_list_append_nodup`::

        Append a new string to the end of the string_list.  The new
        `string_list_item` always refers to the input string, even if
        `strdup_strings` is set.  This function can be used to hand
        ownership of a malloc()ed string to a `string_list` that has
        `strdup_strings` set.

`string_list_sort`::

        Sort the list's entries by string value in `strcmp()` order.

`unsorted_string_list_has_string`::

        It's like `string_list_has_string()` but for unsorted lists.

`unsorted_string_list_lookup`::

        It's like `string_list_lookup()` but for unsorted lists.
+
The above two functions need to look through all items, as opposed to their
counterparts for sorted lists, which perform a binary search.

`unsorted_string_list_delete_item`::

        Remove an item from a string_list. The `string` pointer of the item
        will be freed in case the `strdup_strings` member of the string_list
        is set. The third parameter controls if the `util` pointer of the
        item should be freed or not.

`string_list_split`::
`string_list_split_in_place`::

        Split a string into substrings on a delimiter character and
        append the substrings to a `string_list`.  If `maxsplit` is
        non-negative, then split at most `maxsplit` times.  Return the
        number of substrings appended to the list.
+
`string_list_split` requires a `string_list` that has `strdup_strings`
set to true; it leaves the input string untouched and makes copies of
the substrings in newly-allocated memory.
`string_list_split_in_place` requires a `string_list` that has
`strdup_strings` set to false; it splits the input string in place,
overwriting the delimiter characters with NULs and creating new
string_list_items that point into the original string (the original
string must therefore not be modified or freed while the `string_list`
is in use).
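+
The in-place variant and the effect of `maxsplit` can be illustrated
with a small standalone function. This is a sketch under assumptions,
not git's code: the name `split_in_place` and the fixed-size `out`
array (with its `max_out` limit) are hypothetical and replace the
`string_list` that the real functions append to.
+
```c
#include <stddef.h>
#include <string.h>

/*
 * Split `string` in place at `delim` (which must not be '\0'),
 * overwriting each delimiter with NUL. If `maxsplit` is non-negative,
 * split at most `maxsplit` times. Stores pointers into `string` in
 * `out` (at most `max_out` of them; the rest is dropped) and returns
 * the number stored.
 */
static size_t split_in_place(char *string, int delim, int maxsplit,
			     char **out, size_t max_out)
{
	size_t count = 0;
	char *p = string, *end;

	while (count < max_out) {
		out[count++] = p;
		if (maxsplit >= 0 && count > (size_t)maxsplit)
			break; /* split limit reached: keep the remainder whole */
		end = strchr(p, delim);
		if (!end)
			break; /* no more delimiters */
		*end = '\0';
		p = end + 1;
	}
	return count;
}
```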


Data structures
---------------

* `struct string_list_item`

Represents an item of the list. The `string` member is a pointer to the
string, and you may use the `util` member for any purpose, if you want.

* `struct string_list`

Represents the list itself.

. The array of items is available via the `items` member.
. The `nr` member contains the number of items stored in the list.
. The `alloc` member is used to avoid reallocating at every insertion.
  You should not tamper with it.
. Setting the `strdup_strings` member to 1 will strdup() the strings
  before adding them, see above.
. The `cmp` member, of type `compare_strings_fn`, is used to specify a
  custom compare function; otherwise `strcmp()` is used as the default.
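+
As an illustration of what a custom compare function might look like,
the sketch below orders strings without regard to ASCII case. It
assumes the compare function has the same shape as `strcmp()` (two
`const char *` arguments, an `int` result); the name
`compare_ignore_case` is hypothetical.
+
```c
#include <ctype.h>

/*
 * Hypothetical comparator shaped like strcmp(): orders strings while
 * ignoring ASCII case. Returns <0, 0 or >0.
 */
static int compare_ignore_case(const char *a, const char *b)
{
	for (;; a++, b++) {
		int ca = tolower((unsigned char)*a);
		int cb = tolower((unsigned char)*b);

		if (ca != cb)
			return ca < cb ? -1 : 1;
		if (!ca)
			return 0; /* both strings ended together */
	}
}
```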