Documentation Common to Pack and Http Protocols
===============================================

ABNF Notation
-------------

ABNF notation as described by RFC 5234 is used within the protocol documents,
except the following replacement core rules are used:
----
  HEXDIG    =  DIGIT / "a" / "b" / "c" / "d" / "e" / "f"
----

We also define the following common rules:
----
  NUL       =  %x00
  zero-id   =  40("0")
  obj-id    =  40(HEXDIG)

  refname  =  "HEAD"
  refname /=  "refs/" <see discussion below>
----
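
As an illustration of the `obj-id` and `zero-id` rules, the following is a
minimal Python sketch. It assumes the rules are meant as exactly 40 digits and
that only lowercase hex is accepted, per the replacement `HEXDIG` rule. The
names `is_obj_id` and `is_zero_id` are hypothetical and not part of Git.

```python
import re

# Exactly 40 lowercase hex digits (assumption: the replacement HEXDIG rule
# above excludes the uppercase letters "A" through "F").
OBJ_ID_RE = re.compile(r"[0-9a-f]{40}")
ZERO_ID = "0" * 40


def is_obj_id(value: str) -> bool:
    """Return True if value is exactly 40 lowercase hex digits."""
    return OBJ_ID_RE.fullmatch(value) is not None


def is_zero_id(value: str) -> bool:
    """Return True if value is the all-zero id."""
    return value == ZERO_ID
```

Note that the zero-id also satisfies the `obj-id` rule syntactically, so a
caller that needs to distinguish the two would check `is_zero_id` first.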

A refname is a hierarchical octet string beginning with "refs/" and
not violating the 'git-check-ref-format' command's validation rules.
More specifically, refnames are subject to the following rules:

. They can include slash `/` for hierarchical (directory)
  grouping, but no slash-separated component can begin with a
  dot `.`.

. They must contain at least one `/`. This enforces the presence of a
  category like `heads/`, `tags/` etc. but the actual names are not
  restricted.

. They cannot have two consecutive dots `..` anywhere.

. They cannot have ASCII control characters (i.e. bytes whose
  values are lower than \040, or \177 `DEL`), space, tilde `~`,
  caret `{caret}`, colon `:`, question-mark `?`, asterisk `*`,
  or open bracket `[` anywhere.

. They cannot end with a slash `/` nor a dot `.`.

. They cannot end with the sequence `.lock`.

. They cannot contain a sequence `@{`.

. They cannot contain a `\\`.
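
The rules above can be sketched as a single predicate. This is only an
illustration of the list as written here, not a replacement for
'git-check-ref-format', which remains the authority. The function name
`is_valid_refname` is hypothetical, and the sketch works on Python strings
rather than raw octets for brevity.

```python
# Characters forbidden anywhere in a refname: space, tilde, caret, colon,
# question-mark, asterisk, open bracket and backslash.
FORBIDDEN_CHARS = set(" ~^:?*[\\")


def is_valid_refname(name: str) -> bool:
    """Check name against the refname rules listed above (sketch only)."""
    if name == "HEAD":
        return True
    # The "refs/" prefix also guarantees the "at least one slash" rule.
    if not name.startswith("refs/"):
        return False
    # No slash-separated component may begin with a dot.
    if any(part.startswith(".") for part in name.split("/")):
        return False
    if ".." in name or "@{" in name:
        return False
    # ASCII control characters are those below \040, plus \177 (DEL).
    if any(ord(ch) < 0x20 or ord(ch) == 0x7F or ch in FORBIDDEN_CHARS
           for ch in name):
        return False
    if name.endswith(("/", ".", ".lock")):
        return False
    return True
```

For example, `is_valid_refname("refs/heads/master")` is accepted, while
`"refs/heads/foo.lock"` and `"refs/heads/a..b"` are rejected.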


pkt-line Format
---------------

Much (but not all) of the payload is described around pkt-lines.

A pkt-line is a variable length binary string.  The first four bytes
of the line, the pkt-len, indicate the total length of the line,
in hexadecimal.  The pkt-len includes the 4 bytes used to contain
the length's hexadecimal representation.

A pkt-line MAY contain binary data, so implementors MUST ensure
pkt-line parsing/formatting routines are 8-bit clean.

A non-binary line SHOULD be terminated by an LF, which if present
MUST be included in the total length.

The maximum length of a pkt-line's data component is 65520 bytes.
Implementations MUST NOT send a pkt-line whose length exceeds 65524
(65520 bytes of payload + 4 bytes of length data).

Implementations SHOULD NOT send an empty pkt-line ("0004").

A pkt-line with a length field of 0 ("0000"), called a flush-pkt,
is a special case and MUST be handled differently than an empty
pkt-line ("0004").
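
To make the length arithmetic concrete, here is a minimal Python sketch of a
pkt-line writer. The names `encode_pkt_line`, `MAX_PAYLOAD` and `FLUSH_PKT`
are hypothetical. The sketch assumes the length is written as lowercase hex,
and it does not refuse empty payloads, which a sender would normally avoid.

```python
MAX_PAYLOAD = 65520   # maximum size of the data component, in bytes
FLUSH_PKT = b"0000"   # special case: length field of 0, no payload


def encode_pkt_line(payload: bytes) -> bytes:
    """Prefix payload with its 4-byte hexadecimal pkt-len."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload exceeds %d bytes" % MAX_PAYLOAD)
    # The pkt-len counts its own 4 bytes in addition to the payload.
    return b"%04x" % (len(payload) + 4) + payload
```

Operating on `bytes` throughout keeps the routine 8-bit clean. A text line
would be passed with its trailing LF included, for example
`encode_pkt_line(b"foobar\n")`, so that the LF is counted in the length.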

----
  pkt-line     =  data-pkt / flush-pkt

  data-pkt     =  pkt-len pkt-payload
  pkt-len      =  4(HEXDIG)
  pkt-payload  =  (pkt-len - 4)*(OCTET)

  flush-pkt    = "0000"
----

Examples (as C-style strings):

----
  pkt-line          actual value
  ---------------------------------
  "0006a\n"         "a\n"
  "0005a"           "a"
  "000bfoobar\n"    "foobar\n"
  "0004"            ""
----
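
The examples above can be reproduced with a small reader. This is a sketch
under stated assumptions, not Git's implementation: the name
`parse_pkt_lines` is hypothetical, a flush-pkt is reported as `None` purely as
a convention of this sketch, and rejecting lengths of 1 to 3 or above 65524 is
an inference from the text rather than a stated receiver requirement.

```python
import re
from typing import Iterator, Optional

# pkt-len is four lowercase hex digits (per the replacement HEXDIG rule).
PKT_LEN_RE = re.compile(rb"[0-9a-f]{4}")
MAX_PKT_LEN = 65524  # 65520 bytes of payload + 4 bytes of length data


def parse_pkt_lines(data: bytes) -> Iterator[Optional[bytes]]:
    """Yield each payload in data; a flush-pkt is yielded as None."""
    pos = 0
    while pos < len(data):
        header = data[pos:pos + 4]
        if PKT_LEN_RE.fullmatch(header) is None:
            raise ValueError("invalid pkt-len: %r" % header)
        length = int(header, 16)
        if length == 0:
            # flush-pkt: distinct from an empty pkt-line ("0004").
            yield None
            pos += 4
            continue
        # Assumption: the length includes its own 4 bytes, so 1..3 cannot
        # be valid, and nothing longer than the stated maximum is accepted.
        if length < 4 or length > MAX_PKT_LEN:
            raise ValueError("pkt-len out of range: %d" % length)
        end = pos + length
        if end > len(data):
            raise ValueError("truncated pkt-line")
        yield data[pos + 4:end]
        pos = end
```

Feeding the four example lines followed by a flush-pkt, that is
`b"0006a\n0005a000bfoobar\n00040000"`, yields `b"a\n"`, `b"a"`,
`b"foobar\n"`, `b""` and finally `None`.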