At the core level, git is character encoding agnostic.

 - The pathnames recorded in the index and in the tree objects
   are treated as uninterpreted sequences of non-NUL bytes.
   What readdir(2) returns is what is recorded and compared
   with the data git keeps track of, which in turn is expected
   to be what lstat(2) and creat(2) accept.  There is no such
   thing as pathname encoding translation.

 - The contents of the blob objects are uninterpreted sequences
   of bytes.  There is no encoding translation at the core
   level.

 - The commit log messages are uninterpreted sequences of non-NUL
   bytes.

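As a rough illustration of what "uninterpreted sequences of
bytes" means for pathnames, the sketch below compares names
strictly byte by byte.  The helper `same_path` and the sample
file name are hypothetical and are not part of git; the point is
only that names which look identical on screen are different
paths when their byte sequences differ.

```python
import unicodedata


def same_path(a: bytes, b: bytes) -> bool:
    """Compare two pathnames as raw bytes, with no normalization
    and no encoding translation (hypothetical helper)."""
    return a == b


name = "r\u00e9sum\u00e9.txt"

# Three byte sequences that a terminal may render identically.
nfc = unicodedata.normalize("NFC", name).encode("utf-8")
nfd = unicodedata.normalize("NFD", name).encode("utf-8")
latin1 = name.encode("iso-8859-1")

print(same_path(nfc, nfc))     # True
print(same_path(nfc, nfd))     # False
print(same_path(nfc, latin1))  # False
```
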
Although we encourage encoding commit log messages in UTF-8,
both the core and git Porcelain are designed not to force UTF-8
on projects.  If all participants of a particular project find
it more convenient to use legacy encodings, git does not forbid
it.  However, there are a few things to keep in mind.

. 'git commit' and 'git commit-tree' issue
  a warning if the commit log message given to them does not look
  like a valid UTF-8 string, unless you explicitly say your
  project uses a legacy encoding.  The way to say this is to
  have i18n.commitencoding in the `.git/config` file, like this:
+
------------
[i18n]
        commitencoding = ISO-8859-1
------------
+
Commit objects created with the above setting record the value
of `i18n.commitencoding` in their `encoding` header.  This is to
help other people who look at them later.  Lack of this header
implies that the commit log message is encoded in UTF-8.

. 'git log', 'git show', 'git blame' and friends look at the
  `encoding` header of a commit object, and try to re-code the
  log message into UTF-8 unless otherwise specified.  You can
  specify the desired output encoding with
  `i18n.logoutputencoding` in the `.git/config` file, like this:
+
------------
[i18n]
        logoutputencoding = ISO-8859-1
------------
+
If you do not have this configuration variable, the value of
`i18n.commitencoding` is used instead.

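The two behaviours described in the list above can be sketched
roughly as follows.  This is not git's implementation: the
function names are hypothetical, Python's strict UTF-8 decoder
stands in for git's own "looks like valid UTF-8" check, and
returning the original bytes when re-coding fails is an
assumption of this sketch.

```python
from typing import Optional


def looks_like_utf8(message: bytes) -> bool:
    """Approximate the commit-time check with a strict decode."""
    try:
        message.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def recode_log_message(message: bytes,
                       encoding_header: Optional[str] = None,
                       output_encoding: str = "UTF-8") -> bytes:
    """Re-code a log message for display.

    A missing `encoding` header is taken to mean UTF-8.  If the
    message cannot be re-coded, the raw bytes are returned
    unchanged (an assumption of this sketch).
    """
    source = encoding_header or "UTF-8"
    try:
        return message.decode(source).encode(output_encoding)
    except (UnicodeError, LookupError):
        return message


# A message written in a legacy encoding.
legacy = "Gr\u00fc\u00dfe".encode("iso-8859-1")

print(looks_like_utf8(legacy))                   # False
print(recode_log_message(legacy, "ISO-8859-1"))  # b'Gr\xc3\xbc\xc3\x9fe'
```
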
Note that we deliberately chose not to re-code the commit log
message when a commit is made to force UTF-8 at the commit
object level, because re-coding to UTF-8 is not necessarily a
reversible operation.
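
One way such a round trip can fail to restore the original bytes
is an encoding that allows more than one byte sequence for the
same text.  The sketch below uses Python's `iso2022_jp` codec as
an example of this; the helper `survives_round_trip` is
hypothetical, and the case shown is only one illustration, not a
statement of the specific cases the git authors had in mind.

```python
def survives_round_trip(raw: bytes, legacy_encoding: str) -> bool:
    """Re-code raw bytes to UTF-8 and back again, and report
    whether the result is byte-for-byte identical to the input."""
    as_utf8 = raw.decode(legacy_encoding).encode("utf-8")
    back = as_utf8.decode("utf-8").encode(legacy_encoding)
    return back == raw


plain = b"abc"
# ESC ( B selects ASCII, which is already the active character
# set at the start of the text, so the escape is redundant.
redundant = b"\x1b(Babc"

print(survives_round_trip(plain, "iso2022_jp"))
print(survives_round_trip(redundant, "iso2022_jp"))
```

Both inputs decode to the same text, so the redundant escape
sequence cannot be recovered once the message has been stored as
UTF-8.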