HTTP transfer protocols
=======================

Git supports two HTTP-based transfer protocols: a "dumb" protocol,
which requires only a standard HTTP server on the server end of the
connection, and a "smart" protocol, which requires a Git-aware CGI
(or server module).  This document describes both protocols.

As a design feature, smart clients can automatically upgrade "dumb"
protocol URLs to smart URLs.  This permits all users to have the
same published URL, and the peers automatically select the most
efficient transport available to them.


URL Format
----------

URLs for Git repositories accessed by HTTP use the standard HTTP
URL syntax documented by RFC 1738, so they are of the form:

  http://<host>:<port>/<path>?<searchpart>

Within this documentation the placeholder `$GIT_URL` will stand for
the http:// repository URL entered by the end-user.

Servers SHOULD handle all requests to locations matching `$GIT_URL`, as
both the "smart" and "dumb" HTTP protocols used by Git operate
by appending additional path components onto the end of the user
supplied `$GIT_URL` string.

An example of a dumb client requesting a loose object:

  $GIT_URL:     http://example.com:8080/git/repo.git
  URL request:  http://example.com:8080/git/repo.git/objects/d0/49f6c27a2244e12041955e262a404c7faba355

An example of a smart request to a catch-all gateway:

  $GIT_URL:     http://example.com/daemon.cgi?svc=git&q=
  URL request:  http://example.com/daemon.cgi?svc=git&q=/info/refs&service=git-receive-pack

An example of a request to a submodule:

  $GIT_URL:     http://example.com/git/repo.git/path/submodule.git
  URL request:  http://example.com/git/repo.git/path/submodule.git/info/refs

Clients MUST strip a trailing `/`, if present, from the user supplied
`$GIT_URL` string to prevent empty path tokens (`//`) from appearing
in any URL sent to a server.  Compatible clients MUST expand
`$GIT_URL/info/refs` as `foo/info/refs` and not `foo//info/refs`.
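
The trailing-slash rule can be sketched in a few lines of Python.  This
is only an illustration, not part of the protocol; the function name
`expand_git_url` is hypothetical.

```python
def expand_git_url(git_url: str, suffix: str) -> str:
    """Append a path suffix such as "/info/refs" to a user-supplied $GIT_URL."""
    # Strip one trailing "/" so that no empty path token ("//") is produced.
    if git_url.endswith("/"):
        git_url = git_url[:-1]
    return git_url + suffix
```

With this sketch both `foo` and `foo/` expand to `foo/info/refs`.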


Authentication
--------------

Standard HTTP authentication is used if authentication is required
to access a repository, and MAY be configured and enforced by the
HTTP server software.

Because Git repositories are accessed by standard path components
server administrators MAY use directory based permissions within
their HTTP server to control repository access.

Clients SHOULD support Basic authentication as described by RFC 2617.
Servers SHOULD support Basic authentication by relying upon the
HTTP server placed in front of the Git server software.

Servers SHOULD NOT require HTTP cookies for the purposes of
authentication or access control.

Clients and servers MAY support other common forms of HTTP based
authentication, such as Digest authentication.


SSL
---

Clients and servers SHOULD support SSL, particularly to protect
passwords when relying on Basic HTTP authentication.


Session State
-------------

The Git over HTTP protocol (much like HTTP itself) is stateless
from the perspective of the HTTP server side.  All state MUST be
retained and managed by the client process.  This permits simple
round-robin load-balancing on the server side, without needing to
worry about state management.

Clients MUST NOT require state management on the server side in
order to function correctly.

Servers MUST NOT require HTTP cookies in order to function correctly.
Clients MAY store and forward HTTP cookies during request processing
as described by RFC 2616 (HTTP/1.1).  Servers SHOULD ignore any
cookies sent by a client.


General Request Processing
--------------------------

Except where noted, all standard HTTP behavior SHOULD be assumed
by both client and server.  This includes (but is not necessarily
limited to):

If there is no repository at `$GIT_URL`, or the resource pointed to by a
location matching `$GIT_URL` does not exist, the server MUST NOT respond
with a `200 OK` response.  A server SHOULD respond with
`404 Not Found`, `410 Gone`, or any other suitable HTTP status code
which does not imply the resource exists as requested.

If there is a repository at `$GIT_URL`, but access is not currently
permitted, the server MUST respond with the `403 Forbidden` HTTP
status code.

Servers SHOULD support both HTTP 1.0 and HTTP 1.1.
Servers SHOULD support chunked encoding for both request and response
bodies.

Clients SHOULD support both HTTP 1.0 and HTTP 1.1.
Clients SHOULD support chunked encoding for both request and response
bodies.

Servers MAY return ETag and/or Last-Modified headers.

Clients MAY revalidate cached entities by including If-Modified-Since
and/or If-None-Match request headers.

Servers MAY return `304 Not Modified` if the relevant headers appear
in the request and the entity has not changed.  Clients MUST treat
`304 Not Modified` identically to `200 OK` by reusing the cached entity.

Clients MAY reuse a cached entity without revalidation if the
Cache-Control and/or Expires header permits caching.  Clients and
servers MUST follow RFC 2616 for cache controls.


Discovering References
----------------------

All HTTP clients MUST begin either a fetch or a push exchange by
discovering the references available on the remote repository.

Dumb Clients
~~~~~~~~~~~~

HTTP clients that only support the "dumb" protocol MUST discover
references by making a request for the special info/refs file of
the repository.

Dumb HTTP clients MUST make a `GET` request to `$GIT_URL/info/refs`,
without any search/query parameters.

   C: GET $GIT_URL/info/refs HTTP/1.0

   S: 200 OK
   S:
   S: 95dcfa3633004da0049d3d0fa03f80589cbcaf31  refs/heads/maint
   S: d049f6c27a2244e12041955e262a404c7faba355  refs/heads/master
   S: 2cb58b79488a98d2721cea644875a8dd0026b115  refs/tags/v1.0
   S: a3c2e2402b99163d1d59756e5f207ae21cccba4c  refs/tags/v1.0^{}

The Content-Type of the returned info/refs entity SHOULD be
`text/plain; charset=utf-8`, but MAY be any content type.
Clients MUST NOT attempt to validate the returned Content-Type.
Dumb servers MUST NOT return a content type starting with
`application/x-git-`.

Cache-Control headers MAY be returned to disable caching of the
returned entity.

When examining the response clients SHOULD only examine the HTTP
status code.  Valid responses are `200 OK`, or `304 Not Modified`.

The returned content is a UNIX formatted text file describing
each ref and its known value.  The file SHOULD be sorted by name
according to the C locale ordering.  The file SHOULD NOT include
the default ref named `HEAD`.

  info_refs   =  *( ref_record )
  ref_record  =  any_ref / peeled_ref

  any_ref     =  obj-id HTAB refname LF
  peeled_ref  =  obj-id HTAB refname LF
                 obj-id HTAB refname "^{}" LF
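
A client could read this format along the following lines.  This is a
minimal Python sketch under the grammar above; the function name
`parse_info_refs` is hypothetical, and peeled entries are simply kept
under their `^{}` name.

```python
def parse_info_refs(text: str) -> dict:
    """Map each refname in a dumb info/refs entity to its object id."""
    refs = {}
    for line in text.split("\n"):
        if not line:
            continue  # skip the empty string after the final LF
        # Each record is: obj-id HTAB refname
        obj_id, refname = line.split("\t", 1)
        refs[refname] = obj_id
    return refs
```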

Smart Clients
~~~~~~~~~~~~~

HTTP clients that support the "smart" protocol (or both the
"smart" and "dumb" protocols) MUST discover references by making
a parameterized request for the info/refs file of the repository.

The request MUST contain exactly one query parameter,
`service=$servicename`, where `$servicename` MUST be the service
name the client wishes to contact to complete the operation.
The request MUST NOT contain additional query parameters.

   C: GET $GIT_URL/info/refs?service=git-upload-pack HTTP/1.0
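
The discovery URL can be built as in the Python sketch below.  The
function name `info_refs_url` is hypothetical.  The choice of `&` when
`$GIT_URL` already contains a `?` is an assumption inferred from the
catch-all gateway example in the URL Format section; it is not stated
as a rule in this document.

```python
def info_refs_url(git_url: str, service: str) -> str:
    """Build the smart ref-discovery URL for a service name."""
    # Strip one trailing "/" to avoid an empty path token ("//").
    if git_url.endswith("/"):
        git_url = git_url[:-1]
    url = git_url + "/info/refs"
    # Assumption: reuse an existing query string with "&", otherwise start one.
    separator = "&" if "?" in url else "?"
    return url + separator + "service=" + service
```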

dumb server reply:

   S: 200 OK
   S:
   S: 95dcfa3633004da0049d3d0fa03f80589cbcaf31  refs/heads/maint
   S: d049f6c27a2244e12041955e262a404c7faba355  refs/heads/master
   S: 2cb58b79488a98d2721cea644875a8dd0026b115  refs/tags/v1.0
   S: a3c2e2402b99163d1d59756e5f207ae21cccba4c  refs/tags/v1.0^{}

smart server reply:

   S: 200 OK
   S: Content-Type: application/x-git-upload-pack-advertisement
   S: Cache-Control: no-cache
   S:
   S: 001e# service=git-upload-pack\n
   S: 0000
   S: 004895dcfa3633004da0049d3d0fa03f80589cbcaf31 refs/heads/maint\0multi_ack\n
   S: 003fd049f6c27a2244e12041955e262a404c7faba355 refs/heads/master\n
   S: 003c2cb58b79488a98d2721cea644875a8dd0026b115 refs/tags/v1.0\n
   S: 003fa3c2e2402b99163d1d59756e5f207ae21cccba4c refs/tags/v1.0^{}\n
   S: 0000
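
The four hex digits at the start of each line in the smart reply are
the pkt-line length, which counts the payload plus the four length
digits themselves (see Documentation/technical/pack-protocol.txt).  A
minimal Python sketch of the framing follows; the names `pkt_line` and
`FLUSH_PKT` are hypothetical.

```python
FLUSH_PKT = b"0000"  # the special zero-length marker


def pkt_line(payload: bytes) -> bytes:
    """Frame a payload as one pkt-line."""
    # The length prefix covers the payload and the 4 hex digits themselves.
    return b"%04x" % (len(payload) + 4) + payload
```

For instance, the 26-byte payload `# service=git-upload-pack\n` is
framed with the prefix `001e` (30 in hexadecimal).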

The client may send Extra Parameters (see
Documentation/technical/pack-protocol.txt) as a colon-separated string
in the Git-Protocol HTTP header.

Dumb Server Response
^^^^^^^^^^^^^^^^^^^^
Dumb servers MUST respond with the dumb server reply format.

See the prior section under dumb clients for a more detailed
description of the dumb server response.

Smart Server Response
^^^^^^^^^^^^^^^^^^^^^
If the server does not recognize the requested service name, or the
requested service name has been disabled by the server administrator,
the server MUST respond with the `403 Forbidden` HTTP status code.

Otherwise, smart servers MUST respond with the smart server reply
format for the requested service name.

Cache-Control headers SHOULD be used to disable caching of the
returned entity.

The Content-Type MUST be `application/x-$servicename-advertisement`.
Clients SHOULD fall back to the dumb protocol if another content
type is returned.  When falling back to the dumb protocol clients
SHOULD NOT make an additional request to `$GIT_URL/info/refs`, but
instead SHOULD use the response already in hand.  Clients MUST NOT
continue if they do not support the dumb protocol.

Clients MUST validate that the status code is either `200 OK` or
`304 Not Modified`.

Clients MUST validate that the first five bytes of the response entity
match the regex `^[0-9a-f]{4}#`.  If this test fails, clients
MUST NOT continue.
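
The client-side checks above could be combined as in this Python
sketch.  The function name `classify_info_refs` and its return values
are hypothetical, the Content-Type is compared as an exact string for
simplicity, and for a `304 Not Modified` response `body` is assumed to
be the cached entity.

```python
import re


def classify_info_refs(status: int, content_type: str, body: bytes, service: str) -> str:
    """Return "smart" or "dumb" for an info/refs response, or raise ValueError."""
    if status not in (200, 304):
        raise ValueError("unexpected status code: %d" % status)
    # Any other content type means falling back to the dumb protocol,
    # reusing the response already in hand.
    if content_type != "application/x-%s-advertisement" % service:
        return "dumb"
    # A smart reply must start with a pkt-line length followed by "#".
    if re.match(rb"^[0-9a-f]{4}#", body) is None:
        raise ValueError("smart reply does not start with a pkt-line header")
    return "smart"
```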

Clients MUST parse the entire response as a sequence of pkt-line
records.

Clients MUST verify the first pkt-line is `# service=$servicename`.
Servers MUST set $servicename to be the request parameter value.
Servers SHOULD include an LF at the end of this line.
Clients MUST ignore an LF at the end of the line.

Servers MUST terminate the response with the magic `0000` end
pkt-line marker.

The returned response is a pkt-line stream describing each ref and
its known value.  The stream SHOULD be sorted by name according to
the C locale ordering.  The stream SHOULD include the default ref
named `HEAD` as the first ref.  The stream MUST include capability
declarations behind a NUL on the first ref.

The returned response contains "version 1" if "version=1" was sent as an
Extra Parameter.

  smart_reply     =  PKT-LINE("# service=$servicename" LF)
                     "0000"
                     *1("version 1")
                     ref_list
                     "0000"
  ref_list        =  empty_list / non_empty_list

  empty_list      =  PKT-LINE(zero-id SP "capabilities^{}" NUL cap-list LF)

  non_empty_list  =  PKT-LINE(obj-id SP name NUL cap-list LF)
                     *ref_record

  cap-list        =  capability *(SP capability)
  capability      =  1*(LC_ALPHA / DIGIT / "-" / "_")
  LC_ALPHA        =  %x61-7A

  ref_record      =  any_ref / peeled_ref
  any_ref         =  PKT-LINE(obj-id SP name LF)
  peeled_ref      =  PKT-LINE(obj-id SP name LF)
                     PKT-LINE(obj-id SP name "^{}" LF)
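
A reader for this grammar might look like the following Python sketch.
It is an illustration only: the names `read_pkt_lines` and
`parse_smart_reply` are hypothetical, error handling is reduced to
`ValueError`, and the `capabilities^{}` placeholder of an empty list is
dropped from the result.

```python
def read_pkt_lines(data: bytes):
    """Yield each pkt-line payload; a flush-pkt ("0000") is yielded as None."""
    pos = 0
    while pos < len(data):
        length = int(data[pos:pos + 4], 16)
        if length == 0:
            yield None
            pos += 4
            continue
        if length < 4 or pos + length > len(data):
            raise ValueError("malformed pkt-line")
        yield data[pos + 4:pos + length]
        pos += length


def parse_smart_reply(data: bytes, service: str):
    """Return (refs, capabilities) parsed from a smart info/refs reply."""
    pkts = list(read_pkt_lines(data))
    if len(pkts) < 3 or pkts[0] is None or pkts[1] is not None or pkts[-1] is not None:
        raise ValueError("unexpected reply framing")
    # The LF at the end of the service line is optional and ignored.
    if pkts[0].rstrip(b"\n") != b"# service=" + service.encode():
        raise ValueError("service name mismatch")
    refs, caps = {}, []
    for payload in pkts[2:-1]:
        if payload is None:
            raise ValueError("unexpected flush-pkt inside ref list")
        line = payload.rstrip(b"\n")
        if line == b"version 1":
            continue
        if b"\0" in line:
            # Capabilities follow a NUL on the first ref.
            line, cap_bytes = line.split(b"\0", 1)
            caps = cap_bytes.decode().split()
        obj_id, name = line.split(b" ", 1)
        if name != b"capabilities^{}":
            refs[name.decode()] = obj_id.decode()
    return refs, caps
```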


Smart Service git-upload-pack
------------------------------
This service reads from the repository pointed to by `$GIT_URL`.

Clients MUST first perform ref discovery with
`$GIT_URL/info/refs?service=git-upload-pack`.

   C: POST $GIT_URL/git-upload-pack HTTP/1.0
   C: Content-Type: application/x-git-upload-pack-request
   C:
   C: 0032want 0a53e9ddeaddad63ad106860237bbf53411d11a7\n
   C: 0032have 441b40d833fdfa93eb2908e52742248faf0ee993\n
   C: 0000

   S: 200 OK
   S: Content-Type: application/x-git-upload-pack-result
   S: Cache-Control: no-cache
   S:
   S: ....ACK %s, continue
   S: ....NAK

Clients MUST NOT reuse or revalidate a cached response.
Servers MUST include sufficient Cache-Control headers
to prevent caching of the response.

Servers SHOULD support all capabilities defined here.

Clients MUST send at least one "want" command in the request body.
Clients MUST NOT reference an id in a "want" command which did not
appear in the response obtained through ref discovery unless the
server advertises capability `allow-tip-sha1-in-want` or
`allow-reachable-sha1-in-want`.

  compute_request   =  want_list
                       have_list
                       request_end
  request_end       =  "0000" / "done"

  want_list         =  PKT-LINE(want NUL cap_list LF)
                       *(want_pkt)
  want_pkt          =  PKT-LINE(want LF)
  want              =  "want" SP id
  cap_list          =  *(SP capability) SP

  have_list         =  *PKT-LINE("have" SP id LF)
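
The request body shown in the example above can be assembled as in this
Python sketch.  The function names are hypothetical, and the sketch
deliberately mirrors the example exchange, in which no capability list
is attached to the first "want"; it does not attempt to cover the
`cap_list` part of the grammar.

```python
def pkt_line(payload: bytes) -> bytes:
    """Frame a payload as one pkt-line (length prefix includes itself)."""
    return b"%04x" % (len(payload) + 4) + payload


def upload_pack_request(wants, haves, done=False) -> bytes:
    """Build a git-upload-pack request body from hex object ids."""
    if not wants:
        raise ValueError("at least one want is required")
    body = b"".join(pkt_line(b"want " + w.encode() + b"\n") for w in wants)
    body += b"".join(pkt_line(b"have " + h.encode() + b"\n") for h in haves)
    # The request ends with either a flush-pkt or a "done" command.
    body += pkt_line(b"done\n") if done else b"0000"
    return body
```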

TODO: Document this further.

The Negotiation Algorithm
~~~~~~~~~~~~~~~~~~~~~~~~~
The computation to select the minimal pack proceeds as follows
(C = client, S = server):

'init step:'

C: Use ref discovery to obtain the advertised refs.

C: Place any object seen into set `advertised`.

C: Build an empty set, `common`, to hold the objects that are later
   determined to be on both ends.

C: Build a set, `want`, of the objects from `advertised` the client
   wants to fetch, based on what it saw during ref discovery.

C: Start a queue, `c_pending`, ordered by commit time (popping newest
   first).  Add all client refs.  When a commit is popped from
   the queue its parents SHOULD be automatically inserted back.
   Commits MUST only enter the queue once.
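
One way to realize `c_pending` is a heap keyed on negated commit time,
as in the Python sketch below.  The class name `PendingQueue` and the
`commits` mapping (commit id to a `(commit_time, parents)` tuple) are
hypothetical stand-ins for a real object database.

```python
import heapq


class PendingQueue:
    """Queue of commits that pops the newest commit first."""

    def __init__(self, commits):
        self.commits = commits  # commit id -> (commit_time, [parent ids])
        self.heap = []
        self.seen = set()       # ids that have ever entered the queue

    def push(self, commit_id):
        # A commit only enters the queue once.
        if commit_id in self.seen:
            return
        self.seen.add(commit_id)
        commit_time, _parents = self.commits[commit_id]
        # heapq is a min-heap, so negate the time to pop the newest first.
        heapq.heappush(self.heap, (-commit_time, commit_id))

    def pop(self):
        _key, commit_id = heapq.heappop(self.heap)
        # Popping a commit inserts its parents back into the queue.
        for parent in self.commits[commit_id][1]:
            self.push(parent)
        return commit_id

    def __len__(self):
        return len(self.heap)
```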

'one compute step:'

C: Send one `$GIT_URL/git-upload-pack` request:

   C: 0032want <want #1>...............................
   C: 0032want <want #2>...............................
   ....
   C: 0032have <common #1>.............................
   C: 0032have <common #2>.............................
   ....
   C: 0032have <have #1>...............................
   C: 0032have <have #2>...............................
   ....
   C: 0000

The stream is organized into "commands", with each command
appearing by itself in a pkt-line.  Within a command line,
the text leading up to the first space is the command name,
and the remainder of the line to the first LF is the value.
Command lines are terminated with an LF as the last byte of
the pkt-line value.

Commands MUST appear in the following order, if they appear
at all in the request stream:

* "want"
* "have"

The stream is terminated by a pkt-line flush (`0000`).

A single "want" or "have" command MUST have one hex formatted
SHA-1 as its value.  Multiple SHA-1s MUST be sent by sending
multiple commands.

The `have` list is created by popping the first 32 commits
from `c_pending`.  Fewer can be supplied if `c_pending` empties.

If the client has sent 256 "have" commits and has not yet
received one of those back from `s_common`, or the client has
emptied `c_pending`, it SHOULD include a "done" command to let
the server know it won't proceed:

   C: 0009done
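
The batching rule can be sketched in Python as follows.  This is one
possible reading, not a definitive implementation: the function name
`next_have_batch` is hypothetical, `c_pending` is modelled as a deque
already ordered newest first, and `unacked_haves` is assumed to be the
number of "have" commits sent so far without any of them being reported
as common.

```python
from collections import deque


def next_have_batch(c_pending: deque, unacked_haves: int,
                    batch_size: int = 32, give_up_after: int = 256):
    """Return (haves, done) for the next compute step."""
    haves = []
    # Pop up to 32 commits; fewer if the queue empties first.
    while c_pending and len(haves) < batch_size:
        haves.append(c_pending.popleft())
    # Send "done" once the queue is empty or 256 haves went unacknowledged.
    done = not c_pending or unacked_haves + len(haves) >= give_up_after
    return haves, done
```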

S: Parse the git-upload-pack request:

Verify all objects in `want` are directly reachable from refs.

The server MAY walk backwards through history or through
the reflog to permit slightly stale requests.

If no "want" objects are received, send an error:
TODO: Define error if no "want" lines are requested.

If any "want" object is not reachable, send an error:
TODO: Define error if an invalid "want" is requested.

Create an empty list, `s_common`.

If "have" was sent:

Loop through the objects in the order supplied by the client.

For each object, if the server has the object reachable from
a ref, add it to `s_common`.  If a commit is added to `s_common`,
do not add any ancestors, even if they also appear in `have`.
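
The server-side loop can be sketched in Python as below.  The function
name `compute_s_common` is hypothetical, and the `parents` mapping and
`reachable` set are assumed stand-ins for the server's commit graph and
for the set of objects reachable from its refs.

```python
def compute_s_common(haves, parents, reachable):
    """Return s_common for the "have" ids, in the order the client sent them."""
    s_common = []
    covered = set()  # ancestors of commits already placed in s_common
    for obj in haves:
        if obj not in reachable or obj in covered or obj in s_common:
            continue
        s_common.append(obj)
        # Mark every ancestor so later "have" lines for them are skipped.
        stack = list(parents.get(obj, ()))
        while stack:
            ancestor = stack.pop()
            if ancestor not in covered:
                covered.add(ancestor)
                stack.extend(parents.get(ancestor, ()))
    return s_common
```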

S: Send the git-upload-pack response:

If the server has found a closed set of objects to pack or the
request ends with "done", it replies with the pack.
TODO: Document the pack based response

   S: PACK...

The returned stream is the side-band-64k protocol supported
by the git-upload-pack service, and the pack is embedded into
stream 1.  Progress messages from the server side MAY appear
in stream 2.

Here a "closed set of objects" is defined to have at least
one path from every "want" to at least one "common" object.

If the server needs more information, it replies with a
status continue response:
TODO: Document the non-pack response

C: Parse the upload-pack response:
   TODO: Document parsing response

'Do another compute step.'


Smart Service git-receive-pack
------------------------------
This service writes to the repository pointed to by `$GIT_URL`.

Clients MUST first perform ref discovery with
`$GIT_URL/info/refs?service=git-receive-pack`.

   C: POST $GIT_URL/git-receive-pack HTTP/1.0
   C: Content-Type: application/x-git-receive-pack-request
   C:
   C: ....0a53e9ddeaddad63ad106860237bbf53411d11a7 441b40d833fdfa93eb2908e52742248faf0ee993 refs/heads/maint\0 report-status
   C: 0000
   C: PACK....

   S: 200 OK
   S: Content-Type: application/x-git-receive-pack-result
   S: Cache-Control: no-cache
   S:
   S: ....

Clients MUST NOT reuse or revalidate a cached response.
Servers MUST include sufficient Cache-Control headers
to prevent caching of the response.

Servers SHOULD support all capabilities defined here.

Clients MUST send at least one command in the request body.
Within the command portion of the request body clients SHOULD send
the id obtained through ref discovery as old_id.

  update_request  =  command_list
                     "PACK" <binary data>

  command_list    =  PKT-LINE(command NUL cap_list LF)
                     *(command_pkt)
  command_pkt     =  PKT-LINE(command LF)
  cap_list        =  *(SP capability) SP

  command         =  create / delete / update
  create          =  zero-id SP new_id SP name
  delete          =  old_id SP zero-id SP name
  update          =  old_id SP new_id SP name
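
The command portion of the request could be assembled as in this Python
sketch.  The function names are hypothetical.  The sketch follows the
grammar above literally, including the trailing SP of `cap_list`, and
adds the flush-pkt shown in the example exchange; the pack data that
follows is not covered here.

```python
ZERO_ID = "0" * 40  # the zero-id used for create and delete


def pkt_line(payload: bytes) -> bytes:
    """Frame a payload as one pkt-line (length prefix includes itself)."""
    return b"%04x" % (len(payload) + 4) + payload


def command(name, old_id=None, new_id=None) -> str:
    """Build a create (no old_id), delete (no new_id) or update command."""
    if old_id is None and new_id is None:
        raise ValueError("a command needs an old_id, a new_id, or both")
    return "%s %s %s" % (old_id or ZERO_ID, new_id or ZERO_ID, name)


def command_list(commands, capabilities) -> bytes:
    """Frame the commands; capabilities ride on the first one after a NUL."""
    if not commands:
        raise ValueError("at least one command is required")
    cap_list = "".join(" " + cap for cap in capabilities) + " "
    first = commands[0] + "\0" + cap_list + "\n"
    pkts = [pkt_line(first.encode())]
    pkts += [pkt_line((c + "\n").encode()) for c in commands[1:]]
    return b"".join(pkts) + b"0000"
```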

TODO: Document this further.


References
----------

http://www.ietf.org/rfc/rfc1738.txt[RFC 1738: Uniform Resource Locators (URL)]
http://www.ietf.org/rfc/rfc2616.txt[RFC 2616: Hypertext Transfer Protocol -- HTTP/1.1]
link:technical/pack-protocol.html
link:technical/protocol-capabilities.html
 518link:technical/protocol-capabilities.html