Documentation / technical / pack-heuristics.txton commit sha1_file: allow prepare_alt_odb to handle arbitrary repositories (13068bf)
   1Concerning Git's Packing Heuristics
   2===================================
   3
   4        Oh, here's a really stupid question:
   5
   6                  Where do I go
   7               to learn the details
   8            of Git's packing heuristics?
   9
  10Be careful what you ask!
  11
  12Followers of the Git, please open the Git IRC Log and turn to
  13February 10, 2006.
  14
  15It's a rare occasion, and we are joined by the King Git Himself,
  16Linus Torvalds (linus).  Nathaniel Smith, (njs`), has the floor
  17and seeks enlightenment.  Others are present, but silent.
  18
  19Let's listen in!
  20
  21    <njs`> Oh, here's a really stupid question -- where do I go to
  22        learn the details of Git's packing heuristics?  google avails
  23        me not, reading the source didn't help a lot, and wading
  24        through the whole mailing list seems less efficient than any
  25        of that.
  26
  27It is a bold start!  A plea for help combined with a simultaneous
  28tri-part attack on some of the tried and true mainstays in the quest
  29for enlightenment.  Brash accusations of google being useless. Hubris!
  30Maligning the source.  Heresy!  Disdain for the mailing list archives.
  31Woe.
  32
  33    <pasky> yes, the packing-related delta stuff is somewhat
  34        mysterious even for me ;)
  35
  36Ah!  Modesty after all.
  37
  38    <linus> njs, I don't think the docs exist. That's something where
  39         I don't think anybody else than me even really got involved.
  40         Most of the rest of Git others have been busy with (especially
  41         Junio), but packing nobody touched after I did it.
  42
  43It's cryptic, yet vague.  Linus in style for sure.  Wise men
  44interpret this as an apology.  A few argue it is merely a
  45statement of fact.
  46
  47    <njs`> I guess the next step is "read the source again", but I
  48        have to build up a certain level of gumption first :-)
  49
  50Indeed!  On both points.
  51
  52    <linus> The packing heuristic is actually really really simple.
  53
  54Bait...
  55
  56    <linus> But strange.
  57
  58And switch.  That ought to do it!
  59
  60    <linus> Remember: Git really doesn't follow files. So what it does is
  61        - generate a list of all objects
  62        - sort the list according to magic heuristics
  63        - walk the list, using a sliding window, seeing if an object
  64          can be diffed against another object in the window
  65        - write out the list in recency order
  66
  67The traditional understatement:
  68
  69    <njs`> I suspect that what I'm missing is the precise definition of
  70        the word "magic"
  71
  72The traditional insight:
  73
  74    <pasky> yes
  75
  76And Babel-like confusion flowed.
  77
  78    <njs`> oh, hmm, and I'm not sure what this sliding window means either
  79
  80    <pasky> iirc, it appeared to me to be just the sha1 of the object
  81        when reading the code casually ...
  82
  83        ... which simply doesn't sound as a very good heuristics, though ;)
  84
  85    <njs`> .....and recency order.  okay, I think it's clear I didn't
  86       even realize how much I wasn't realizing :-)
  87
  88Ah, grasshopper!  And thus the enlightenment begins anew.
  89
  90    <linus> The "magic" is actually in theory totally arbitrary.
  91        ANY order will give you a working pack, but no, it's not
  92        ordered by SHA-1.
  93
  94        Before talking about the ordering for the sliding delta
  95        window, let's talk about the recency order. That's more
  96        important in one way.
  97
  98    <njs`> Right, but if all you want is a working way to pack things
  99        together, you could just use cat and save yourself some
 100        trouble...
 101
 102Waaait for it....
 103
 104    <linus> The recency ordering (which is basically: put objects
 105        _physically_ into the pack in the order that they are
 106        "reachable" from the head) is important.
 107
 108    <njs`> okay
 109
 110    <linus> It's important because that's the thing that gives packs
 111        good locality. It keeps the objects close to the head (whether
 112        they are old or new, but they are _reachable_ from the head)
 113        at the head of the pack. So packs actually have absolutely
 114        _wonderful_ IO patterns.
 115
 116Read that again, because it is important.
 117
 118    <linus> But recency ordering is totally useless for deciding how
 119        to actually generate the deltas, so the delta ordering is
 120        something else.
 121
 122        The delta ordering is (wait for it):
 123        - first sort by the "basename" of the object, as defined by
 124          the name the object was _first_ reached through when
 125          generating the object list
 126        - within the same basename, sort by size of the object
 127        - but always sort different types separately (commits first).
 128
 129        That's not exactly it, but it's very close.
 130
 131    <njs`> The "_first_ reached" thing is not too important, just you
 132        need some way to break ties since the same objects may be
 133        reachable many ways, yes?
 134
 135And as if to clarify:
 136
 137    <linus> The point is that it's all really just any random
 138        heuristic, and the ordering is totally unimportant for
 139        correctness, but it helps a lot if the heuristic gives
 140        "clumping" for things that are likely to delta well against
 141        each other.
 142
 143It is an important point, so secretly, I did my own research and have
 144included my results below.  To be fair, it has changed some over time.
 145And through the magic of Revisionistic History, I draw upon this entry
 146from The Git IRC Logs on my father's birthday, March 1:
 147
 148    <gitster> The quote from the above linus should be rewritten a
 149        bit (wait for it):
 150        - first sort by type.  Different objects never delta with
 151          each other.
 152        - then sort by filename/dirname.  hash of the basename
 153          occupies the top BITS_PER_INT-DIR_BITS bits, and bottom
 154          DIR_BITS are for the hash of leading path elements.
 155        - then if we are doing "thin" pack, the objects we are _not_
 156          going to pack but we know about are sorted earlier than
 157          other objects.
 158        - and finally sort by size, larger to smaller.
 159
 160In one swell-foop, clarification and obscurification!  Nonetheless,
 161authoritative.  Cryptic, yet concise.  It even solicits notions of
 162quotes from The Source Code.  Clearly, more study is needed.
 163
 164    <gitster> That's the sort order.  What this means is:
 165        - we do not delta different object types.
 166        - we prefer to delta the objects with the same full path, but
 167          allow files with the same name from different directories.
 168        - we always prefer to delta against objects we are not going
 169          to send, if there are some.
 170        - we prefer to delta against larger objects, so that we have
 171          lots of removals.
 172
 173        The penultimate rule is for "thin" packs.  It is used when
 174        the other side is known to have such objects.
 175
 176There it is again. "Thin" packs.  I'm thinking to myself, "What
 177is a 'thin' pack?"  So I ask:
 178
 179    <jdl> What is a "thin" pack?
 180
 181    <gitster> Use of --objects-edge to rev-list as the upstream of
 182        pack-objects.  The pack transfer protocol negotiates that.
 183
 184Woo hoo!  Cleared that _right_ up!
 185
 186    <gitster> There are two directions - push and fetch.
 187
 188There!  Did you see it?  It is not '"push" and "pull"'!  How often the
 189confusion has started here.  So casually mentioned, too!
 190
 191    <gitster> For push, git-send-pack invokes git-receive-pack on the
 192        other end.  The receive-pack says "I have up to these commits".
 193        send-pack looks at them, and computes what are missing from
 194        the other end.  So "thin" could be the default there.
 195
 196        In the other direction, fetch, git-fetch-pack and
 197        git-clone-pack invokes git-upload-pack on the other end
 198        (via ssh or by talking to the daemon).
 199
 200        There are two cases: fetch-pack with -k and clone-pack is one,
 201        fetch-pack without -k is the other.  clone-pack and fetch-pack
 202        with -k will keep the downloaded packfile without expanded, so
 203        we do not use thin pack transfer.  Otherwise, the generated
 204        pack will have delta without base object in the same pack.
 205
 206        But fetch-pack without -k will explode the received pack into
 207        individual objects, so we automatically ask upload-pack to
 208        give us a thin pack if upload-pack supports it.
 209
 210OK then.
 211
 212Uh.
 213
 214Let's return to the previous conversation still in progress.
 215
 216    <njs`> and "basename" means something like "the tail of end of
 217        path of file objects and dir objects, as per basename(3), and
 218        we just declare all commit and tag objects to have the same
 219        basename" or something?
 220
 221Luckily, that too is a point that gitster clarified for us!
 222
 223If I might add, the trick is to make files that _might_ be similar be
 224located close to each other in the hash buckets based on their file
 225names.  It used to be that "foo/Makefile", "bar/baz/quux/Makefile" and
 226"Makefile" all landed in the same bucket due to their common basename,
 227"Makefile". However, now they land in "close" buckets.
 228
 229The algorithm allows not just for the _same_ bucket, but for _close_
 230buckets to be considered delta candidates.  The rationale is
 231essentially that files, like Makefiles, often have very similar
 232content no matter what directory they live in.
 233
 234    <linus> I played around with different delta algorithms, and with
 235        making the "delta window" bigger, but having too big of a
 236        sliding window makes it very expensive to generate the pack:
 237        you need to compare every object with a _ton_ of other objects.
 238
 239        There are a number of other trivial heuristics too, which
 240        basically boil down to "don't bother even trying to delta this
 241        pair" if we can tell before-hand that the delta isn't worth it
 242        (due to size differences, where we can take a previous delta
 243        result into account to decide that "ok, no point in trying
 244        that one, it will be worse").
 245
 246        End result: packing is actually very size efficient. It's
 247        somewhat CPU-wasteful, but on the other hand, since you're
 248        really only supposed to do it maybe once a month (and you can
 249        do it during the night), nobody really seems to care.
 250
 251Nice Engineering Touch, there.  Find when it doesn't matter, and
 252proclaim it a non-issue.  Good style too!
 253
 254    <njs`> So, just to repeat to see if I'm following, we start by
 255        getting a list of the objects we want to pack, we sort it by
 256        this heuristic (basically lexicographically on the tuple
 257        (type, basename, size)).
 258
 259        Then we walk through this list, and calculate a delta of
 260        each object against the last n (tunable parameter) objects,
 261        and pick the smallest of these deltas.
 262
 263Vastly simplified, but the essence is there!
 264
 265    <linus> Correct.
 266
 267    <njs`> And then once we have picked a delta or fulltext to
 268        represent each object, we re-sort by recency, and write them
 269        out in that order.
 270
 271    <linus> Yup. Some other small details:
 272
 273And of course there is the "Other Shoe" Factor too.
 274
 275    <linus> - We limit the delta depth to another magic value (right
 276        now both the window and delta depth magic values are just "10")
 277
 278    <njs`> Hrm, my intuition is that you'd end up with really _bad_ IO
 279        patterns, because the things you want are near by, but to
 280        actually reconstruct them you may have to jump all over in
 281        random ways.
 282
 283    <linus> - When we write out a delta, and we haven't yet written
 284        out the object it is a delta against, we write out the base
 285        object first.  And no, when we reconstruct them, we actually
 286        get nice IO patterns, because:
 287        - larger objects tend to be "more recent" (Linus' law: files grow)
 288        - we actively try to generate deltas from a larger object to a
 289          smaller one
 290        - this means that the top-of-tree very seldom has deltas
 291          (i.e. deltas in _practice_ are "backwards deltas")
 292
 293Again, we should reread that whole paragraph.  Not just because
 294Linus has slipped Linus's Law in there on us, but because it is
 295important.  Let's make sure we clarify some of the points here:
 296
 297    <njs`> So the point is just that in practice, delta order and
 298        recency order match each other quite well.
 299
 300    <linus> Yes. There's another nice side to this (and yes, it was
 301        designed that way ;):
 302        - the reason we generate deltas against the larger object is
 303          actually a big space saver too!
 304
 305    <njs`> Hmm, but your last comment (if "we haven't yet written out
 306        the object it is a delta against, we write out the base object
 307        first"), seems like it would make these facts mostly
 308        irrelevant because even if in practice you would not have to
 309        wander around much, in fact you just brute-force say that in
 310        the cases where you might have to wander, don't do that :-)
 311
 312    <linus> Yes and no. Notice the rule: we only write out the base
 313        object first if the delta against it was more recent.  That
 314        means that you can actually have deltas that refer to a base
 315        object that is _not_ close to the delta object, but that only
 316        happens when the delta is needed to generate an _old_ object.
 317
 318    <linus> See?
 319
 320Yeah, no.  I missed that on the first two or three readings myself.
 321
 322    <linus> This keeps the front of the pack dense. The front of the
 323        pack never contains data that isn't relevant to a "recent"
 324        object.  The size optimization comes from our use of xdelta
 325        (but is true for many other delta algorithms): removing data
 326        is cheaper (in size) than adding data.
 327
 328        When you remove data, you only need to say "copy bytes n--m".
 329        In contrast, in a delta that _adds_ data, you have to say "add
 330        these bytes: 'actual data goes here'"
 331
 332    *** njs` has quit: Read error: 104 (Connection reset by peer)
 333
 334    <linus> Uhhuh. I hope I didn't blow njs` mind.
 335
 336    *** njs` has joined channel #git
 337
 338    <pasky> :)
 339
 340The silent observers are amused.  Of course.
 341
 342And as if njs` was expected to be omniscient:
 343
 344    <linus> njs - did you miss anything?
 345
 346OK, I'll spell it out.  That's Geek Humor.  If njs` was not actually
 347connected for a little bit there, how would he know if missed anything
 348while he was disconnected?  He's a benevolent dictator with a sense of
 349humor!  Well noted!
 350
 351    <njs`> Stupid router.  Or gremlins, or whatever.
 352
 353It's a cheap shot at Cisco.  Take 'em when you can.
 354
 355    <njs`> Yes and no. Notice the rule: we only write out the base
 356        object first if the delta against it was more recent.
 357
 358        I'm getting lost in all these orders, let me re-read :-)
 359        So the write-out order is from most recent to least recent?
 360        (Conceivably it could be the opposite way too, I'm not sure if
 361        we've said) though my connection back at home is logging, so I
 362        can just read what you said there :-)
 363
 364And for those of you paying attention, the Omniscient Trick has just
 365been detailed!
 366
 367    <linus> Yes, we always write out most recent first
 368
 369    <njs`> And, yeah, I got the part about deeper-in-history stuff
 370        having worse IO characteristics, one sort of doesn't care.
 371
 372    <linus> With the caveat that if the "most recent" needs an older
 373        object to delta against (hey, shrinking sometimes does
 374        happen), we write out the old object with the delta.
 375
 376    <njs`> (if only it happened more...)
 377
 378    <linus> Anyway, the pack-file could easily be denser still, but
 379        because it's used both for streaming (the Git protocol) and
 380        for on-disk, it has a few pessimizations.
 381
 382Actually, it is a made-up word. But it is a made-up word being
 383used as setup for a later optimization, which is a real word:
 384
 385    <linus> In particular, while the pack-file is then compressed,
 386        it's compressed just one object at a time, so the actual
 387        compression factor is less than it could be in theory. But it
 388        means that it's all nice random-access with a simple index to
 389        do "object name->location in packfile" translation.
 390
 391    <njs`> I'm assuming the real win for delta-ing large->small is
 392        more homogeneous statistics for gzip to run over?
 393
 394        (You have to put the bytes in one place or another, but
 395        putting them in a larger blob wins on compression)
 396
 397        Actually, what is the compression strategy -- each delta
 398        individually gzipped, the whole file gzipped, somewhere in
 399        between, no compression at all, ....?
 400
 401        Right.
 402
 403Reality IRC sets in.  For example:
 404
 405    <pasky> I'll read the rest in the morning, I really have to go
 406        sleep or there's no hope whatsoever for me at the today's
 407        exam... g'nite all.
 408
 409Heh.
 410
 411    <linus> pasky: g'nite
 412
 413    <njs`> pasky: 'luck
 414
 415    <linus> Right: large->small matters exactly because of compression
 416        behaviour. If it was non-compressed, it probably wouldn't make
 417        any difference.
 418
 419    <njs`> yeah
 420
 421    <linus> Anyway: I'm not even trying to claim that the pack-files
 422        are perfect, but they do tend to have a nice balance of
 423        density vs ease-of use.
 424
 425Gasp!  OK, saved.  That's a fair Engineering trade off.  Close call!
 426In fact, Linus reflects on some Basic Engineering Fundamentals,
 427design options, etc.
 428
 429    <linus> More importantly, they allow Git to still _conceptually_
 430        never deal with deltas at all, and be a "whole object" store.
 431
 432        Which has some problems (we discussed bad huge-file
 433        behaviour on the Git lists the other day), but it does mean
 434        that the basic Git concepts are really really simple and
 435        straightforward.
 436
 437        It's all been quite stable.
 438
 439        Which I think is very much a result of having very simple
 440        basic ideas, so that there's never any confusion about what's
 441        going on.
 442
 443        Bugs happen, but they are "simple" bugs. And bugs that
 444        actually get some object store detail wrong are almost always
 445        so obvious that they never go anywhere.
 446
 447    <njs`> Yeah.
 448
 449Nuff said.
 450
 451    <linus> Anyway.  I'm off for bed. It's not 6AM here, but I've got
 452         three kids, and have to get up early in the morning to send
 453         them off. I need my beauty sleep.
 454
 455    <njs`> :-)
 456
 457    <njs`> appreciate the infodump, I really was failing to find the
 458        details on Git packs :-)
 459
 460And now you know the rest of the story.