Git for CVS users
=================
v0.99.5, Aug 2005

Ok, so you're a CVS user. That's ok, it's a treatable condition, and the
first step to recovery is admitting you have a problem. The fact that
you are reading this file means that you may be well on that path
already.

The thing about CVS is that it absolutely sucks as a source control
manager, and you'll thus be happy with almost anything else. Git,
however, may be a bit 'too' different (read: "good") for your taste, and
does a lot of things differently.

One particular suckage of CVS is very hard to work around: CVS is
basically a tool for tracking 'file' history, while git is a tool for
tracking 'project' history.  This sometimes causes problems if you are
used to doing very strange things in CVS, in particular if you're doing
things like making branches of just a subset of the project.  Git can't
track that, since git never tracks things on the level of an individual
file, only on the whole project level.

The good news is that most people don't do that, and in fact most sane
people think it's a bug in CVS that makes it tag (and check in changes)
one file at a time.  So most projects you'll ever see will use CVS
'as if' it was sane.  In which case you'll find it very easy indeed to
move over to Git.

First off: this is not a git tutorial. See
link:tutorial.html[Documentation/tutorial.txt] for how git
actually works. This is more of a random collection of gotchas
and notes on converting from CVS to git.

Second: CVS has the notion of a "repository" as opposed to the thing
that you're actually working in (your working directory, or your
"checked out tree").  Git does not have that notion at all, and all git
working directories 'are' the repositories.  However, you can easily
emulate the CVS model by having one special "global repository", which
people can synchronize with.  See details later, but in the meantime
just keep in mind that with git, every checked out working tree will
have a full revision control history of its own.


Importing a CVS archive
-----------------------

Ok, you have an old project, and you want to at least give git a chance
to see how it performs. The first thing you want to do (after you've
gone through the git tutorial, and generally familiarized yourself with
how to commit stuff etc. in git) is to create a git'ified version of your
CVS archive.

Happily, that's very easy indeed. Git will do it for you, although git
will need the help of a program called "cvsps":

        http://www.cobite.com/cvsps/

which is not actually related to git at all, but which makes CVS usage
look almost sane (i.e. you almost certainly want to have it even if you
decide to stay with CVS). However, git will want 'at least' version 2.1
of cvsps (available at the address above), and in fact will currently
refuse to work with anything else.

Once you've gotten (and installed) cvsps, you may or may not want to get
any more familiar with it, but make sure it is in your path. After that,
the magic command line is

        git cvsimport -v -d <cvsroot> -C <destination> <module>

which will do exactly what you'd think it does: it will create a git
archive of the named CVS module. The new archive will be created in the
subdirectory named <destination>, which is itself created if it doesn't
exist. The default is the current directory.

Converting a large archive can take some time, since the conversion
involves checking out every revision of every file from CVS.  The
conversion script is also reasonably chatty unless you omit the '-v'
option.  On some not very scientific tests it averaged about twenty
revisions per second, so a medium-sized project should not take more
than a couple of minutes.  For larger projects or remote repositories,
the process may take longer.

After the (initial) import is done, the CVS archive's current head
revision will be checked out -- thus, you can start adding your own
changes right away.

The import is incremental, i.e. if you call it again next month it'll
fetch any CVS updates that have been happening in the meantime. The
cut-off is date-based, so don't change the branches that were imported
from CVS.

You can merge those updates (or, in fact, a different CVS branch) into
your main branch:

        git resolve HEAD origin "merge with current CVS HEAD"

The HEAD revision from CVS is named "origin", not "HEAD", because git
already uses "HEAD". (If you don't like 'origin', use cvsimport's
'-o' option to change it.)


Emulating CVS behaviour
-----------------------


So, by now you are convinced you absolutely want to work with git, but
at the same time you absolutely have to have a central repository.
Step back and think again. Okay, you still need a single central
repository? There are several ways to go about that:

1. Designate a person responsible to pull all branches. Make the
repository of this person public, and make every team member
pull regularly from it.

2. Set up a public repository with read/write access for every team
member. Use "git pull/push" as you used "cvs update/commit".  Be
sure that your repository is up to date before pushing, just
like you used to do with "cvs commit"; your push will fail if
what you are pushing is not up to date.

3. Make the repository of every team member public. It is the
responsibility of each single member to pull from every other
team member.
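To see why option 2 insists on being up to date before pushing, it can
help to picture the check as an ancestry test: the shared repository
only accepts a new head that already contains its own current head.
Here is a minimal Python sketch of that idea.  It is not git's actual
code; the commit names and the `PARENTS` table are made up purely for
illustration.

```python
def is_ancestor(parents, ancestor, commit):
    """Walk parent links from `commit`; True if `ancestor` is reached (or equal)."""
    pending = [commit]
    seen = set()
    while pending:
        current = pending.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        pending.extend(parents.get(current, []))
    return False


def push_allowed(parents, shared_head, local_head):
    """A push is "up to date" when the shared head is part of our history."""
    return is_ancestor(parents, shared_head, local_head)


# Hypothetical history: A <- B was shared.  A teammate pushed C on top of B,
# you committed D on top of B, and M is what you get after pulling
# (a merge of D and C).
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["B"],
    "M": ["D", "C"],
}

print(push_allowed(PARENTS, "C", "D"))  # D does not contain C
print(push_allowed(PARENTS, "C", "M"))  # M contains C
```

In this toy history, pushing D is refused because it does not contain
the teammate's commit C; once you have pulled (creating the merge M),
the push goes through.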


CVS annotate
------------

So, something has gone wrong, and you don't know whom to blame, and
you're an ex-CVS user and used to running "cvs annotate" to see who
caused the breakage. You're looking for "git annotate", and it's just
claiming not to find such a script. You're annoyed.

Yes, that's right.  Core git doesn't do "annotate", although it's
technically possible, and there are at least two specialized scripts out
there that can be used to get equivalent information (see the git
mailing list archives for details).

Git has a couple of alternatives, though, that you may find sufficient
or even superior depending on your use.  One is called "git-whatchanged"
(for obvious reasons) and the other one is called "pickaxe" ("a tool for
the software archaeologist").

The "git-whatchanged" script is a truly trivial script that can give you
a good overview of what has changed in a file or a directory (or an
arbitrary list of files or directories).  The "pickaxe" support is an
additional layer that can be used to further specify exactly what you're
looking for, if you already know the specific area that changed.

Let's step back a bit and think about the reason why you would
want to do "cvs annotate a-file.c" to begin with.

You would use "cvs annotate" on a file when you have trouble
with a function (or even a single "if" statement in a function)
that happens to be defined in the file, which does not do what
you want it to do.  And you would want to find out why it was
written that way, because you are about to modify it to suit
your needs, and at the same time you do not want to break its
current callers.  For that, you are trying to find out why the
original author did things that way in the original context.

Many times, it may be enough to see the commit log messages of
commits that touch the file in question, possibly along with the
patches themselves, like this:

        $ git-whatchanged -p a-file.c

This will show log messages and patches for each commit that
touches a-file.c.

This, however, may not be very useful when this file has many
modifications that are not related to the piece of code you are
interested in.  You would see many log messages and patches that
do not have anything to do with the piece of code you are
interested in.  As an example, assuming that the HEAD version has
this piece of code that you are interested in:

        if (frotz) {
                nitfol();
        }

you would use git-rev-list and git-diff-tree like this:

        $ git-rev-list HEAD |
          git-diff-tree --stdin -v -p -S'if (frotz) {
                nitfol();
        }'

We have already talked about the "\--stdin" form of the git-diff-tree
command that reads the list of commits and compares each commit
with its parents.  The git-whatchanged command internally runs
the equivalent of the above command, and can be used like this:

        $ git-whatchanged -p -S'if (frotz) {
                nitfol();
        }'

When the -S option is used, the git-diff-tree command outputs
differences between two commits only if one tree has the
specified string in a file and the corresponding file in the
other tree does not.  The above example looks for a commit that
has the "if" statement in it in a file, but its parent commit
does not have it in the same shape in the corresponding file (or
the other way around, where the parent has it and the commit
does not), and the differences between them are shown, along
with the commit message (thanks to the -v flag).  It does not
show anything for commits that do not touch this "if" statement.
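If the -S rule sounds abstract, here is a small Python sketch of it as
worded above: a file pair counts as a hit only when exactly one side
contains the string.  This is an illustration of the rule, not the
code git runs; `pickaxe_hit` and the sample file contents are names
and data invented for this example.

```python
def pickaxe_hit(parent_text, commit_text, needle):
    """True when exactly one of the two versions contains `needle`.

    A version of None stands for "the file does not exist on that side".
    """
    in_parent = parent_text is not None and needle in parent_text
    in_commit = commit_text is not None and needle in commit_text
    return in_parent != in_commit


# The string handed to -S, newlines and indentation included.
NEEDLE = "if (frotz) {\n\t\tnitfol();\n\t}"

# Hypothetical contents of a-file.c in a parent commit and in its child.
BEFORE = "void f(void)\n{\n\tnitfol();\n}\n"
AFTER = "void f(void)\n{\n\tif (frotz) {\n\t\tnitfol();\n\t}\n}\n"

# The commit that introduced the statement is a hit.
print(pickaxe_hit(BEFORE, AFTER, NEEDLE))
# A later commit that only appends an unrelated line is not.
print(pickaxe_hit(AFTER, AFTER + "/* unrelated */\n", NEEDLE))
```

Note that the match is on the exact string, which is why the text
above says "in the same shape": reindenting the statement would also
count as a change.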

Also, in the original context, the same statement might have
appeared at first in a different file and later the file was
renamed to "a-file.c".  CVS annotate would not help you to go
back across such a rename, but git would still help you in such
a situation.  For that, you can give the -C flag to
git-diff-tree, like this:

        $ git-whatchanged -p -C -S'if (frotz) {
                nitfol();
        }'

When the -C flag is used, file renames and copies are followed.
So if the "if" statement in question happens to be in "a-file.c"
in the current HEAD commit, even if the file was originally
called "o-file.c" and then renamed in an earlier commit, or if
the file was created by copying an existing "o-file.c" in an
earlier commit, you will not lose track.  If the "if" statement
did not change across such a rename or copy, then the commit that
does rename or copy would not show in the output, and if the
"if" statement was modified while the file was still called
"o-file.c", it would find the commit that changed the statement
when it was in "o-file.c".
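One way to picture what following a rename buys you: without it, a
rename looks like one file vanishing and another appearing, and both
halves trip the -S rule; with it, the old and new names are compared
as one pair, and an unchanged statement stays quiet.  The Python
sketch below models just that pairing.  It is a toy under the
assumptions stated in the comments, not git's rename detection, and
all names and file contents in it are hypothetical.

```python
def pickaxe_hit(old_text, new_text, needle):
    """True when exactly one side contains `needle` (None = file absent)."""
    in_old = old_text is not None and needle in old_text
    in_new = new_text is not None and needle in new_text
    return in_old != in_new


def commit_shown(file_pairs, needle):
    """A commit is shown when any of its (old, new) file pairs is a hit."""
    return any(pickaxe_hit(old, new, needle) for old, new in file_pairs)


NEEDLE = "if (frotz) {"
OLD_SHAPE = "if (frotz)\n\tnitfol();\n"       # before braces were added
NEW_SHAPE = "if (frotz) {\n\tnitfol();\n}\n"  # the shape found in HEAD

# Commit 1: the statement is reshaped while the file is still o-file.c.
edit_commit = [(OLD_SHAPE, NEW_SHAPE)]

# Commit 2: o-file.c becomes a-file.c with identical contents.
# Seen without rename following: one deletion plus one creation.
rename_unpaired = [(NEW_SHAPE, None), (None, NEW_SHAPE)]
# Seen with rename following: o-file.c is compared against a-file.c.
rename_paired = [(NEW_SHAPE, NEW_SHAPE)]

print(commit_shown(edit_commit, NEEDLE))      # the real change is found
print(commit_shown(rename_unpaired, NEEDLE))  # the rename shows up as noise
print(commit_shown(rename_paired, NEEDLE))    # the pure rename stays silent
```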

[ BTW, the current version of "git-diff-tree -C" is not eager
  enough to find copies, and it will miss the fact that a-file.c
  was created by copying o-file.c unless o-file.c was somehow
  changed in the same commit.]

You can use the --pickaxe-all flag in addition to the -S flag.
This causes the differences from all the files contained in
those two commits to be shown, not just the differences between
the files that contain this changed "if" statement:

        $ git-whatchanged -p -C -S'if (frotz) {
                nitfol();
        }' --pickaxe-all
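The difference the flag makes can be sketched in a few lines of
Python: without it only the file pairs that hit are shown, and with
it a single hit pulls in every file changed by that commit.  As
before, this is a sketch of the behaviour described here rather than
git's implementation; `files_to_show`, `CHANGES` and the file
contents are invented for the example.

```python
def pickaxe_hit(old_text, new_text, needle):
    """True when exactly one side contains `needle` (None = file absent)."""
    in_old = old_text is not None and needle in old_text
    in_new = new_text is not None and needle in new_text
    return in_old != in_new


def files_to_show(changes, needle, pickaxe_all=False):
    """Pick which changed files of one commit get their diff shown.

    `changes` maps each path changed by the commit to (old_text, new_text).
    """
    hits = sorted(
        path for path, (old, new) in changes.items()
        if pickaxe_hit(old, new, needle)
    )
    if not hits:
        return []  # the commit is not shown at all
    return sorted(changes) if pickaxe_all else hits


# Hypothetical commit touching two files; only a-file.c gains the statement.
CHANGES = {
    "a-file.c": ("nitfol();\n", "if (frotz) {\n\tnitfol();\n}\n"),
    "Makefile": ("CFLAGS = -O\n", "CFLAGS = -O2\n"),
}

print(files_to_show(CHANGES, "if (frotz) {"))
print(files_to_show(CHANGES, "if (frotz) {", pickaxe_all=True))
```

The second call lists the Makefile as well, which is handy when you
want to see the whole change that came along with the statement.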

[ Side note.  This option is called "--pickaxe-all" because the -S
  option is internally called "pickaxe", a tool for software
  archaeologists.]