Apr 27 2008

In my pursuit to understand Git, it’s been helpful for me to understand it from the bottom up — rather than look at it only in terms of its high-level commands. And since Git is so beautifully simple when viewed this way, I thought others might be interested to read what I’ve found, and perhaps avoid the pain I went through finding it.

The following article offers what I’ve learned on this journey so far. I hope it can help others to comprehend this wonderful system, and discover some of the joy I’ve experienced in the past few weeks. NOTE: After receiving more than fifty corrections by e-mail from very helpful readers, I’ve updated the PDF to reflect their input. The date at the front should read “December 2009” if you have the latest version.

Here is a summary from the table of contents:

  • Introduction
  • Repository: Directory content tracking
  • Introducing the blob
  • Blobs are stored in trees
  • How trees are made
  • The beauty of commits
  • A commit by any other name…
  • Branching and the power of rebase
  • Index Cache: Meet the middle man
  • Taking the index cache farther
  • To reset, or not to reset
  • Last links in the chain: Stashing and the reflog
 Posted by at 6:32 pm

  58 Responses to “Git from the bottom up”

  1. Your link to the pdf file is broken when accessed via feedreader.



  2. Please check to see if the link is fixed now.

    • Hi!
      I was looking for some information about git and I’ve found your article.
      It’s really good.

      Also, the LaTeX template is wonderful and easy to read. Would it be possible to send me your LaTeX template?
      Thank you

  3. awesome, I look forward to reading it :)

  4. I read the PDF and it was helpful in solidifying my mental image of git and introducing me to other tools. Thank you.

    I think an interesting explanation to add would be blob injection, that is, two or more HEADs that have no shared history between them. I do that when I want to add meta-info about a cloned upstream repository.

    It doesn’t help me understand whether git knows about “moving a function from one file to another”, or where (or if) compression and differencing come into play. Perhaps these are details not necessary for this level of understanding.

    Your git-stash usage as a backup system sounds novel, but I suppose it doesn’t track files that are not registered (untracked, like new files).

    Perhaps nitpicks: Is there an extraneous quote in the title or is this some special font for “t”? “Git’ …”? On page 25, “they’re” is broken across lines incorrectly.

  5. John,

    Thanks for this resource: as a newcomer to Git, the PDF was a very comprehensible and eye opening read.

    I spotted one very minor grammar mistake: on p24 it should be “This approach has two distinct advantages” rather than “This approach two distinct advantages”.

    I was a bit confused by your use of HEAD@{1} notation in the section on hard resets until I came to the section on stashing. Perhaps the material on recovering from an inadvertent hard reset could be moved into the stashing section?

    Thanks again!

  6. @piyo I’m not sure I follow what you mean about “blob injection”, could you share an example?

    Also, you’re right, git-stash does not track unregistered files.

    The curly “t” is a “stylistic alternate” within the OpenType font Garamond Premier Pro.

    I’ve added yours and Max’s grammatical corrections. Thank you!

    @Max I added a short note on the usage of HEAD@{1}, that it will be explained in the next section. I just thought it important to point out there so that people would connect it with the idea of restoring from an accidental reset.
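
The point above, that git-stash does not track unregistered files, can be checked in a throwaway repository. This is only a minimal sketch: the temporary directory, the file names, and the identity passed to `git config` are assumptions made up for illustration, and the `--include-untracked` option is only available in Git versions that support it.

```shell
#!/bin/sh
# Sketch only: repository location, file names, and identity are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

echo "tracked" > tracked.txt
git add tracked.txt
git commit -q -m "initial"

echo "modified" >> tracked.txt   # change to a file Git already tracks
echo "new" > untracked.txt       # file Git has never been told about

# A plain stash saves the tracked change and leaves the untracked file alone.
git stash -q
tracked_after=$(cat tracked.txt)
if [ -e untracked.txt ]; then untracked_left=yes; else untracked_left=no; fi

# Asking explicitly for untracked files moves the new file into the stash too.
git stash -q --include-untracked
if [ -e untracked.txt ]; then untracked_left_u=yes; else untracked_left_u=no; fi
```

After the first stash, `tracked.txt` is back to its committed contents while `untracked.txt` is still in the working tree; only the second stash removes it.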

  7. Hey the pdf _looks_ very nice, how did you produce it?

    Do you know if there is something similar to this for mercurial?

    thanks in advance

  8. @MB The PDF was written using the word processor Mellel, the font Garamond Premier Pro from Adobe, and the drawing program OmniGraffle Pro for the diagrams.

    I know of nothing similar for Mercurial, but I haven’t really looked either.

  9. Regarding what I call “blob injection”, it is a way of adding blobs that are unrelated to the current repository’s history. In other words, adding another track of blobs with no relation between the existing blobs.

    Actually, you’ve already touched on that, with your CVS and Subversion import example in “Diving into Git”. However, you speedily sew these two tracks together with git-rebase. I, however, keep the tracks separate, because I use it for separate data, like where I got this repository from and how it should be checked out. This is a limited technique probably only suitable for metadata.

    An example:

    $ git clone git://repo.or.cz/git.git
    # or any other repo
    $ mkdir checkout_info && cd checkout_info
    $ git init
    $ echo "This repository is from git://repo.or.cz/git.git" > .git_checkout_info
    $ git add .git_checkout_info && git commit -m "Checkout info"
    $ git tag checkout_info
    $ cd ../git
    $ git remote add checkout_info ../checkout_info/.git
    $ git fetch checkout_info # this is where we insert unrelated info.
    $ rm -rf ../checkout_info # no longer needed
    $ gitk --all -d # notice there is nothing connecting the two tracks together.
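
The claim that nothing connects the two tracks can also be checked without gitk. The following is a minimal sketch, not the commenter's own procedure: it builds two small local repositories in a temporary directory (the directory layout, the `make_repo` helper, and the identity are hypothetical), fetches one into the other, and then asks `git merge-base` for a common ancestor, which exits non-zero when there is none.

```shell
#!/bin/sh
# Sketch only: repository names, the make_repo helper, and the identity
# below are assumptions for illustration.
set -e
work=$(mktemp -d)

make_repo() {
    # $1: directory name, $2: file to create, $3: file contents
    git init -q "$work/$1"
    cd "$work/$1"
    git config user.name "Example"
    git config user.email "example@example.com"
    echo "$3" > "$2"
    git add "$2"
    git commit -q -m "Add $2"
}

make_repo project README "project history"
make_repo checkout_info .git_checkout_info "metadata history"
info_commit=$(git rev-parse HEAD)   # tip of the unrelated metadata track

# Bring the unrelated track into the project repository.
cd "$work/project"
git remote add checkout_info "$work/checkout_info"
git fetch -q checkout_info

# merge-base fails when the two commits share no ancestor.
if git merge-base HEAD "$info_commit" >/dev/null; then
    related=yes
else
    related=no
fi
echo "shared history: $related"
```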

  10. Ah, I actually use this feature too, to keep “side-band” data which is related to the repository, but doesn’t belong in any regular working tree.

    I could have added a note about it in the PDF, but I think it might have added a bit more complexity than necessary. This is the kind of thing for you to blog about so I can link there! :)

  11. Would it be possible to see the article in a format better suited to on-screen reading, like HTML or plain text? Maybe just the words without the pictures?

  12. I have no easy way to convert it to HTML at present.

  13. Excellent document even for people accustomed to Git, in order to have a deeper understanding of the beast.

    Keep up the good work!

  14. Thank you very much for writing this. It really increased my understanding of git.

  15. Thank you for this paper, it was very useful to me!

    I was wondering if you could license your work under a cc license (or something similar) so I could translate and share your work in italian language.

  16. Masci, consider it done. Just e-mail me so that I’m certain to get done what you need for the translation to happen.

  17. Very well written paper, thanks a lot. I was very glad to see that a lot of the needs of developers are taken into account by Git.

    The funny thing is that I have written a pattern language about CM, and most (though not all) of its techniques are covered by Git. If you would like to have a look at my paper, just drop me an e-mail.

    Concerning your expression “the index”, I would instead like to call this a “temporary repository”, which might be a better fit for its intention. But anyway, now it is too late.

    However, from my point of view “staging area” might not really be covered by “the index”. I see a staging area as developed with the use of a multi-dimensional file system (UnionFS or aufs), because then the time required to share a change with other developers is

    1) independent of the size of the code

    2) independent of the number of developers

    A significant advantage for high-speed development; I have used something like it for over a century.

    best regards


  18. Thank you very much for the work.

    Nice and very well thought. :-)

  19. Thanks John

    Very helpful!

  20. Your bottom-up tutorial helped me grok git’s guts in an easy (and entertaining) way.

    Thank you a lot for taking the time to put together this quality document.

  21. Thank you so much! This document was what I needed to actually get git. I knew it was good but without the understanding of how it works, it was a real confusion whenever anything went a bit wrong.

    Thank you.

  22. Great exposition; many thanks.

    I do have one question, though. In “Branching and the power of rebase”, you note that “the “base” of the Z branch is A, while the base of the D branch continues back to some unlabeled commit in the past”. I’m probably being dense, but I don’t see how/why the “base”(s) of these two branches should be different: Why does the Z branch ‘stop’ at A when the D branch doesn’t?

  23. David, technically speaking you are right, they can both be viewed as independent branches with the same ancestor commit. I suppose I meant that Z stops at A only for the sake of considering Z a “branch off of A”. I’ll see about rewording it.

  24. Excellent writing, thanks!

    The url for the git-core tutorial has changed slightly – it is now at http://www.kernel.org/pub/software/scm/git/docs/gitcore-tutorial.html

  25. Thanks for the update. I’ll include this among the next round of edits.

  26. Resources

    Git Cheatsheet

  27. Thank you very much for this; as a bottom-up guy it made me understand git better than dozens of other tutorials. A small correction:

    > This means that when you create a tree from your index and store it under a commit (all of which is done by commit), you are also, inadvertently adding that commit to the reflog, which can be viewed using the following command:

    You got an extra comma after “also”.

  28. Hi John,

    a comment to the paragraph about reset:

    $ git reset HEAD foo.c

    actually isn’t a mixed reset. If you specify a path, HEAD is never touched.


    $ git reset HEAD~3

    makes a (mixed) reset to HEAD~3, so your HEAD changes.

    $ git reset HEAD~3 foo.c

    only changes the index entry for foo.c though.
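
The difference described above can be observed in a scratch repository. This is a minimal sketch under stated assumptions: the temporary repository, the four generated commits, and the identity are made up for illustration; only `foo.c` and `HEAD~3` come from the comment.

```shell
#!/bin/sh
# Sketch only: the repository, commit contents, and identity are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

# Build four commits so that HEAD~3 exists.
for i in 1 2 3 4; do
    echo "version $i" > foo.c
    git add foo.c
    git commit -q -m "commit $i"
done

head_before=$(git rev-parse HEAD)

# With a path: only the index entry for foo.c is rewritten; HEAD stays put.
git reset -q HEAD~3 -- foo.c
head_after_path=$(git rev-parse HEAD)
index_blob=$(git rev-parse :foo.c)       # blob currently staged for foo.c
old_blob=$(git rev-parse HEAD~3:foo.c)   # blob recorded three commits back

# Without a path: a mixed reset, so HEAD itself moves to HEAD~3.
expected_head=$(git rev-parse HEAD~3)
git reset -q HEAD~3
head_after_mixed=$(git rev-parse HEAD)
```

Comparing `head_before` with `head_after_path` shows that the path form leaves HEAD alone, while `head_after_mixed` matches the commit that `HEAD~3` named before the second reset.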

  29. Hello John

    Above, you say “The date at the front should read “Fri, 2 May 2008” if you have the latest version.”

    I took this to mean that the file http://ftp.newartisans.com/pub/git.from.bottom.up.pdf
    would carry that date.
    However, the file downloaded today contains “Thu, 11 Sep 2008” just after the title.

    I am confused as to which or where is the latest version of your document.

  30. Thanks for letting me know, I’ll try to rectify the situation this week, as well as incorporating several recent corrections that were sent in.

  31. this is an excellent write up; I’ve been reading much into git and i found this to be conceptually useful and pleasing to the visual whole of my brain.

    i am in the process of converting my company to git, and i intend on using git hooks to sync merges on specific branches to our dev server to ALL distributed production servers (web servers) transparently, verify syntax in certain files, and update our custom internal ticketing system. we will be able to develop, sync to staging, and sync to production without ever leaving the dev server.

    thank you for your investment into this document; i expect many others will find it as useful as i have.

  32. You deserve a lot of praise for this article. It’s well written and looks great. Far too few authors care about aesthetics although it makes a text so much easier to read and follow.

    I will give this to my interested co-workers.

    Thank you!

  33. John, thanks for the work you put into this. Since I am coming from SVN, I’m really struggling to “git” git, and this approach is usually what works better for me than tutorials and the like.

    One issue is confusing me. You defined several terms at the beginning, but did not define “commit”. Coming from SVN, there appears to be a subtle but important difference. In particular, this, from page six:

    “This first commit added my greeting file to the repository. It contains one Git

    is ambiguous. On first reading, I took “it” to refer to the repository, but subsequent statements are inconsistent with that, and seem to point toward “it” referring to the commit. Which further implies that a commit is more a piece of data than an action, or simply a change in the state of the single repository tree as it is in SVN.

    I’m moving on with a big asterisk next to my tentative understanding of what a commit actually is in the system.

  34. Great article, indeed.

    @Kyle: I’m a git newbie and SVN convert too, but looking at http://eagain.net/articles/git-for-computer-scientists/ I guess that each commit has one tree.

  35. Hi John,
    I just started to read your GFTBU, it’s so very well written and has such a good intro, really is the right way things should be taught. Brilliant. Just a feedback… thanks.-

  36. This is wonderful, thank you so much for writing it!!!

  37. [...] artisans’ Git from the Bottom up Explains the hashes, trees and blobs that Git is built [...]

  38. I read the PDF version twice, and still can learn something new. This article introduces Git from a special and interesting perspective, which allows users (who are of course programmers) to quickly get the core of the Git philosophy, rather than getting confused by tons of Git commands.

    Brilliant! Thank you for sharing! I cannot wait to recommend this article to my friends and colleagues again and again.

  39. John,

    I have been reading your article on the bus into town each day. I just want to say a big thanks for taking the time to share your knowledge.

    You’re a great writer, and you have made git much less scary for me !

    In fact, I shall be shutting down my svn server and have started moving my artifacts into github.

    I like your tips such as the one at the end on using stash regularly.

    Keep up the good work. I shall keep an eye on your work for other gems to pop up.


  40. Thank you for writing this excellent paper. For me as a developer it was truly wonderful to find a Git article which describes the theory behind Git using simple illustrations and everyday examples. The theory behind trees, commits, blobs and other Git concepts is both important and non-trivial but this paper manages to explain them in a calm and thorough manner.

  41. It’s great to have a concise overview of Git, which we’ve been using for a couple of years now. Sometimes the high-level commands make you lose focus on the basics, which this PDF shows in a very clear way.

  42. I’m trying to translate your article but I can’t understand some sentences.
    So I have e-mailed you about the questions I encountered.
    Thanks if you can reply.

  43. Thank you for the great work! It would be even better if mobile formats of your work, such as EPUB or MOBI, were made available.

    • I don’t know of a really good way to turn PDFs into a document that’s great for mobile readers, and since most mobile readers can view PDFs one way or another…

  44. thanks a lot for this very good paper. do you know if there is a version of the article translated into french? if not, I would be interested in doing it, if you agree to let me.

  45. [...] 4. Git from the bottom up [...]

  46. Hi John,
    do you think it would be possible for me to translate your nice article into the Czech language?
    If so, I would ask you for some source files needed.
    This translation would be a nice complementary text to TortoiseHg tutorial already translated by me
    ( http://tortoisehg.bitbucket.org/manual/2.4-cs/index.html ).
    Regards, Tovim

  47. [...] from the bottom up” if free online e-book : http://newartisans.com/2008/04/git-from-the-bottom-up/ which is also available in PDF form for download and offline [...]

  48. Hello,

    It was a great read!
    I have translated the PDF version of “Git from the bottom up” into Korean and posted it on my site


    If you could put the link somewhere in your post, it would be a good help for Korean developers who wish to read your post without a language barrier.

    Thank you so much.

  49. [...] control with Git. If you don’t know what Git or distribute version control is, I suggest to read this article. Or you can try it [...]
