This is an exploration of what goes on when I run some basic git commands. We start out by creating a new repository. .git/objects is the directory where git stores all its objects, and it contains no files initially.
$ mkdir myrepo
$ cd myrepo/
$ git init
Initialized empty Git repository in /Users/andersjanmyr/tmp/myrepo/.git/
$ find .git/objects -type f # find all files in .git/objects
$
When a file is added to git, it gets stored in the .git/objects directory under the name of its hash. The first two characters of the hash are used as the name of a subdirectory and the rest become the file name. Worth noting is that the hash is computed from the content alone, so were you to run the same commands on your computer, you should get an identical hash for this file.
$ echo "A tapir has 14 toes" > tapir.txt
$ git add tapir.txt
$ find .git/objects -type f
.git/objects/12/a93608760777f50380a94b52e1b54ec69f4743
$ git hash-object tapir.txt
12a93608760777f50380a94b52e1b54ec69f4743
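To illustrate where that name comes from: git hashes a small header (the word blob, a space, the content length in bytes, and a NUL byte) followed by the file contents. The sketch below recomputes the hash in Python without calling git. The function names git_blob_hash and object_path are my own, not part of git, and the sketch assumes a repository that uses the default SHA-1 object format.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Return the SHA-1 that git assigns to a blob with this content."""
    # Header is "blob <size in bytes>\0", then the raw content follows.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def object_path(object_hash: str) -> str:
    """Map a hash to its loose-object path: 2-char directory, 38-char file."""
    return f".git/objects/{object_hash[:2]}/{object_hash[2:]}"


# echo appends a newline, so the file content is 20 bytes long.
content = b"A tapir has 14 toes\n"
blob_hash = git_blob_hash(content)
print(blob_hash)
print(object_path(blob_hash))
```

The first printed line should agree with what git hash-object tapir.txt reports, and the second with the path that find lists.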
If you try to print the contents of the file with cat, you are out of luck, since it is stored in a compressed binary format. Use the git command git cat-file instead. The file above is a blob, and its contents are what you would expect.
$ cat .git/objects/12/a93608760777f50380a94b52e1b54ec69f4743
xK??OR02`pT(I,?,R?H,V04Q(?O-?zi$
$
$ git cat-file -t 12a93608760777f50380a94b52e1b54ec69f4743
blob
$ git cat-file blob 12a936 # Using the first part of the hash is enough
A tapir has 14 toes
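The gibberish that cat prints is zlib-compressed data. As a rough sketch of what git cat-file has to do for a loose object, the Python below decompresses the bytes and splits off the header. parse_loose_object is a name I made up for illustration, and the compressed bytes are built in memory as a stand-in for the real file so that the example runs without a repository.

```python
import zlib
from typing import Tuple


def parse_loose_object(compressed: bytes) -> Tuple[str, int, bytes]:
    """Decompress a loose object and split it into (type, size, body)."""
    raw = zlib.decompress(compressed)
    # The header ("<type> <size>") ends at the first NUL byte.
    header, _, body = raw.partition(b"\0")
    object_type, size = header.decode("ascii").split(" ")
    return object_type, int(size), body


# Stand-in for the bytes stored under .git/objects/12/.
content = b"A tapir has 14 toes\n"
compressed = zlib.compress(b"blob %d\0" % len(content) + content)

object_type, size, body = parse_loose_object(compressed)
print(object_type, size)      # blob 20
print(body.decode(), end="")  # A tapir has 14 toes
```

To try it on a real repository, read the object file in binary mode and pass its bytes to parse_loose_object. Objects that have been moved into pack files are stored differently and are not covered by this sketch.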
Even though the file is in the .git/objects directory, it is not committed yet, and it cannot be seen by high-level git commands such as git log. git status, on the other hand, will show that the file is staged, that is, in the index.
$ git log
fatal: bad default revision 'HEAD'
$ git status
# On branch master
#
# Initial commit
#
# Changes to be committed:
#   (use "git rm --cached <file>..." to unstage)
#
#       new file:   tapir.txt
#
When I commit the file, two more objects are added to the .git/objects directory.
$ git commit -m "added tapir file"
[master (root-commit) 7e38e95] added tapir file
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 tapir.txt
$ find .git/objects/ -type f
.git/objects//12/a93608760777f50380a94b52e1b54ec69f4743
.git/objects//7e/38e95d328287ea9d234a2affc4ed9e4510435a
.git/objects//e8/493a7e63154350f8c3d08a42e759132d9d2a39
One is a tree object and the other is a commit.
$ git cat-file -t 7e38
commit
$ git cat-file -t e849
tree
$
The commit contains the information that was recorded when I committed. Apart from the commit message and my personal info it contains a reference to the tree object that was created simultaneously with the commit.
$ git cat-file commit 7e38
tree e8493a7e63154350f8c3d08a42e759132d9d2a39
author Anders Janmyr <anders.janmyr@jayway.se> 1253590540 +0200
committer Anders Janmyr <anders.janmyr@jayway.se> 1253590540 +0200

added tapir file
$
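The commit body is plain text: header lines of the form "key value", a blank line, then the message. Here is a minimal Python sketch of splitting it apart. parse_commit is a hypothetical helper of mine, and it ignores multi-line headers such as signatures, which real commits may contain.

```python
from typing import Dict, List, Tuple


def parse_commit(body: str) -> Tuple[Dict[str, List[str]], str]:
    """Split a commit body into (headers, message)."""
    head, _, message = body.partition("\n\n")
    headers: Dict[str, List[str]] = {}
    for line in head.split("\n"):
        key, _, value = line.partition(" ")
        # A key may repeat (a merge commit has several "parent" lines),
        # so every key maps to a list of values.
        headers.setdefault(key, []).append(value)
    return headers, message


sample = (
    "tree e8493a7e63154350f8c3d08a42e759132d9d2a39\n"
    "author Anders Janmyr <anders.janmyr@jayway.se> 1253590540 +0200\n"
    "committer Anders Janmyr <anders.janmyr@jayway.se> 1253590540 +0200\n"
    "\n"
    "added tapir file\n"
)

headers, message = parse_commit(sample)
print(headers["tree"][0])
print("parent" in headers)  # False: a root commit has no parent line
print(message, end="")
```

Later commits carry one or more parent lines in the same header section, which is how the history is chained together.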
The tree object is stored in a binary format and cannot be read in full without the help of git ls-tree. With it, I can see that the tree contains a reference to the blob that was created initially, the tapir.txt file.
$ git cat-file tree e8493a7e63154350f8c3d08a42e759132d9d2a39
100644 tapir.txt?vw???KR?NƟGC$
$ git ls-tree e8493a7e63154350f8c3d08a42e759132d9d2a39
100644 blob 12a93608760777f50380a94b52e1b54ec69f4743 tapir.txt
$
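The unreadable part of the raw tree is the blob's hash stored as 20 raw bytes instead of 40 hex characters. Each entry is the mode, a space, the file name, a NUL byte, and then those 20 bytes. The Python sketch below decodes that layout. parse_tree is my own illustrative helper, and the sample entry is rebuilt from the blob hash shown earlier, not read from disk.

```python
from typing import List, Tuple


def parse_tree(body: bytes) -> List[Tuple[str, str, str]]:
    """Decode a raw tree body into (mode, name, hex hash) entries."""
    entries = []
    pos = 0
    while pos < len(body):
        space = body.index(b" ", pos)   # end of the mode
        nul = body.index(b"\0", space)  # end of the file name
        mode = body[pos:space].decode("ascii")
        name = body[space + 1:nul].decode("utf-8")
        sha = body[nul + 1:nul + 21].hex()  # 20 raw bytes -> 40 hex chars
        entries.append((mode, name, sha))
        pos = nul + 21
    return entries


blob_hash = "12a93608760777f50380a94b52e1b54ec69f4743"
tree_body = b"100644 tapir.txt\0" + bytes.fromhex(blob_hash)

for mode, name, sha in parse_tree(tree_body):
    # Raw trees store directories with mode 40000 and submodules with 160000.
    kind = {"40000": "tree", "160000": "commit"}.get(mode, "blob")
    print(mode, kind, sha, name)
```

This prints the same fields that git ls-tree shows for the entry, although git's exact column formatting differs slightly.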
So how does git know which commit is the latest? In git lingo, the latest commit on the current branch is known as HEAD. If I look inside .git/HEAD I see a reference to the current branch, and that branch reference in turn points to the latest commit.
$ cat ./.git/HEAD
ref: refs/heads/master
$ cat ./.git/refs/heads/master
7e38e95d328287ea9d234a2affc4ed9e4510435a
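Following that chain by hand is easy to mimic in code. The sketch below is a simplified Python version that handles only loose ref files like the ones shown here. resolve_head is a name I made up, the directory is a throwaway stand-in for .git, and refs stored in a packed-refs file are deliberately not handled.

```python
import os
import tempfile


def resolve_head(git_dir: str) -> str:
    """Follow HEAD through symbolic refs to a commit hash (loose refs only)."""
    with open(os.path.join(git_dir, "HEAD")) as f:
        value = f.read().strip()
    # A symbolic ref looks like "ref: refs/heads/master"; keep following it.
    # A detached HEAD holds the hash directly, so the loop is skipped.
    while value.startswith("ref: "):
        ref_name = value[len("ref: "):]
        with open(os.path.join(git_dir, *ref_name.split("/"))) as f:
            value = f.read().strip()
    return value


# Build a throwaway directory that mimics the layout shown above.
with tempfile.TemporaryDirectory() as git_dir:
    os.makedirs(os.path.join(git_dir, "refs", "heads"))
    with open(os.path.join(git_dir, "HEAD"), "w") as f:
        f.write("ref: refs/heads/master\n")
    with open(os.path.join(git_dir, "refs", "heads", "master"), "w") as f:
        f.write("7e38e95d328287ea9d234a2affc4ed9e4510435a\n")
    head_commit = resolve_head(git_dir)

print(head_commit)  # 7e38e95d328287ea9d234a2affc4ed9e4510435a
```

In a real repository, git rev-parse HEAD is the reliable way to get this answer, since it also knows about packed refs.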
The .git/refs directory is where all of git's references live: heads (branches) and tags.
$ find .git/refs
.git/refs
.git/refs/heads
.git/refs/heads/master
.git/refs/tags
$ git branch olle
$ find .git/refs
.git/refs
.git/refs/heads
.git/refs/heads/master
.git/refs/heads/olle
.git/refs/tags
Creating a new branch with git branch adds a file for it to the heads directory. Switching to the branch changes the contents of .git/HEAD.
$ cat ./.git/HEAD
ref: refs/heads/master
$ git co olle # co is my alias for checkout
Switched to branch 'olle'
$ cat ./.git/HEAD
ref: refs/heads/olle
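Seen from the file system, both operations are tiny. The toy Python sketch below imitates only the ref bookkeeping, in a throwaway directory. create_branch and switch_branch are hypothetical names of mine. Real git also updates the index and working tree, and takes locks while it writes, so this should not be pointed at an actual repository.

```python
import os
import tempfile


def create_branch(git_dir: str, name: str, commit_hash: str) -> None:
    """Imitate `git branch <name>`: store the hash in refs/heads/<name>."""
    with open(os.path.join(git_dir, "refs", "heads", name), "w") as f:
        f.write(commit_hash + "\n")


def switch_branch(git_dir: str, name: str) -> None:
    """Imitate only the HEAD update that happens on a branch switch."""
    with open(os.path.join(git_dir, "HEAD"), "w") as f:
        f.write(f"ref: refs/heads/{name}\n")


with tempfile.TemporaryDirectory() as git_dir:
    os.makedirs(os.path.join(git_dir, "refs", "heads"))
    commit = "7e38e95d328287ea9d234a2affc4ed9e4510435a"
    create_branch(git_dir, "master", commit)
    switch_branch(git_dir, "master")

    create_branch(git_dir, "olle", commit)  # like: git branch olle
    switch_branch(git_dir, "olle")          # like the HEAD part of a checkout

    with open(os.path.join(git_dir, "HEAD")) as f:
        head = f.read().strip()
    branches = sorted(os.listdir(os.path.join(git_dir, "refs", "heads")))

print(head)      # ref: refs/heads/olle
print(branches)  # ['master', 'olle']
```

Both branch files hold the same commit hash, which is why creating a branch in git is so cheap.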
Git, simple, but beautiful!