Thursday, September 24, 2009

Git undo, reset or revert?

If you have found this page you probably came here since you wanted to clear your working directory from all the changes that you have made.

The simple answer is:

# Clear working directory tree from all changes
$ git checkout -f HEAD

This is, however, not the best way to do it. A better way is:

# Clears the working directory tree, and stashes all the changes.
$ git stash


git stash allows you to get your changes back any time you need them in case you change your mind. It is also possible to inspect and manipulate the stashes.

# List all the stashes
$ git stash list
stash@{0}: WIP on admin_ui: 0c1a80a Removed annotation from JdbcAdminService, it is now explicity initialized in the applicationContext.
stash@{1}: WIP on admin_ui: 14e12e6 Added foreign keys for UserRole
stash@{2}: WIP on master: d188ecd Merge branch 'master' of semc-git:customercare
stash@{3}: WIP on master: 3763795 More work on user_details.
...

# Apply the latest stash, and remove it from the stack
$ git stash pop

# Apply a named patch, but leave it on the stack
$ git stash apply stash@{2} 

# Drop a stash
$ git stash drop stash@{3} 

# Clear the entire stash stack (almost never needed)
$ git stash clear

# A better way to purge the stash
$ git reflog expire --expire=30.days refs/stash

What about git reset then, it sounds like it should do about the same as git co -f HEAD. It doesn't. git reset is used for setting the current reference pointer, HEAD.

# Reset the latest commit, and leave the changes in the index.
$ git reset --soft HEAD^

# Reset the latest commit, and leave the changes in the working directory
$ git reset HEAD^

# Undo add, move the changes from the index to the working directory
$ git reset

# Reset the latest successful pull or merge
$ git reset --hard ORIG_HEAD

# Reset the latest failed pull or merge
$ git reset --hard

# Reset the latest pull or merge, into a dirty working tree
$ git reset --merge ORIG_HEAD

You can do more things with reset, but the above covers the typical cases. And now to the last thing, git revert. What does it do? git revert creates a new commit that is the opposite of the commit it names.

# Show the commits
$ git log --oneline
4717a5c new line
7e38e95 added tapir file

# Revert the commit named, 4717a5c, and commit it.
$ git revert 4717a5c

# Revert the HEAD commit, but don't commit it
$ git revert -n HEAD

Git is incredibly flexible and lets you control everything if you want to.

Tuesday, September 22, 2009

Inside Git

This is an exploration into what is going on when I run some basic git commands. We start out by creating a new repository. git/object is the directory where git stores all its objects, and it is empty initially.

$ mkdir myrepo
$ cd myrepo/
$ git init
Initialized empty Git repository in /Users/andersjanmyr/tmp/myrepo/.git/
$ find .git/objects -type f     # find all files in .git/objects
$ 

When a file is added to git it gets stored in the .git/objects directory under the name of its hash. The first two characters of the hash is used as the name of a subdirectory and the rest become the file name. Worth noting is that the hash uniquely identifies its content, so were you to run the commands on your computer, your results should be identical.

$ echo "A tapir has 14 toes" > tapir.txt
$ git add tapir.txt
$ find .git/objects -type f
.git/objects/12/a93608760777f50380a94b52e1b54ec69f4743
$ git hash-object tapir.txt
12a93608760777f50380a94b52e1b54ec69f4743

If you try to list the contents of the file, you are out of luck since it is stored in a binary format, you should instead use the git command git cat-file. The file above is a blob and its contents is what can be expected.

$ cat .git/objects/12/a93608760777f50380a94b52e1b54ec69f4743
xK??OR02`pT(I,?,R?H,V04Q(?O-?zi$ 
$
$ git cat-file -t 12a93608760777f50380a94b52e1b54ec69f4743
blob
$ git cat-file blob 12a936   # Using the first part of the hash is enough
A tapir has 14 toes
 

Even though the file is in the .git/objects directory it is not committed yet and it cannot be read by the high-level git commands such as git log. git status on the other hand will show that the file is staged, or in the index.

$ git log
fatal: bad default revision 'HEAD'
$ git status
# On branch master
#
# Initial commit
#
# Changes to be committed:
#   (use "git rm --cached <file>..." to unstage)
#
# new file:   tapir.txt
#

When I commit the file, two more objects are added to the .git/objects directory

$ git commit -m "added tapir file"
[master (root-commit) 7e38e95] added tapir file
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 tapir.txt
$ find .git/objects/ -type f
.git/objects//12/a93608760777f50380a94b52e1b54ec69f4743
.git/objects//7e/38e95d328287ea9d234a2affc4ed9e4510435a
.git/objects//e8/493a7e63154350f8c3d08a42e759132d9d2a39

One is tree object and the other is a commit.

$ git cat-file -t 7e38
commit
$ git cat-file -t e849
tree
$

The commit contains the information that was recorded when I committed. Apart from the commit message and my personal info it contains a reference to the tree object that was created simultaneously with the commit.

$ git cat-file commit  7e38
tree e8493a7e63154350f8c3d08a42e759132d9d2a39
author Anders Janmyr <anders.janmyr@jayway.se> 1253590540 +0200
committer Anders Janmyr <anders.janmyr@jayway.se> 1253590540 +0200

added tapir file
$ 

The tree object is stored in binary format and cannot be completely read without the help of git ls-tree. Now I can see that it contains a reference to the blob that was created initially, the tapir.txt file.

$ git cat-file tree e8493a7e63154350f8c3d08a42e759132d9d2a39
100644 tapir.txt?vw???KR?NƟGC$ 
$ git ls-tree e8493a7e63154350f8c3d08a42e759132d9d2a39
100644 blob 12a93608760777f50380a94b52e1b54ec69f4743 tapir.txt
$

So how does git know what is the latests commit? In git lingo the latest commit is know as the HEAD. If I look inside .git/HEAD I see a reference and this reference points to the latest commit.

$  cat ./.git/HEAD
ref: refs/heads/master
$ cat ./.git/refs/heads/master
7e38e95d328287ea9d234a2affc4ed9e4510435a

The .git/refs directory is where all the references of git live, heads and tags.

$ find .git/refs
.git/refs
.git/refs/heads
.git/refs/heads/master
.git/refs/tags
$ git branch olle
$ find .git/refs
.git/refs
.git/refs/heads
.git/refs/heads/master
.git/refs/heads/olle
.git/refs/tags

Creating a new branch with git branch shows that the branch is added to the heads directory, switching to it will change the .git/HEAD contents.

$  cat ./.git/HEAD
ref: refs/heads/master
$ git co olle
Switched to branch 'olle'
$  cat ./.git/HEAD
ref: refs/heads/olle

Git, simple, but beautiful!