Wednesday, January 25, 2012

Finding with Git

Git is an amazing version control system that almost never loses anything you have committed, but sometimes it can be hard to find out where things are. Most of the time git log is our friend, but not all the time.

Where is my file?

Sometimes you know that you have a file in your repository, but you don't know exactly where it is. git ls-files is the answer.

# Find all files with the name security in the path.
$ git ls-files | grep security
lib/dynamo-db/security.js
test/security-test.js

Obviously, you can use any particular grep options you prefer, like -i to ignore case.
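
If you would rather skip the pipe, git ls-files also accepts a quoted pathspec pattern. Here is a minimal sketch in a throwaway repository; the file contents and security-notes.txt are made up for the example, and the untracked file is there only to show that ls-files looks at tracked files.

```shell
#!/bin/sh
# Sketch: find tracked files by name in a throwaway repository.
set -e
repo=$(mktemp -d)   # scratch directory, nothing here touches a real project
cd "$repo"
git init -q

mkdir -p lib/dynamo-db test
echo 'var crypto;' > lib/dynamo-db/security.js
echo 'var vows;' > test/security-test.js
echo 'scratch' > security-notes.txt   # hypothetical file, deliberately not added
git add lib test

# The pipeline from above ...
via_grep=$(git ls-files | grep security)
# ... and the same search with a quoted pathspec, so Git does the matching.
via_pathspec=$(git ls-files '*security*')

echo "$via_pathspec"
```

The quotes matter: without them the shell would try to expand the pattern before Git ever sees it.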

In what files does a word exist?

If you want to find information inside files, git grep is your friend. git grep works much like a recursive grep (grep -r .), but it only searches files that are tracked by Git.

# Find all lines matching *crypt* in current version.
$ git grep crypt
lib/dynamo-db/security.js:var crypto = require('crypto');
lib/dynamo-db/security.js:   var hmac = crypto.createHmac('sha256', key);

# Also give me the line numbers
$ git grep -n crypt
lib/dynamo-db/security.js:2:var crypto = require('crypto');
lib/dynamo-db/security.js:15:   var hmac = crypto.createHmac('sha256', key);

# List only the file names
$ git grep -l crypt
lib/dynamo-db/security.js

# Also list how many lines matched (count).
$ git grep -c crypt
lib/dynamo-db/security.js:2

It is also possible to give versions to grep to find out what has changed between revisions.

# Find all files with lines matching *type* in revisions `master` and `8f0fb7f`.
$ git grep -l type master 8f0fb7f
master:lib/dynamo-db/index.js
master:lib/dynamo-db/security.js
master:package.json
8f0fb7f:lib/dynamo-db/index.js
8f0fb7f:package.json
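
To see the revision prefix at work without my repository, here is a small sketch under made-up assumptions: a throwaway repository with one package.json, two commits, and the hypothetical tags v1 and v2 standing in for master and 8f0fb7f.

```shell
#!/bin/sh
# Sketch: grep two revisions at once in a throwaway repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Example User'        # placeholder identity
git config user.email 'user@example.com'

echo '{ "name": "demo" }' > package.json
git add package.json
git commit -q -m 'First version'
git tag v1                                 # hypothetical tag

echo '{ "name": "demo", "type": "module" }' > package.json
git commit -q -am 'Declare a type'
git tag v2                                 # hypothetical tag

# Only v2 contains the word, so only v2 is listed.
matches=$(git grep -l type v1 v2)
echo "$matches"
```

Each hit is prefixed with the revision it was found in, which is how you tell the two apart.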

Maybe this is not that impressive. Most of the above can be accomplished with standard grep, find, ack, and friends. But Git is a version control system. How do I find out about things that happened in the past?

Who deleted my file?

You know how it is. You are working on some project, your dog gets sick, and you have to stay home from work. When you come back, someone has deleted your file! Where is it, and who did it? git log to the rescue.

git log shows you the commit logs. It is your eye into the past.

# When (in what commit) was my file deleted?
$ git log --diff-filter=D -- test/create-table-test.js
commit ba6c4d8bc165b8fb8208979c3e5513bd53477d51
Author: Anders Janmyr <anders@janmyr.com>
Date:   Wed Jan 25 09:46:52 2012 +0100

    Removed the stupid failing test.

Looks like I found the culprit. Was I working from home? But is the file really deleted here? To get some more information about the files in a commit, add the --summary option.

# Same query, with a summary of what happened to the file.
$ git log --diff-filter=D --summary -- test/create-table-test.js
commit ba6c4d8bc165b8fb8208979c3e5513bd53477d51
Author: Anders Janmyr <anders@janmyr.com>
Date:   Wed Jan 25 09:46:52 2012 +0100

    Removed the stupid failing test.

 delete mode 100644 test/create-table-test.js

Yes, it looks like the file is really deleted. Stupid bastard! Let me break this command down, starting with the last argument.

  • test/create-table-test.js - The filename has to be the relative path to the file (from your current directory).
  • -- - The double-dash tells Git that what follows is a path, not a branch or an option.
  • --summary - Show me what files were deleted or added. --name-status is similar.
  • --diff-filter - This is a real beauty, it allows me to limit the log to show me only the specified kind of change, in this case D, for Deleted. Other options include Added (A), Copied (C), Modified (M) and Renamed (R).
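
Once you know which commit deleted the file, getting it back is a small step further. The sketch below is not from my repository: it builds a throwaway one (the file contents and commit messages are made up), finds the deleting commit with the same --diff-filter=D query, and checks the file out from that commit's parent, where it still exists.

```shell
#!/bin/sh
# Sketch: find the commit that deleted a file and restore the file.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Example User'        # placeholder identity
git config user.email 'user@example.com'

mkdir test
echo 'assert(true);' > test/create-table-test.js
git add test
git commit -q -m 'Add create-table test'

git rm -q test/create-table-test.js
git commit -q -m 'Removed the failing test'

# Hash of the most recent commit that deleted the file.
deleted_in=$(git log --diff-filter=D --format=%H -n 1 -- test/create-table-test.js)

# The file is gone in that commit but present in its parent (the ^ suffix).
git checkout -q "$deleted_in^" -- test/create-table-test.js
cat test/create-table-test.js
```

The restored file shows up as a staged addition, so you still get to decide whether to commit it.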

When was a file added?

This uses the same technique as above, but I will vary it since I don't want to type the full path of the file.

# Find out when the integration tests were added.
$ git log --diff-filter=A --name-status | grep -C 6 integ
commit 09420cfea8c7b569cd47f690104750fec358a10a
Author: Anders Janmyr <anders@janmyr.com>
Date:   Tue Jan 24 16:23:52 2012 +0100

    Extracted integration test

A integration-test/sts-test.js

commit 205db3965dec6c2c4c7b2bb75387a591d49e1951
Author: Anders Janmyr <anders@janmyr.com>
Date:   Sat Jan 21 10:03:59 2012 +0100

As you can see here, I am using --name-status as a variation on --summary. It uses the same notation as --diff-filter.

I am using grep -C 6 to get some context around the found element, in this case six lines before and after the match. Very useful!
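
A variation that avoids grep altogether is to hand Git a quoted pathspec pattern and let --format print only what you care about. This is a sketch under made-up assumptions: a throwaway repository with three commits, of which only the second one adds the integration test.

```shell
#!/bin/sh
# Sketch: find the commit that added a file, without typing the full path.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Example User'        # placeholder identity
git config user.email 'user@example.com'

echo 'demo' > README
git add README
git commit -q -m 'Initial commit'

mkdir integration-test
echo 'first draft' > integration-test/sts-test.js
git add integration-test
git commit -q -m 'Extracted integration test'

echo 'second draft' > integration-test/sts-test.js
git commit -q -am 'Tweaked integration test'

# Only commits that Added a path matching *integ* are listed;
# the later modification is filtered out.
added_in=$(git log --diff-filter=A --format=%s -- '*integ*')
echo "$added_in"
```

The trade-off is that you lose the surrounding context that grep -C gives you, so pick whichever fits the question.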

Who changed that line?

As you probably know, it is possible to see who has done something in a file by using git blame.

$ git blame test/security-test.js
205db396 (Anders Janmyr 2012-01-21 10:03:59 +0100  1) var vows = require('vows'),
205db396 (Anders Janmyr 2012-01-21 10:03:59 +0100  2)     assert = require('assert');
205db396 (Anders Janmyr 2012-01-21 10:03:59 +0100  3) 
09420cfe (Anders Janmyr 2012-01-24 16:23:52 +0100  4) var access = 'access';
09420cfe (Anders Janmyr 2012-01-24 16:23:52 +0100  5) var secret = 'secret';
90b65208 (Anders Janmyr 2012-01-21 11:58:21 +0100  6) 
205db396 (Anders Janmyr 2012-01-21 10:03:59 +0100  7) var security = new Security({
90b65208 (Anders Janmyr 2012-01-21 11:58:21 +0100  8)   access: access,
90b65208 (Anders Janmyr 2012-01-21 11:58:21 +0100  9)   secret: secret
205db396 (Anders Janmyr 2012-01-21 10:03:59 +0100 10) });
...

Every line gets annotated with the commit that introduced it and the author of that commit. Very helpful.
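
When the file is long, you usually care about a handful of lines only. git blame takes a line range with -L for that. A minimal sketch, assuming a throwaway repository and two invented authors (Alice and Bob) who each touch the same file:

```shell
#!/bin/sh
# Sketch: blame a single line instead of the whole file.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

printf 'var access = "access";\nvar secret = "secret";\n' > security-test.js
git add security-test.js
git -c user.name='Alice Example' -c user.email='alice@example.com' \
    commit -q -m 'Add credentials'

printf 'var access = "access";\nvar secret = process.env.SECRET;\n' > security-test.js
git -c user.name='Bob Example' -c user.email='bob@example.com' \
    commit -q -am 'Read secret from environment'

# Human-readable annotation of line 2 only.
git blame -L 2,2 security-test.js

# The same line in a script-friendly format; keep just the author name.
line_author=$(git blame --line-porcelain -L 2,2 security-test.js | sed -n 's/^author //p')
echo "$line_author"
```

The --line-porcelain output is meant for scripts, which is why I parse that one and leave the default output for reading.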

Who deleted that line?

Another feature, which is not as well known, is git blame --reverse. It allows you to see the file as it was before, annotated to show you where it has been changed.

# Check which lines have been changed in the last 6 commits.
$ git blame --reverse HEAD~6..HEAD security-test.js
558b8e7f (Anders Janmyr 2012-01-25 16:53:52 +0100  1) var vows = require('vows'),
558b8e7f (Anders Janmyr 2012-01-25 16:53:52 +0100  2)     assert = require('assert');
558b8e7f (Anders Janmyr 2012-01-25 16:53:52 +0100  3) 
093c13e9 (Anders Janmyr 2012-01-24 17:26:09 +0100  4) var Security = require('dynamo-db').Security;
ba6c4d8b (Anders Janmyr 2012-01-25 09:46:52 +0100  5) 
^b96c68b (Anders Janmyr 2012-01-21 12:33:50 +0100  6) var access = process.env['S3_KEY'];
^b96c68b (Anders Janmyr 2012-01-21 12:33:50 +0100  7) var secret = process.env['S3_SECRET'];
558b8e7f (Anders Janmyr 2012-01-25 16:53:52 +0100  8) 
558b8e7f (Anders Janmyr 2012-01-25 16:53:52 +0100  9) var security = new Security({
558b8e7f (Anders Janmyr 2012-01-25 16:53:52 +0100 10)   access: access,
558b8e7f (Anders Janmyr 2012-01-25 16:53:52 +0100 11)   secret: secret
558b8e7f (Anders Janmyr 2012-01-25 16:53:52 +0100 12) });

In the output you can see that most of the lines are still the same at HEAD (558b8e7f). But the fourth and fifth lines (093c13e9 and ba6c4d8b) don't exist anymore; those are the last commits in which they appeared. And the sixth and seventh lines (^b96c68b) were changed right after that commit, the oldest one in the range.
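
Reverse blame is easier to get a feel for on a tiny file. The sketch below uses a throwaway repository with a made-up notes.txt and a hypothetical tag named start. As I read the documentation, each line of the old version should be annotated with the last commit in which it still existed: alpha with HEAD, beta with the start commit (marked with ^), and gamma with the commit before HEAD.

```shell
#!/bin/sh
# Sketch: reverse blame on a three-line file in a throwaway repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Example User'        # placeholder identity
git config user.email 'user@example.com'

printf 'alpha\nbeta\ngamma\n' > notes.txt
git add notes.txt
git commit -q -m 'Three lines'
git tag start                              # hypothetical tag for the old version

printf 'alpha\ngamma\n' > notes.txt        # beta disappears here
git commit -q -am 'Drop beta'

printf 'alpha\ndelta\n' > notes.txt        # gamma is replaced here
git commit -q -am 'Replace gamma'

# Show the file as it was at start; -s hides author and date to keep it short.
reverse=$(git blame --reverse -s start..HEAD notes.txt)
echo "$reverse"
```

Note that the commit shown is the last one that still had the line, so the change itself happened in the commit after it.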

What commits contain the string?

Another thing I find very useful is to find out when certain words or sentences were removed or added. For this you can use git log -S<string> or git log -G<regex>.

# Find commits that changed how often the string aws occurs in a modified file, and display the full diff.
$ git log -Saws --diff-filter=M --patch
commit b96c68b839f204b310b79570bc3d27dc93cff588
Author: Anders Janmyr <anders@janmyr.com>
Date:   Sat Jan 21 12:33:50 2012 +0100

    We have a valid request, tjohoo

diff --git a/lib/dynamo-db/security.js b/lib/dynamo-db/security.js
index bee6936..8471527 100644
--- a/lib/dynamo-db/security.js
+++ b/lib/dynamo-db/security.js
@@ -2,6 +2,7 @@
 var crypto = require('crypto');
 var _ = require('underscore');
 var request = require("request");
+var xml2js = require('xml2js');
 
 function Security(options) {
   this.options = options;
@@ -23,7 +24,7 @@ mod.timestamp = function() {
 mod.defaultParams = function() {
   return {
     AWSAccessKeyId: this.options.access,
-    Version: '2010-05-08',
+    Version: '2011-06-15',
     Timestamp: this.timestamp(),
     SignatureVersion: 2,
     SignatureMethod: 'HmacSHA256'
@@ -57,9 +58,10 @@ mod.url = function(host, path, params) {
 
 mod.makeRequest = function(method, host, path, params, callback) {
   var extParams = _.extend({}, this.defaultParams(), params);
-  var signedParams = this.signedParams('GET', 'iam.amazonaws.com', '/', extParams);
-  console.log(extParams,signedParams);
-  return request({ method: method, url: this.url(host, path, signedParams) },
+  var signedParams = this.signedParams(method, host, path, extParams);
+  var url = this.url(host, path, signedParams);
+  console.log(url,signedParams);
+  return request({ method: method, url: url },
...
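
The difference between -S and -G is subtle, so here is a small sketch of it. It assumes a throwaway repository with a made-up config.js: the first commit introduces a line containing aws, and the second commit edits that line while keeping the string. -S only reports commits where the number of occurrences changes, while -G reports every commit whose diff has an added or removed line matching the pattern.

```shell
#!/bin/sh
# Sketch: compare git log -S (pickaxe) with git log -G (regex on the diff).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Example User'        # placeholder identity
git config user.email 'user@example.com'

echo "var host = 'iam.amazonaws.com';" > config.js
git add config.js
git commit -q -m 'Add config'

# Same number of "aws" occurrences, but the line itself changes.
echo "var host = 'sts.amazonaws.com';" > config.js
git commit -q -am 'Switch to STS'

pickaxe=$(git log -Saws --format=%s)   # commits that add or remove the string
regex=$(git log -Gaws --format=%s)     # commits whose changed lines match

echo "-S: $pickaxe"
echo "-G: $regex"
```

So reach for -S when you want to know where something was introduced or removed, and for -G when you want every commit that touched a matching line.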

That's all folks!