Store git activity in MySQL with PHP

07/26/2010

Git hooks are saving me so much time and providing me with interesting solutions to problems I didn't even know I had. I can't be the only person who'd find this useful, so give it a go.

As I said, I work on loads of sites, and keeping track of what's been done and where is sometimes a bit of a pain. I keep a todo list, but if I get an emergency email from someone, chances are that won't go through my todos. It will, however, be put into version control.

So this morning I had the bright idea to write a git hook that pushes relevant information to MySQL so that I can run activity reports later. All my bare git repositories are stored in one directory on our dedicated server, so it's just a matter of making sure each repository has the post-receive hook installed. I do this by keeping the actual hook in the same directory as all my repositories, then symlinking it into each repository's hooks directory with a little script.

Obviously, this assumes that your post-receive hook is in the same place as your repositories, and that you want this hook everywhere. But that's all true, so we're all good. Once you've run the linking script, you'll only have one hook to maintain, and every time you create a new repository you can just run the script again and everything will be up to date.
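For illustration, a linking script along these lines could look something like the sketch below. It is an example written under assumptions, not the original script: the function name link_hooks is made up, the shared hook is assumed to be a file called post-receive sitting next to the repositories, and a directory is treated as a bare repository if it contains both a HEAD file and a hooks directory.

```shell
#!/bin/sh
# Hypothetical helper: symlink one shared post-receive hook into every bare
# repository found directly under a base directory.
# Assumptions: the shared hook lives at "$base/post-receive", and a directory
# counts as a bare repository if it contains both HEAD and hooks/.
link_hooks() {
    # Resolve to an absolute path so the symlink targets stay valid.
    base=$(cd "$1" && pwd) || return 1
    for repo in "$base"/*/; do
        # Skip anything that does not look like a bare repository.
        [ -f "${repo}HEAD" ] && [ -d "${repo}hooks" ] || continue
        # -s makes a symbolic link, -f replaces an existing hook or stale link.
        ln -sf "$base/post-receive" "${repo}hooks/post-receive"
    done
}

# Run against the directory given as the first argument, if there is one.
if [ "$#" -gt 0 ]; then
    link_hooks "$1"
fi
```

Saved as, say, link-hooks.sh (another made-up name), it would be run as `sh link-hooks.sh /path/to/repositories`. Because `ln -sf` overwrites whatever is already there, running it again after creating a new repository is harmless for the repositories that were already linked.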

Now for the hook. It's not beautiful PHP, but little scripts like this rarely are, in my experience.

Create this table:

CREATE TABLE `log` (
  `id` int(10) unsigned NOT NULL auto_increment,
  `repo` varchar(255) NOT NULL,
  `commit` varchar(40) NOT NULL,
  `date` datetime NOT NULL,
  `message` text NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `commit` (`commit`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

Here's your script. chmod +x it.

#!/usr/bin/env php
<?php
date_default_timezone_set('Europe/London');

// A post-receive hook runs inside the bare repository, so the last path
// component of the working directory is the repository name.
$repo = basename(getcwd());

// One record per commit: hash, committer timestamp, subject, body, then a
// delimiter line so multi-line bodies can be split apart again.
exec('git log --all --pretty=format:"%H%n%ct%n%s%n%b%n<><><>"', $capture, $status);

// Group the captured lines into one array per commit.
$commits = array();
$current = array();
foreach ($capture as $row) {
    if (trim($row) === '<><><>') {
        $commits[] = $current;
        $current = array();
    } else {
        $current[] = $row;
    }
}

if ($status === 0 && $commits) {
    $v = array(); // one placeholder group per commit
    $b = array(); // bound values, in placeholder order
    foreach ($commits as $commit) {
        // Subject, plus the body (if there is one) after a blank line.
        $body = trim(implode("\n", array_slice($commit, 3)));
        $v[] = '(?,?,?,?)';
        $b[] = $repo;
        $b[] = $commit[0];
        $b[] = $commit[2] . ($body === '' ? '' : "\n\n" . $body);
        $b[] = date('Y-m-d H:i:s', (int) $commit[1]);
    }
    try {
        $db = new PDO('mysql:dbname=DB;host=127.0.0.1', 'USERNAME', 'PASSWORD');
        // Make PDO throw, so the email below carries the real error message.
        $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
        // The unique key on `commit` lets INSERT IGNORE skip commits already logged.
        $stmt = $db->prepare('insert ignore into log (repo,`commit`,message,`date`) values ' . implode(',', $v));
        $stmt->execute($b);
    } catch (PDOException $e) {
        mail('EMAIL', 'Commit did not reach db', $e->getMessage());
    }
}
?>

So basically we're extracting the log data we need and doing some funky stuff to handle multi-line commit messages (I like to store lots of details, as my subject lines tend to be a bit vague!). Every push re-reads the whole log, and the unique key on the commit column together with insert ignore means commits that are already in the table are simply skipped. Other than that, if you're familiar with PHP, the above should be pretty self-explanatory. If it's not, hit the comments and I'll explain things.

I've only been using this a little while, but it seems to work very well. If you use it and stumble across any bugs, I'd love to know about them!

Update: I realised today that git log only walks the history of whatever HEAD points to, which is the current branch, or usually master on a bare repo. So I've added the --all switch to git log to get the logs for every branch. Most of the extra entries are just "Merged blah", but that means they can be filtered easily, and I'd rather have everything and need to filter than be missing something important.
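To see how much difference --all makes on a given repository, a quick check along these lines can help. It's only a sketch: count_commits is a made-up helper name, and it assumes a git recent enough to support the -C option (1.8.5 or later).

```shell
#!/bin/sh
# Hypothetical helper: count the commits that git log reports for a repository.
# Usage: count_commits DIR [extra git-log arguments, e.g. --all]
count_commits() {
    dir=$1
    shift
    # Print one hash per commit, then count the non-empty lines.
    git -C "$dir" log --pretty=format:%H "$@" | grep -c .
}
```

Comparing `count_commits /path/to/repo.git` with `count_commits /path/to/repo.git --all` shows how many commits live only on branches other than the one HEAD points to.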