In 2017, CWI and Google published the first known collision for the SHA-1 hash function. This immediately sparked discussions about the security of Git, because its beautiful data model relies heavily on SHA-1.

The SHAttered website (shattered.io) states:

GIT strongly relies on SHA-1 for the identification and integrity checking of all file objects and commits. It is essentially possible to create two GIT repositories with the same head commit hash and different contents, say a benign source code and a backdoored one. An attacker could potentially selectively serve either repository to targeted users. This will require attackers to compute their own collision.

Several arguments have been brought up as to why this shouldn’t be of concern. The argument I have read most frequently is that Git does not rely on the hash for security purposes. The protection mechanism of a repository is the authorization system of the web-based Git front ends, such as GitHub and GitLab. However, there have been breaches in the past where attackers were able to inject malicious code into a popular repository; see, for example, this Gentoo incident.

Another argument is that, since version 2.13, Git uses a hardened SHA-1 implementation by default, which detects inputs that look like products of the known collision attack. The issue with this argument is that there are still many old Git versions around. Ubuntu 16.04 ships with version 2.7, and CentOS 7 ships with version 1.8 (although this special Red Hat version might include a backport of the hardened SHA-1).

Another popular argument is that Git uses cryptographic signatures to ensure the authenticity of the repository. I usually sign my commits, so weaknesses of the hash function shouldn’t affect me, as long as the cryptographic signatures are strong, right? Let’s have a look at this argument in more depth.


To start, let’s create a repository and add a signed commit with git commit -S. You can check that the signature is valid.

$ git log --show-signature
commit 45dbb6197441d74a70590cfdede084d0b96c4c68
gpg: Signature made Do 18 Okt 2018 11:41:05 CEST using RSA key ID 40264985
gpg: Good signature from "Frank Sauerburger <frank@sauerburger.com>" [ultimate]
Author: Frank Sauerburger <frank@sauerburger.com>
Date:   Thu Oct 18 11:40:56 2018 +0200

    Add README with mission statement

With a bit of Git plumbing, we can get the pretty-printed commit text.

$ git cat-file -p 45dbb6197441d74a70590cfdede084d0b96c4c68
tree c394d5d8f3a23e0b4b28107da6cb626053a90036
author Frank Sauerburger <frank@sauerburger.com> 1539855656 +0200
committer Frank Sauerburger <frank@sauerburger.com> 1539855656 +0200
gpgsig -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJbyFUxAAoJEOJxPX5AJkmFLwYQAIZh4yXLYwc5OCXhrhuTGVTY
 h9jQOpMHzrhWqoBP5wrhnnSINl8/i+2HMgIy70/H1hkrPW3Tg6Ps11vkJattjaU7
 zXvyUEcBp4kH+csFHJpX/h6LTTbaqjeV/Hn5HNux/taLW7hJlweTK7O+T3JWosBB
 RJXucyNVW6aNKTGipKKTU2ACCFLGF0h4U23l6VPdEBzeW3HLIlCZ5uV2lKmGXyKr
 EnW8QB8YoRl5LS7CmzfMjVY1lI/+GBl4A1rmwjxyRv1m+I7a+XQIOjY6I/w2AChk
 KkB2ruXYw8nnSeyULlxawbEFuV7PmM0lFX3RvOZI7mDva7LPCgLq5xBQqf9wL47a
 SSpMgO+IxD0Etc+kSWBYggolZ2zaE6qmNVCDWt9enx5ApZZWOR3QfOZSfLIKfr4w
 x2MrbhybwZd2u27xqzu5fWF7iRzPjRdDVqvTRSp79giGzNKlGak9TZ5g9VXXvNb6
 DvR8vDprhXXpcrQ7OtKsHgcVolrg+hQ9anmNjjbVcDmRZ6x61SWs951hY9C3OTx8
 bsFvX+3xFAMayc4D3o2B96KN5zOOoNAABBiOXXOxTB5nnYhaXH4TsvIfnVmBQ8ex
 xaN2+y9SdMyFxLuRVKDsdAc3AjWE7aXCSU3dIWrHO54l9HGDQJ+1Tzbfs/u3akjn
 /ul1KZ4+aHgoji3LF/yU
 =NZ+T
 -----END PGP SIGNATURE-----

Add README with mission statement

At this point we need to dissect this output a bit to see what text is actually signed.

Removing the gpgsig header from the message and saving the rest to a file called commit gives us the unsigned commit text. Taking the gpgsig header, stripping the gpgsig keyword and the single leading space of each continuation line, gives us the detached signature, which we save to the file commit.asc.

$ cat commit
tree c394d5d8f3a23e0b4b28107da6cb626053a90036
author Frank Sauerburger <frank@sauerburger.com> 1539855656 +0200
committer Frank Sauerburger <frank@sauerburger.com> 1539855656 +0200

Add README with mission statement
$ cat commit.asc
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQIcBAABCAAGBQJbyFUxAAoJEOJxPX5AJkmFLwYQAIZh4yXLYwc5OCXhrhuTGVTY
h9jQOpMHzrhWqoBP5wrhnnSINl8/i+2HMgIy70/H1hkrPW3Tg6Ps11vkJattjaU7
zXvyUEcBp4kH+csFHJpX/h6LTTbaqjeV/Hn5HNux/taLW7hJlweTK7O+T3JWosBB
RJXucyNVW6aNKTGipKKTU2ACCFLGF0h4U23l6VPdEBzeW3HLIlCZ5uV2lKmGXyKr
EnW8QB8YoRl5LS7CmzfMjVY1lI/+GBl4A1rmwjxyRv1m+I7a+XQIOjY6I/w2AChk
KkB2ruXYw8nnSeyULlxawbEFuV7PmM0lFX3RvOZI7mDva7LPCgLq5xBQqf9wL47a
SSpMgO+IxD0Etc+kSWBYggolZ2zaE6qmNVCDWt9enx5ApZZWOR3QfOZSfLIKfr4w
x2MrbhybwZd2u27xqzu5fWF7iRzPjRdDVqvTRSp79giGzNKlGak9TZ5g9VXXvNb6
DvR8vDprhXXpcrQ7OtKsHgcVolrg+hQ9anmNjjbVcDmRZ6x61SWs951hY9C3OTx8
bsFvX+3xFAMayc4D3o2B96KN5zOOoNAABBiOXXOxTB5nnYhaXH4TsvIfnVmBQ8ex
xaN2+y9SdMyFxLuRVKDsdAc3AjWE7aXCSU3dIWrHO54l9HGDQJ+1Tzbfs/u3akjn
/ul1KZ4+aHgoji3LF/yU
=NZ+T
-----END PGP SIGNATURE-----
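
The manual dissection above can also be scripted. The following is a minimal sketch in Python, assuming the pretty-printed commit text is available as a string; the function name split_signed_commit and the elided signature in the sample are placeholders for illustration and not part of Git or GnuPG.

```python
from typing import Tuple


def split_signed_commit(raw: str) -> Tuple[str, str]:
    """Split commit text into (unsigned payload, detached signature).

    Hypothetical helper: it mimics the manual steps described in the text.
    """
    # The first empty line separates the headers from the commit message.
    # The blank line inside the signature is stored as a single space,
    # so it is not mistaken for this separator.
    headers, sep, message = raw.partition("\n\n")

    payload_lines, signature_lines = [], []
    in_signature = False
    for line in headers.split("\n"):
        if line.startswith("gpgsig "):
            in_signature = True
            signature_lines.append(line[len("gpgsig "):])
        elif in_signature and line.startswith(" "):
            # Continuation line: drop exactly one leading space.
            signature_lines.append(line[1:])
        else:
            in_signature = False
            payload_lines.append(line)

    payload = "\n".join(payload_lines) + sep + message
    signature = "\n".join(signature_lines) + "\n" if signature_lines else ""
    return payload, signature


# Sample input modelled on the commit above; the signature body is elided.
example = (
    "tree c394d5d8f3a23e0b4b28107da6cb626053a90036\n"
    "author Frank Sauerburger <frank@sauerburger.com> 1539855656 +0200\n"
    "committer Frank Sauerburger <frank@sauerburger.com> 1539855656 +0200\n"
    "gpgsig -----BEGIN PGP SIGNATURE-----\n"
    " Version: GnuPG v2\n"
    " \n"
    " (base64 signature data elided)\n"
    " -----END PGP SIGNATURE-----\n"
    "\n"
    "Add README with mission statement\n"
)

payload, signature = split_signed_commit(example)
print(payload, end="")    # content for the file "commit"
print(signature, end="")  # content for the file "commit.asc"
```

When writing the two parts to disk, the payload has to be stored byte for byte, including the trailing newline, because the signature covers the exact bytes.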

Verification with gpg2 shows that the signature is still intact.

$ gpg2 --verbose --verify commit.asc
gpg: armor header: Version: GnuPG v2
gpg: assuming signed data in 'commit'
gpg: Signature made Do 18 Okt 2018 11:41:05 CEST using RSA key ID 40264985
gpg: using subkey 40264985 instead of primary key 4E8E39C1
gpg: using PGP trust model
gpg: Good signature from "Frank Sauerburger <frank@sauerburger.com>" [ultimate]
gpg: binary signature, digest algorithm SHA256, key algorithm rsa4096

We see that the signature uses strong cryptographic methods (SHA256 digest and 4096-bit RSA key). However, the signed message refers to the content of the commit only through the SHA-1 of its file tree. As stated in the quote from shattered.io, it might be possible to prepare an altered version of the repository with the same head commit hash but malicious code. Since the signed text and therefore the commit hash stay the same, the identical PGP signature string remains valid. Neither Git nor GnuPG would be able to detect the modification.
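
To make this concrete, here is a small sketch in Python of how Git derives an object ID: the SHA-1 of a short header (object type, content length, NUL byte) followed by the content. The helper names git_object_id and tree_id are made up for this illustration. The second helper shows that the only link between the signed text and the actual files is the 40-character tree hash.

```python
import hashlib


def git_object_id(kind: str, content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to an object of this type."""
    header = f"{kind} {len(content)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


def tree_id(commit_text: str) -> str:
    """Return the tree hash named in the first header line of a commit."""
    first_line = commit_text.split("\n", 1)[0]
    keyword, _, value = first_line.partition(" ")
    if keyword != "tree":
        raise ValueError("commit text does not start with a tree header")
    return value


# The unsigned commit text from the file "commit" above.
signed_text = (
    "tree c394d5d8f3a23e0b4b28107da6cb626053a90036\n"
    "author Frank Sauerburger <frank@sauerburger.com> 1539855656 +0200\n"
    "committer Frank Sauerburger <frank@sauerburger.com> 1539855656 +0200\n"
    "\n"
    "Add README with mission statement\n"
)

# Well-known ID of the empty blob, as a sanity check of the header format.
print(git_object_id("blob", b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391

# Everything the signature says about the files is this one SHA-1 value.
print(tree_id(signed_text))  # c394d5d8f3a23e0b4b28107da6cb626053a90036
```

Applied with kind "commit" to the exact bytes stored in the commit object (including the gpgsig header), the same function should reproduce the commit hash shown by git log. Any tree that hashes to the same value is covered by the same signature, whatever files it contains.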


This shows that a weakness in the SHA-1-based data model of Git cannot be solved with signed commits. I think the only possible short-term solution for this issue is to upgrade to Git version 2.13 or later. As stated in the GitHub Blog, the long-term solution is probably to migrate to another hash function for the data model of Git.
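
As a rough aid for checking machines, the following Python sketch compares the output of git --version against 2.13. The function names are made up for this example, and the check is only a heuristic: a distribution may backport the hardened SHA-1 to an older version, and a newer Git may have been built with a different SHA-1 backend.

```python
import re
from typing import Optional, Tuple


def parse_git_version(output: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from the output of `git --version`."""
    match = re.search(r"git version (\d+)\.(\d+)", output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def has_hardened_sha1_by_default(output: str) -> bool:
    """Heuristic: Git 2.13 and later use the hardened SHA-1 by default."""
    version = parse_git_version(output)
    return version is not None and version >= (2, 13)


# Sample strings; on a real machine, pass in the output of `git --version`.
for sample in ("git version 2.7.4", "git version 1.8.3.1", "git version 2.13.0"):
    print(sample, "->", has_hardened_sha1_by_default(sample))
```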