A Guide to Setting Up Git, Gitosis, and Gitweb
Posted by HokieTux on March 3, 2009 in Guides
I have recently migrated to using Git as my (D)SCM-of-choice, and so far, I am extremely happy with it. I am hosting public and private Git trees on this server, and using it as the ‘central node’ for most of my coding and configs. I am using Gitweb for the web interface and Gitosis to manage secure commits from multiple users and machines. While getting Git set up was a piece of cake, getting everything else up and running wasn’t quite so smooth. This is a guide for getting the whole system up and running quickly and easily so you can get back to hacking =)
The meat of this post is about Gitweb, for which there seem to be far fewer useful guides and docs out there (relative to the number for Git and Gitosis). This is also the part that I found the most difficult and time-consuming. Hopefully, with this guide, it’ll be a much smoother ride for you.
Why I Use Git
There are a metric tonne of sites about why you should use Git, so I won’t try to convince you. I will, briefly, however, go over my reasons for choosing it:
- People and projects I am working with are using it – This is extremely important, especially if you plan on making any sort of contributions to these projects. As a computer engineer, I have a significant amount of interaction with Linux kernel trees (like the one over at Xilinx, for example), and nearly all such projects are using Git as their SCM.
- It works with projects that aren’t using it – The only projects I work with that aren’t using Git use Subversion. Thankfully, the ‘git svn’ tools work beautifully, and allow me to use Git to interact with svn repos. This is very significant, as it allows me to further simplify my workflow into one common thread.
- Git is quickly becoming a valuable skill – Knowing SCMs has been, and always will be, a valuable skill to employers. The situation used to be that knowing CVS was a major selling point. Then it moved to Subversion. People are starting to realize the benefits of distributed SCMs, and focus is transitioning to DSCMs. Of these, Git is at the forefront in terms of user base and company adoption.
- It is small & fast – I don’t have infinite bandwidth, and I have a lot of interaction with code repositories. Being able to clone Git trees and commit new code without transmitting huge amounts of data is important to me.
- Third party tools and community – It is almost undeniable that of all the DSCMs out there (Bazaar, Mercurial, darcs, svk, and Git), Git has the largest user base and most active community. As a result, development for third party tools and web utilities progresses faster, and there are more options.
Disclaimer: I admit that some of these reasons might sound a bit like ‘jumping on the bandwagon’. But hey, such is life. Git is a very successful SCM for a number of reasons. If you prefer another SCM, I encourage you to post your reasons as a comment to this post. Telling me I’m stupid for joining the Git party is silly and I will ignore you / delete your comment. Different SCMs work for different people. I encourage you to pick the one that is best for you.
Installing Git
This should be straightforward. Every package manager out there should have Git in its stable repos. If not, you might want to double-check that your distro isn’t defunct.
Setting Up Gitosis (on non-standard SSH ports!)
To keep the whole set-up secure and traceable, I decided that I wanted to make all commits go through SSH. Without Gitosis, this would cause headaches because you would need to create an SSH account for every user committing to your Git tree, even if you didn’t want that user to have a real shell account on your server. Gitosis allows multiple users to commit to Git trees, and manages all of them with a single user account; all it requires is an SSH key for each user. The whole system is extremely clever.
The Gitosis installation documentation is excellent, and very easy to follow for the most part. Additionally, there is an excellent write-up on getting Gitosis set-up and running, found on the scie.nti.st blog. This post has actually been mirrored & copied in a number of locations across the web. Instead of doing that, I will direct you to the original post, and merely add some notes about the installation. I recommend reading my notes below, and then going through the blog post.
The blog post can be found at http://scie.nti.st/2007/11/14/hosting-git-repositories-the-easy-and-secure-way. Note that the Gitosis devs also provide an example Gitosis conf.
First, some notes about my system setup:
- I want to use port 2222 for SSH.
- The base directory for my ‘Git Server’ is ‘/srv/git‘
- I am running ArchLinux <3
Now then, for my notes regarding the installation:
- You obviously need to have SSH running on your machine. If you haven’t done this yet, there are many articles out there on how to get this up and going.
- You need to have Python SetupTools installed on your system. This comes standard on some systems. On more minimalist distributions, it likely does not. It should be easily found with your package manager if it isn’t. Otherwise, you can always grab it from the site (linked above).
- I had to modify the step to add the ‘git’ user to the system. The command I used was:
sudo useradd --system --shell /bin/sh --comment 'git version control' --user-group --home-dir /srv/git/ git
- The step that tells you to generate an SSH public key is for your user, not the ‘git’ user you just created! If this seems confusing to you, then you likely don’t understand exactly what Gitosis does. Gitosis keeps an SSH public key for each user that will be committing to your server. When someone tries to push, SSH checks the key offered by their machine against the keys Gitosis has on file. If one matches, Gitosis treats the connection as the user that key belongs to and applies that user’s permissions, even though that user account doesn’t necessarily exist on the Git server machine.
- Due to my non-standard SSH port, when I want to clone something (like the gitosis-admin tree, for example), I do:
git clone ssh://git@hokietux.net:2222/gitosis-admin.git
- Remember that if you try to inspect the Gitosis configuration file inside of the Git tree after pushing it (not in your cloned copy of the tree, but in the tree itself; for me, that would be /srv/git/repositories/gitosis-admin.git/gitosis.conf), you will not see any changes made. This is normal.
- Despite the previous note, the SSH keys you add to Gitosis should show up in the Git tree. If, after adding and pushing a new SSH key to the gitosis-admin tree, you want to double check you did it correctly, head to /srv/git/repositories/gitosis-admin.git/gitosis-export/keydir. You should see all of your SSH keys there. Note: Every key must end in a .pub extension! If things aren’t working, double check this!
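The .pub check from the last note can be automated with a few lines of shell. This is only a minimal sketch: check_keydir is a hypothetical helper of mine, not something Gitosis ships, and the keydir path assumes the same layout as my server.

```shell
#!/bin/sh
# Hypothetical helper (not part of Gitosis): print the name of every file
# in a keydir that does not end in .pub.
check_keydir() {
    keydir=$1
    for key in "$keydir"/*; do
        # Skip the unexpanded glob when the directory is empty or missing.
        [ -e "$key" ] || continue
        case $key in
            *.pub) ;;                         # correctly named, nothing to report
            *) printf '%s\n' "${key##*/}" ;;  # report the offending file name
        esac
    done
}

# Example call, using the keydir path from my setup:
check_keydir /srv/git/repositories/gitosis-admin.git/gitosis-export/keydir
```

If it prints nothing, every key file has the extension Gitosis expects.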
You should now have Gitosis fully up and running, and be able to clone & push from remote machines. Double check this before heading on to Gitweb!
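If typing the port into every URL gets old, an entry in ~/.ssh/config can carry it for you. The sketch below just prints such an entry so you can review it before adding it by hand. The function name and the ‘gitserver’ alias are made up for illustration; Host, HostName, Port, and User are standard ssh_config keywords.

```shell
#!/bin/sh
# Print an ssh_config stanza so the port no longer has to appear in Git URLs.
# The function name and the "gitserver" alias are hypothetical.
print_ssh_host_entry() {
    alias_name=$1 host=$2 port=$3 user=$4
    printf 'Host %s\n' "$alias_name"
    printf '    HostName %s\n' "$host"
    printf '    Port %s\n' "$port"
    printf '    User %s\n' "$user"
}

# Review the output, then append it to ~/.ssh/config yourself.
print_ssh_host_entry gitserver hokietux.net 2222 git
```

With that entry in place, something like git ls-remote gitserver:gitosis-admin.git should be a quick way to confirm that your key is accepted, assuming your Gitosis install follows the layout described above.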
Setting Up Gitweb
Installing and configuring Gitweb is actually quite simple. Unfortunately, the Gitweb INSTALL and README docs are horribly convoluted and hard to follow. I must have read them both three times, and still had no idea how to properly configure Gitweb. I will provide my Gitweb configuration file and explain the parts that perhaps are not quite so obvious.
First, some information about my system so that this all makes sense:
- My Apache root is /srv/http
- My cgi-bin directory is /srv/http/cgi-bin
- My Git trees are in /srv/git/repositories
- My Gitweb files are in /srv/http/cgi-bin/gitweb
Notice that my Git trees are located in a different directory than my Apache root. It took me a while to work out the configuration settings that would make this work correctly, but I finally got it up and running. The configurations provided below support this system layout.
Before you get started, you need to have Apache installed and running. If you don’t already have Apache up and running, there are plenty of other guides and wiki pages about that process. If you are using Arch, the LAMP ArchWiki page is fantastic. The remainder of this guide will assume that Apache is up and running properly.
Now then, you will need the Gitweb files. Gitweb is actually distributed with most modern ‘Git’ packages (i.e. if you have installed Git via your package manager, you likely have Gitweb as well). Gitweb is really just a collection of files:
daedalus hokietux /usr/share/gitweb 980 $ ls
INSTALL  README  git-favicon.png  gitweb.perl*  git-logo.png  gitweb.cgi*  gitweb.css
If you think about it, it makes sense. You are really just rendering a web page with dynamic content. Anyways, it should be somewhere on your system. Most systems seem to place it in /usr/share/gitweb or /usr/share/git/gitweb. If it isn’t there, you can probably locate it with a $ locate gitweb.cgi (assuming, of course, that you have done $ sudo updatedb recently).
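If you would rather not rely on locate, a short loop over the usual suspects does the job. This is a rough sketch; find_gitweb is a hypothetical helper, and the two directories are simply the ones mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: print the first directory that contains gitweb.cgi.
find_gitweb() {
    for dir in "$@"; do
        if [ -f "$dir/gitweb.cgi" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Try the two common locations; print a hint if neither matches.
find_gitweb /usr/share/gitweb /usr/share/git/gitweb ||
    echo "gitweb.cgi not found in the usual places; try locate" >&2
```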
Copy the Gitweb files to wherever you are hosting them out of and give ownership to your Apache user:
daedalus hokietux ~ 981 $ sudo cp -R /usr/share/gitweb /srv/http/cgi-bin/
daedalus hokietux ~ 982 $ sudo chown -R apache:apache /srv/http/cgi-bin/gitweb
Now then, we need to configure your Gitweb conf file, which should be in /etc/gitweb.conf. My most recent gitweb.conf is always stored in my public Git configs tree; the gitweb.conf is here. The comments in the file are short and sweet. Below is a copy of the file with comments fleshed out in more detail in case you need extra explanation:
# Gitweb configuration file
# Ben Hilburn, hokietux.net
#
# Location of the git binary
$GIT = "/usr/bin/git";
# Project root for gitweb. This is the parent directory for all
# of your Git trees. As an example, 'gitosis-admin.git' should reside
# in this directory.
$projectroot = "/srv/git/repositories";
# Web display files. These are URLs as the browser requests them, not
# filesystem paths; with a leading slash they resolve from the root of
# the site that serves gitweb.cgi. If all three of these files are located
# in the same directory as gitweb.cgi (/srv/http/cgi-bin/gitweb, in my case)
# and that directory is the site's document root, then the below settings
# should work fine. Remember that if they are in a different directory, you
# will need to give your Apache user/group read access to them!
$stylesheet = "/gitweb.css";
$logo = "/git-logo.png";
$favicon = "/git-favicon.png";
# Site name
$site_name = "HokieTux's Git Trees";
# URL formatting. You can use this to make pretty URLs if you like. I am
# doing this using Apache rewrite rules (covered later in this guide), and
# so am not using these settings.
#$my_uri = "http://git.hokietux.net/";
#$home_link = $my_uri;
# Base URL for project trees. This is used to prefix each of the Git trees
# on the webpages. So in my case, if you were viewing a Git tree called
# 'foo.git', the webpage would tell you that the tree was located at:
# 'ssh://git@hokietux.net:2222/foo.git'. Note that escaping the '@'
# character is necessary to render the URL properly.
@git_base_url_list = ("ssh://git\@hokietux.net:2222");
# Length of the project description column in the webpage.
$projects_list_description_width = 50;
# Only export repos we are allowing to be publically cloned. What this setting
# actually says is that if the given file _exists_ in the git repository, then
# the tree can be exported to the web. So, for example, the file:
# /srv/git/repositories/configs.git/git-daemon-export-ok file exists, so
# configs.git will be exported via Gitweb. This file can be created with a
# simple '$ touch git-daemon-export-ok'. I am using this filename as it doubles
# for the same use with the Git export daemon. If this setting does not exist,
# then all trees will be exported by default. Note that there ARE other methods
# for controlling which trees get exported. This is just the one I prefer.
$export_ok = "git-daemon-export-ok";
# Enable PATH_INFO so the server can produce URLs of the
# form: http://git.hokietux.net/project.git/xxx/xxx
# This allows for pretty URLs *within* the Git repository, where
# my Apache rewrite rules are not active.
$feature{'pathinfo'}{'default'} = [1];
# Enable blame, pickaxe search, snapshot, search, and grep
# support, but still allow individual projects to turn them off.
# These are features that users can use to interact with your Git trees. They
# consume some CPU whenever a user uses them, so you can turn them off if you
# need to. Note that the 'override' option means that you can override the
# setting on a per-repository basis.
$feature{'blame'}{'default'} = [1];
$feature{'blame'}{'override'} = [1];
$feature{'pickaxe'}{'default'} = [1];
$feature{'pickaxe'}{'override'} = [1];
$feature{'snapshot'}{'default'} = [1];
$feature{'snapshot'}{'override'} = [1];
$feature{'search'}{'default'} = [1];
$feature{'grep'}{'default'} = [1];
$feature{'grep'}{'override'} = [1];
Note that the $export_ok setting is very important! If you have Gitweb up and running, and can’t see any trees, double-check this setting! Comment it out to make sure that you haven’t accidentally told Gitweb not to export your trees!
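When no trees show up, it helps to see which repositories actually carry the marker file. Here is a minimal sketch under a couple of assumptions: list_exported is a hypothetical helper, and it only looks at *.git directories directly under the project root (which is how Gitosis lays things out on my server), whereas Gitweb itself may look deeper.

```shell
#!/bin/sh
# Hypothetical helper: list the repositories directly under a project root
# that contain the marker file named by $export_ok in gitweb.conf.
list_exported() {
    projectroot=$1
    marker=${2:-git-daemon-export-ok}
    for repo in "$projectroot"/*.git; do
        [ -f "$repo/$marker" ] && printf '%s\n' "${repo##*/}"
    done
    return 0
}

# Uses the project root from my gitweb.conf.
list_exported /srv/git/repositories
```

Any tree missing from the output can be published by touching git-daemon-export-ok inside it, as described in the config comments.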
Gitweb, DNS, and URL Resolution
Many people told me that I shouldn’t have to make a separate DNS entry for my prefixed site (i.e. http://git.hokietux.net versus http://hokietux.net), but I found that not to be true for my set-up. I am using Slicehost for my hosting, and just as I need an entry for ‘www’ (i.e. http://www.hokietux.net versus http://hokietux.net), I needed an entry for http://git.hokietux.net. Now, I know next to nothing about DNS and how different hosting solutions handle DNS entries. You may or may not need to do anything to make a URL like git.yourdomain.com resolve; if you are having trouble though, make sure you look into this.
Now that Gitweb is set-up and configured, it’s time to tell Apache to host it! Hold onto your helmets, and let’s get to it…
Setting Up Gitweb and Apache
This was the hardest part of this whole system for me. Mostly because I don’t know Apache that well, but then again, most people don’t. I will go over two ways to get it up and running:
- Apache Name-Based Virtual Hosts – The ideal solution. If you have root access to your server and can modify your Apache confs, this is likely the route you want to take. Unless you want the guaranteed easy-way-out. Which would be…
- The .htaccess File – Requires a lookup each time someone navigates to your website, but it is easier to set-up and is the only option for people without root access to their servers.
I went the Virtual Hosts route, and while it took some time, some pounding of keyboards, and some questions in #git, it is up and running and I am a happy camper. Remember that you only need to do one of the two options above! Doing both will likely anger Apache, and then you will be really screwed.
Setting Up Gitweb with Apache Name-Based Virtual Hosts
There are a couple of different places you can define vhosts for Apache. You can do it either in the Apache configuration itself (for me, that’s /etc/httpd/conf/httpd.conf), or in a supplementary configuration file which is then included from the master configuration. My system uses the latter; the supplementary configuration for Apache on my system is /etc/httpd/conf/extra/httpd-vhosts.conf. This method is cleaner, and I find it easier to use. Just make sure you are including the supplementary conf with something like this in your master configuration file (httpd.conf):
# Virtual hosts
Include conf/extra/httpd-vhosts.conf
Now then, if you are going to use virtual hosts, you want to be careful that you aren’t telling Apache to serve a directory in two different places. As an example, if you are serving your Apache root in your Apache conf with something like this:
# This should be changed to whatever you set DocumentRoot to.
<Directory "/srv/http">
Options Indexes FollowSymLinks
AllowOverride None
Order allow,deny
Allow from all
</Directory>
… you will want to remove that before you set up your virtual hosts. My httpd.conf file contains global configurations and settings, but does not specify which particular directories to serve. Before we get to configuring your virtual hosts, however, you do need to make sure that some global settings are configured properly. Specifically, settings pertaining to cgi scripts. Make sure something like this appears in your httpd.conf (comments provided by the Arch devs =) ):
<IfModule alias_module>
#
# ScriptAlias: This controls which directories contain server scripts.
# ScriptAliases are essentially the same as Aliases, except that
# documents in the target directory are treated as applications and
# run by the server when requested rather than as documents sent to the
# client. The same rules about trailing "/" apply to ScriptAlias
# directives as to Alias.
#
ScriptAlias /cgi-bin/ "/srv/http/cgi-bin/"
</IfModule>
#
# "/srv/http/cgi-bin" should be changed to whatever your ScriptAliased
# CGI directory exists, if you have that configured.
#
<Directory "/srv/http/cgi-bin">
Options Indexes FollowSymlinks ExecCGI
AllowOverride None
Order allow,deny
Allow from all
</Directory>
# Virtual hosts <-- DON'T FORGET THIS!
Include conf/extra/httpd-vhosts.conf
Now then, let’s get httpd-vhosts.conf configured properly. My httpd-vhosts.conf is below, and an explanation follows (comments, again, provided by a friendly Arch dev somewhere):
#
# Virtual Hosts
#
# If you want to maintain multiple domains/hostnames on your
# machine you can setup VirtualHost containers for them. Most configurations
# use only name-based virtual hosts so the server doesn't need to worry about
# IP addresses. This is indicated by the asterisks in the directives below.
#
# Please see the documentation at
# <URL:http://httpd.apache.org/docs/2.2/vhosts/>
# for further details before you try to setup virtual hosts.
#
# You may use the command line option '-S' to verify your virtual host
# configuration.
#
# Use name-based virtual hosting.
#
NameVirtualHost *:80
#
# Almost any Apache directive may go into a VirtualHost container.
# The first VirtualHost section is used for all requests that do not
# match a ServerName or ServerAlias in any <VirtualHost> block.
#
<VirtualHost *:80>
ServerName hokietux.net
DocumentRoot "/srv/http"
<Directory "/srv/http">
Options Indexes FollowSymLinks
AllowOverride None
Order allow,deny
Allow from all
</Directory>
</VirtualHost>
<VirtualHost *:80>
ServerName git.hokietux.net
DocumentRoot "/srv/http/cgi-bin/gitweb"
DirectoryIndex gitweb.cgi
SetEnv GITWEB_CONFIG /etc/gitweb.conf
<Directory "/srv/http/cgi-bin/gitweb">
Options FollowSymlinks ExecCGI
Allow from all
AllowOverride all
Order allow,deny
<Files gitweb.cgi>
SetHandler cgi-script
</Files>
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^.* /gitweb.cgi/$0 [L,PT]
</Directory>
<Directory "/srv/git/repositories">
Allow from all
</Directory>
# To debug rewrite rules, which were very painful to figure out
RewriteLog /var/log/httpd/rewrite_log
RewriteLogLevel 9
ErrorLog /var/log/httpd/gitweb
</VirtualHost>
Notice that the first entry is the default url (hokietux.net). This is important, as all addresses not corresponding to other vhost entries will default to the first one.
The part of the file above pertaining to Gitweb is clearly the second vhost entry. Remember that my Gitweb document root is /srv/http/cgi-bin/gitweb, but my Git trees are in /srv/git/repositories. The rewrite rules you see above make that possible. If you are getting permission denied errors or 404s when you try to access your Git trees via Gitweb, then you might have to tinker with the rewrite rules a bit. If this is the case, then you have my deepest condolences. Also, feel free to post your problems here and I will try to help =)
Now then, restart Apache, cross your fingers, and give it a go! Hopefully, Gitweb comes up and displays all of your Git trees correctly.
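If the restart dies with ‘Invalid command’ errors for SetEnv or RewriteEngine, the modules that provide those directives are probably not loaded. The sketch below reads a module listing on stdin and reports which of the names you pass are absent. missing_modules is a hypothetical helper, and the sample listing is made up; on a real server you would pipe in the output of httpd -M (the binary may be called something else on your distro).

```shell
#!/bin/sh
# Hypothetical helper: read a module listing (as printed by `httpd -M`)
# on stdin and print each required module that does not appear in it.
missing_modules() {
    loaded=$(cat)
    for mod in "$@"; do
        case $loaded in
            *"$mod"*) ;;                # module is listed, nothing to report
            *) printf '%s\n' "$mod" ;;  # module is absent
        esac
    done
}

# Demo with a made-up listing that lacks rewrite_module.
printf ' env_module (shared)\n alias_module (shared)\n' |
    missing_modules env_module rewrite_module

# On a live server:
#   httpd -M 2>&1 | missing_modules env_module rewrite_module
```

Anything it prints needs a matching LoadModule line in your httpd.conf before the vhost above will load.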
Setting Up Gitweb with Apache and an .htaccess File
This is not the route I took, but is the only option for people without root access to their hosting. The example file below was provided by Jeff Mickey, who happens to also be an Arch Dev.
Make sure your .htaccess file won’t be served by Apache with something like this in your global httpd.conf file:
<FilesMatch "^\.ht">
Order allow,deny
Deny from all
Satisfy All
</FilesMatch>
Create an .htaccess file in your Apache document root, and make it look something like this:
## Turn on CGI
Options +ExecCGI
AddHandler cgi-script .cgi
## Turn on rewriting so I can get gitweb.cgi out of the url
RewriteEngine On
RewriteBase /
RewriteRule ^$ gitweb.cgi [L]
## Send requests for files and directories that exist to the correct location
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.*) gitweb.cgi/$1 [QSA,L]
Now then, any URL that looks like git.yourdomain.com should be resolved properly, assuming your Gitweb files are in your cgi-bin directory (i.e. cgi-bin/gitweb); if you aren’t managing your own hosting, your cgi-bin directory is likely specified for you.
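To get a feel for what those rewrite rules do to a request, here is a rough simulation in shell. It is only an illustration of the three cases, not how Apache actually works internally; simulate_rewrite is a hypothetical name, and the path argument is the request path without its leading slash.

```shell
#!/bin/sh
# Rough simulation of the .htaccess rules above (illustration only).
# $1 = document root, $2 = request path without the leading slash.
simulate_rewrite() {
    docroot=$1 path=$2
    if [ -z "$path" ]; then
        echo "gitweb.cgi"              # RewriteRule ^$ gitweb.cgi [L]
    elif [ -f "$docroot/$path" ] || [ -d "$docroot/$path" ]; then
        echo "$path"                   # existing file or directory: left alone
    else
        echo "gitweb.cgi/$path"        # RewriteRule (.*) gitweb.cgi/$1 [QSA,L]
    fi
}

# A request for a path inside a Git tree, using my Gitweb directory.
simulate_rewrite /srv/http/cgi-bin/gitweb "configs.git/tree"
```

Real files such as gitweb.css are served untouched, while everything else is handed to gitweb.cgi as PATH_INFO.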
Conclusion & Further Reading
You should now (hopefully) be totally up and running with secure Git repository hosting and access via Gitosis, as well as public Git tree publishing via Gitweb! If you have any questions or run into problems, please feel free to post here and I will try my best to help! Alternatively, you can check out any of the sites linked below, or hop into #git on Freenode. If you found this post useful, post a comment and let me know =)
- Blog post on setting up Gitosis, re-linked: http://scie.nti.st/2007/11/14/hosting-git-repositories-the-easy-and-secure-way
- Forum thread where several common problems with Gitosis are resolved: http://forum.webfaction.com/viewtopic.php?id=2321
- Example Gitweb configuration: http://ianloic.com/2007/09/13/how_i_set_up_gitweb/
- Alternate vhost conf for Gitweb: http://www.philsergi.com/2008/04/gitweb-apache-gentoo.html
4 Comments on A Guide to Setting Up Git, Gitosis, and Gitweb
Thanks for this article. I agree that the gitweb doc out there is not very clear. It has been frustrating to get gitweb going. I also used http://scie.nti.st/2007/11/14/hosting-git-repositories-the-easy-and-secure-way to get gitosis installed and the ssh access – this is working well.
I too know very little about apache. I am running Apache2 on a SUSE 11.1 distro. After following your gitweb instructions, the first problem I run into when I try to restart apache2 is:
>>>
# /etc/init.d/apache2 reload
Reload httpd2 (graceful restart)
Syntax error on line 130 of /etc/apache2/vhosts.d/git-vhost.conf:
Invalid command ‘SetEnv’, perhaps misspelled or defined by a module not included in the server configuration
The command line was:
/usr/sbin/httpd2-prefork -f /etc/apache2/httpd.conf
unused
<<>>
Invalid command ‘RewriteEngine’, perhaps misspelled or defined by a module not included in the server configuration
<<<
Any help will be greatly appreciated.
Joe
Hey Joe -
Sorry about the delay in getting back to you! I have been traveling for the last few days.
It looks like you are missing a couple of Apache modules. Make sure, in your httpd.conf, you are including the modules:
mod_env
mod_rewrite
Those modules provide the commands you are missing.
Let me know if this helps!
Thanks for the great tutorial! Turns out that some of my repositories didn’t show up because they didn’t have the proper rights, and doing a ‘chmod 775 repository_dir’ fixed it.
Thijs -
I’m glad it helped you! Thanks for the tip on the repo rights