Cloning large repositories can put considerable load on the server. Depending on its specs, the server may run out of RAM or see its CPU load rise sharply. In my case, the limiting factor is the CPU. Too much load can even cause fatal errors that make it impossible to clone a repository from scratch.
One possible solution is to use bundles. Git can pack a given revision, together with its history, into a single archive file. The client downloads the bundle and sets up a local clone from it. The Git documentation describes how this works. The server then only has to serve the bundle as a static file, which causes almost no load. Once the client has set up the clone from the bundle, subsequent pull or fetch requests put far less load on the server, because it only needs to handle the difference between the revision stored in the bundle and the revision being fetched.
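The client-side steps described above can be sketched as a small shell helper. This is only a minimal sketch: the function name clone_from_bundle, its argument order, and the example URL in the comment are hypothetical and not taken from any particular setup. It assumes the bundle has already been downloaded and was created from a single branch. In that case the bundle carries no HEAD, so the branch is passed to git clone with -b.

```shell
# clone_from_bundle BUNDLE_FILE UPSTREAM_URL BRANCH TARGET_DIR
# Hypothetical helper. The bundle is assumed to be on disk already, e.g. after
#   wget https://git.example.org/project.git/clone.bundle   (placeholder URL)
clone_from_bundle() {
    bundle=$1
    upstream=$2
    branch=$3
    target=$4

    # Clone from the local bundle file. -b names the bundled branch explicitly,
    # because a bundle created from a branch name alone contains no HEAD.
    git clone -q -b "$branch" "$bundle" "$target" || return 1

    # Point origin at the real server instead of the bundle file.
    git -C "$target" remote set-url origin "$upstream" || return 1

    # From here on, the server only has to send what is newer than the bundle.
    git -C "$target" fetch -q origin
}
```

After the helper has run, the target directory behaves like a regular clone: git pull and git fetch talk to the server, and the bundle file can be deleted.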
The Linux kernel project uses bundles on its Git hosting servers and recommends downloading the bundle directly with wget if you have connection problems. The repo tool, which manages the various Git repositories of Android-based operating systems, by default even expects a bundle named clone.bundle to be present in every repository on the server during the initial sync. It automatically fetches the bundles and uses them to set up the individual Git repositories.
Creating bundles on the server
Bundles are easily created inside a Git repository with the command git bundle create clone.bundle $REVISION, where $REVISION can be a branch or a tag. If you have many Git repositories and all of them are in the same directory, running the following command in the parent directory creates a bundle in each of them:
for i in *.git; do ( echo "$i"; cd "$i" && git bundle create clone.bundle "$REVISION" ); done
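Before publishing the bundles, it may be worth checking that every repository actually has a valid one. The following is a rough sketch using git bundle verify. The function name verify_bundles and its output messages are made up for illustration, and it assumes the same layout as above, with all *.git directories in one parent directory.

```shell
# verify_bundles PARENT_DIR
# Hypothetical helper: reports repositories whose clone.bundle is missing or
# fails "git bundle verify". Returns non-zero if any problem was found.
verify_bundles() {
    status=0
    for repo in "$1"/*.git; do
        if [ ! -f "$repo/clone.bundle" ]; then
            echo "missing: $repo"
            status=1
            continue
        fi
        # verify checks that the bundle file is well-formed and that the
        # repository contains any commits the bundle depends on.
        if ! git -C "$repo" bundle verify clone.bundle >/dev/null 2>&1; then
            echo "invalid: $repo"
            status=1
        fi
    done
    return "$status"
}
```

Run with the parent directory as its only argument, for example verify_bundles /var/lib/gitolite3/repositories, it prints nothing when all bundles are in order.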
Making the bundles accessible
If you do your own Git hosting, you probably have a web server like Apache running, with software such as cgit serving as the Git web frontend behind it. Since I use Gitolite to manage access to my repositories, all repositories reside in the directory /var/lib/gitolite3/repositories.
First, Apache needs to be told where it can find the bundles:
AliasMatch ^/(.*)\.git/clone\.bundle$ /var/lib/gitolite3/repositories/$1.git/clone.bundle
AliasMatch ^/(.*)/clone\.bundle$ /var/lib/gitolite3/repositories/$1.git/clone.bundle
These directives ensure that Apache finds the corresponding *.git directory whether or not the URL contains the .git suffix.
Then clients need to be allowed to access the bundles in the Git repositories:
<Directory /var/lib/gitolite3/repositories/>
    Require all denied
    <FilesMatch "^clone\.bundle$">
        Require all granted
    </FilesMatch>
</Directory>
This ensures that only files named clone.bundle are accessible.