How to Use "Sparse Checkout" to Manage Large Git Repositories

If you have ever worked with a repository containing thousands of files, you are probably familiar with the frustration of waiting for basic commands like `git status` or `git checkout` to complete. Commands that run in milliseconds in a small repository can be surprisingly slow (and test your patience) in a large one.

This frustration is amplified when you're focusing on a small part of the codebase and don't need to interact with 99% of the files.

You may be wondering: "Why keep track of everything if I am only working on a small part of the project? Wouldn't it be great if I could tell Git to keep track of just the couple of directories I'm working on?"

Well, good news! The sparse-checkout command does exactly that! ✌️

It enables you to work with only a specific set of files from a repository, rather than the entire repository. This allows you to benefit from a monorepo while ensuring efficient Git performance — the best of both worlds!

Introduced in Git 2.25.0, this command was designed with large Git repositories in mind, particularly monorepos. Similar functionality was previously available through the core.sparseCheckout config option combined with a hand-edited .git/info/sparse-checkout file, but the new command greatly streamlines the process.

Let's see how it works!

Tip

Performance in Git: The Complete Guide

For more tips on how to improve performance in Git, check out our complete guide!

Getting Started with `sparse-checkout`

Initiating sparse-checkout is quite simple. To set it up, run the following command:

$ git sparse-checkout init

This command configures the necessary settings, telling Git that you will specify which parts of the project you want to work with.

Now you just need to specify which directories you require! You can do so by typing the following:

$ git sparse-checkout set <path1> <path2> ...

This command updates your working directory so that it contains only the paths you specified; everything else is removed from disk but remains tracked by Git. For example, if your focus is on an Electron app located within the client directory, you can use a command like the following to work with it:

$ git sparse-checkout set client/electron
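To see the effect end to end, here is a minimal sketch that builds a throwaway repository and then narrows it to the Electron app. The directory layout, file names, and the "Demo" commit identity are made up for illustration; only the `git sparse-checkout` calls come from the steps above.

```shell
set -e

# Throwaway repository with a hypothetical layout (all paths are invented).
repo="$(mktemp -d)"
cd "$repo"
git init -q
mkdir -p client/electron client/web server
echo "app" > client/electron/main.js
echo "web" > client/web/index.html
echo "api" > server/api.py
echo "docs" > README.md
git add .
git -c user.name="Demo" -c user.email="demo@example.com" commit -q -m "Initial layout"

# Enable sparse-checkout and keep only the Electron app.
git sparse-checkout init
git sparse-checkout set client/electron

# Show the files left in the working directory (skipping .git itself).
find . -path ./.git -prune -o -type f -print
```

After this runs, client/electron/main.js should still be on disk, while server and client/web are gone. Whether the root-level README.md is also kept depends on whether your Git version uses cone mode by default.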

If you wish to add more paths later, you can use the add subcommand. To view the current settings, you can rely on the list subcommand, as illustrated below:

$ git sparse-checkout add <new_path>
$ git sparse-checkout list

If you encounter any issues or wish to revert to having the full working directory available, you can disable sparse-checkout by entering the following:

$ git sparse-checkout disable
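The `add`, `list`, and `disable` subcommands can be tried together on a scratch repository. This is only a sketch: the layout and the "Demo" identity are hypothetical, and it assumes a Git version recent enough to have the `add` subcommand.

```shell
set -e

# Scratch repository with an invented layout.
repo="$(mktemp -d)"
cd "$repo"
git init -q
mkdir -p client/electron client/web server
echo "app" > client/electron/main.js
echo "web" > client/web/index.html
echo "api" > server/api.py
echo "docs" > README.md
git add .
git -c user.name="Demo" -c user.email="demo@example.com" commit -q -m "Initial layout"

# Start with a single directory.
git sparse-checkout init --cone
git sparse-checkout set client/electron

# Add a second directory, then print the active selection.
git sparse-checkout add server
paths="$(git sparse-checkout list)"
echo "$paths"

# Restore the full working directory.
git sparse-checkout disable
ls client
```

The list should name both client/electron and server, and after `disable` the previously hidden client/web directory is back on disk.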

Better Performance with Cone Mode

For more efficient sparse checkouts, especially in really large repositories, you can leverage Cone Mode, which arrived together with the sparse-checkout command in Git 2.25.0. The name makes a lot of sense: it selects a cone-shaped subset of the repository tree, consisting of the directories you specify plus the files that sit directly inside each of their parent directories.

Cone mode automatically includes the files directly inside each parent directory of the paths you specify (including the repository root). It accepts only directory paths, not individual files or arbitrary patterns, and this restriction is what makes it much faster than the general pattern matching used by non-cone sparse checkouts.

This way, Git can quickly determine whether a path should be included without evaluating it against a long list of gitignore-style patterns, as it only needs to check whether the path is within the "cone" of specified directories.

Cone mode has been the default since Git 2.37.0. On older versions, or simply to be explicit, add the --cone flag when initializing sparse-checkout:

$ git sparse-checkout init --cone

You can then proceed to add directories as detailed above.
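If you are curious what cone mode records internally, the following sketch prints the relevant setting and the generated pattern file for a scratch repository. The layout and the "Demo" identity are hypothetical, and the exact file contents may vary between Git versions.

```shell
set -e

# Scratch repository with an invented layout.
repo="$(mktemp -d)"
cd "$repo"
git init -q
mkdir -p client/electron client/web server
echo "app" > client/electron/main.js
echo "web" > client/web/index.html
echo "api" > server/api.py
echo "docs" > README.md
git add .
git -c user.name="Demo" -c user.email="demo@example.com" commit -q -m "Initial layout"

git sparse-checkout init --cone
git sparse-checkout set client/electron

# Cone mode is recorded as a config setting...
git config --get core.sparseCheckoutCone

# ...and the selection is stored as a small, restricted set of patterns.
patterns_file="$(git rev-parse --git-path info/sparse-checkout)"
cat "$patterns_file"
```

On the versions I have seen, the file contains entries such as `/*`, `!/*/`, `/client/`, `!/client/*/`, and `/client/electron/`: root files, files directly in client, and everything under client/electron. Because only these directory-shaped patterns are allowed, Git can decide membership with simple prefix checks.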

What About Disk Space?

You may be surprised to learn that Sparse Checkout does not shrink the repository data itself. All objects are still downloaded and stored in the local .git directory; only the working directory gets smaller. Its main purpose is to reduce the number of files Git needs to scan for status updates.

To save disk space, you should consider Partial Cloning, which allows you to clone a repository without downloading all of its objects.
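A quick way to convince yourself of this is to read an excluded file straight from the object database. The sketch below uses an invented layout and a placeholder "Demo" identity.

```shell
set -e

# Scratch repository with an invented layout.
repo="$(mktemp -d)"
cd "$repo"
git init -q
mkdir -p client/electron server
echo "app" > client/electron/main.js
echo "api" > server/api.py
echo "docs" > README.md
git add .
git -c user.name="Demo" -c user.email="demo@example.com" commit -q -m "Initial layout"

git sparse-checkout init --cone
git sparse-checkout set client/electron

# The server directory is no longer in the working directory...
ls

# ...but its content is still stored locally and can be read from .git.
git cat-file -p HEAD:server/api.py

# Entries marked "S" are tracked but skipped in the working directory.
git ls-files -t | grep '^S'
```

The file's content is printed even though server/api.py does not exist on disk, which shows that the object was never removed from local storage.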

Here's how you can perform a partial clone:

$ git clone --filter=blob:none --sparse <repository-url>

This command significantly reduces both initial download size and local storage requirements. Initially, it downloads only the commit and tree objects, plus the file contents (blobs) needed for the sparse initial checkout. It then fetches each remaining file's contents on demand, as you work.

You can then run the sparse-checkout commands mentioned earlier. They work really well in combination!
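The combination can be tried locally without a hosting service. In this sketch, a scratch repository plays the role of the remote; the layout and the "Demo" identity are hypothetical. The two `uploadpack.*` settings are only needed because a plain local repository does not accept filtered or on-demand requests by default, whereas I am assuming a typical hosting service already does.

```shell
set -e

work="$(mktemp -d)"

# A scratch "remote" with an invented layout.
git init -q "$work/origin"
cd "$work/origin"
mkdir -p client/electron server
echo "app" > client/electron/main.js
echo "api" > server/api.py
echo "docs" > README.md
git add .
git -c user.name="Demo" -c user.email="demo@example.com" commit -q -m "Initial layout"
git config uploadpack.allowFilter true
git config uploadpack.allowAnySHA1InWant true

# Partial, sparse clone: no file contents beyond the initial checkout.
cd "$work"
git clone -q --filter=blob:none --sparse "file://$work/origin" partial
cd partial

# Widen the selection; the needed blobs are fetched on demand.
git sparse-checkout set client/electron

# The clone remembers that its remote can supply missing objects.
git config remote.origin.promisor
git config remote.origin.partialclonefilter

# Count objects that have not been downloaded yet (lines starting with "?").
missing="$(git rev-list --objects --missing=print HEAD | grep -c '^?')"
echo "missing objects: $missing"
```

The contents of server/api.py should show up among the missing objects: Git knows the file exists but has not downloaded it, and will do so only if you later add the server directory to your selection.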

Learn More

How to Improve Performance in Git: The Complete Guide
