Keeping file sizes under control in a Git repository keeps clones, fetches, and other everyday operations fast. Occasionally, large files are committed by mistake, and because Git stores every version of every file, deleting them in a later commit does not shrink the repository. Below, we discuss several methods to delete large files from the commit history so that the repository stays lean and manageable.
Prerequisites
Before removing large files from your commit history, you should have a basic understanding of Git commands, a Git client installed, and a backup of your repository to prevent any data loss.
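As a minimal sketch of the backup step, a mirror clone copies every branch and tag, not just the checked-out branch. The function name `backup_repo` and both arguments are hypothetical, chosen only for illustration.

```shell
# Hypothetical helper: make a full mirror backup of a repository before rewriting it.
# Usage: backup_repo <path-or-url-of-repo> <backup-directory>
backup_repo() {
  src="$1"
  dest="$2"
  # --mirror copies all refs (branches, tags, notes) into a bare repository
  git clone --quiet --mirror "$src" "$dest"
}
```

For example, `backup_repo . ../my-repo-backup.git` would store the backup next to the working copy. If the rewrite goes wrong, the history can be recovered by cloning from that directory.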
Using Git Filter-Repo
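git-filter-repo is a versatile history-rewriting tool that the Git documentation recommends in place of the older git-filter-branch. It is generally faster and simpler to use than git-filter-branch. By default it expects to run in a fresh clone and refuses to proceed otherwise unless you pass --force.
# Install git-filter-repo if necessary (Debian/Ubuntu shown; pip install git-filter-repo also works)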
sudo apt-get install git-filter-repo
# Remove file from the entire history
git filter-repo --invert-paths --path FILE-TO-REMOVE
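To decide what to put in place of FILE-TO-REMOVE, it can help to list the largest blobs stored anywhere in history first. The following is a sketch using only standard Git plumbing commands. The function name `largest_blobs` is hypothetical.

```shell
# Hypothetical helper: list the largest blobs in the whole history, biggest first.
# Output format: "<size in bytes> <path the blob was committed under>"
# Usage: largest_blobs [how-many]   (defaults to 10)
largest_blobs() {
  count="${1:-10}"
  # rev-list prints "<object-id> <path>" for every object reachable from any ref;
  # cat-file adds type and size, and %(rest) carries the path through.
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    sed -n 's/^blob //p' |
    sort -n -r |
    head -n "$count"
}
```

Running `largest_blobs 5` inside the repository prints the five largest blobs, which are the usual candidates for removal.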
Using Git Filter-Branch
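The git-filter-branch command rewrites branches and can remove unwanted files from a repository’s commit history. Git’s own documentation now discourages it in favor of git-filter-repo because it is slow and easy to misuse, so treat it as a fallback for environments where no other tool is available. It keeps the original refs under refs/original/ as a backup, which means the old objects stay in the repository until those refs are deleted.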
# Remove file from the entire history
git filter-branch --force --index-filter \
'git rm --cached --ignore-unmatch FILE-TO-REMOVE' \
--prune-empty --tag-name-filter cat -- --all
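After a rewrite, one way to confirm that the file is gone is to ask Git whether any commit on a branch or tag still touches the path. This is a sketch with a hypothetical function name, `path_in_history`. It deliberately checks only branches and tags, so the backup refs under refs/original/ do not cause a false positive.

```shell
# Hypothetical helper: exit 0 if the given path still appears in any commit
# reachable from a branch or tag, exit 1 otherwise.
# Usage: path_in_history <path>
path_in_history() {
  # List the commits that added, changed, or deleted the path
  commits=$(git log --branches --tags --format=%H -- "$1")
  test -n "$commits"
}
```

A typical check would be `if path_in_history FILE-TO-REMOVE; then echo "still in history"; fi`.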
Using BFG Repo-Cleaner
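The BFG Repo-Cleaner is a tool designed for cleaning up Git repositories, and it is especially good at removing large files from the commit history. It matches files by name rather than by full path, is normally run against a bare mirror clone (my-repo.git below), and by default leaves the contents of the latest commit untouched, so delete the file in a regular commit first.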
# Requires Java to be installed
# Download the latest jar of BFG Repo-Cleaner
java -jar bfg.jar --delete-files FILE-TO-REMOVE my-repo.git
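BFG rewrites the commits but does not delete the old objects itself. Its documentation suggests expiring the reflog and running garbage collection afterwards. The sketch below wraps those two commands in a hypothetical function, `purge_unreachable`, that takes the repository path as its argument.

```shell
# Hypothetical helper: physically drop objects that are no longer reachable.
# Usage: purge_unreachable <path-to-repo>   (defaults to the current directory)
purge_unreachable() {
  repo="${1:-.}"
  # Forget reflog entries so they no longer keep the old commits alive,
  # then repack and prune everything that has become unreachable.
  git -C "$repo" reflog expire --expire=now --all &&
    git -C "$repo" gc --quiet --prune=now --aggressive
}
```

For the example above this would be `purge_unreachable my-repo.git`. The repository only shrinks on disk after this step.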
Using Git LFS to Migrate Large Files
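While not a deletion method, Git Large File Storage (LFS) lets you move large files out of the regular object store and replace them with small pointer files, making your repository lighter and more manageable. Migrating existing files also rewrites history, so the same precautions and the force push described below apply.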
# Install Git LFS
git lfs install
# Migrate all files matching the pattern to LFS, across every branch and tag
git lfs migrate import --include="*.ext" --everything
Pushing Changes to Remote Repository
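After rewriting the history and removing large files, you need to force-push the changes to the remote repository. Be cautious: this replaces the history for every collaborator, who will then need to re-clone or reset their local branches to the rewritten ones. Note that git-filter-repo removes the origin remote as a safety measure, so re-add it with git remote add before pushing.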
git push origin --force --all
git push origin --force --tags
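When only a single branch needs updating, Git's `--force-with-lease` option is a more cautious alternative to a plain force push: it refuses to overwrite the remote branch if someone else has pushed to it since your last fetch. The sketch below uses a hypothetical wrapper name, `safe_force_push`.

```shell
# Hypothetical helper: force-push one branch, but only if the remote branch
# still matches what this clone last saw from the remote.
# Usage: safe_force_push <remote> <branch>
safe_force_push() {
  git push --quiet --force-with-lease "$1" "$2"
}
```

For example, `safe_force_push origin main` updates only the `main` branch, where `main` is an assumed branch name. Tags and other branches still need to be pushed separately.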
Summary
Deleting large files from the commit history in a Git repository helps maintain the repository’s efficiency and speed. Several tools can do the rewrite, such as git-filter-repo, git-filter-branch, and BFG Repo-Cleaner, while Git LFS offers a way to keep large files without storing them in regular history. Each has its own use cases. After cleanup, a force push is necessary to update the remote repository, and it affects every collaborator, so communicate the change clearly with your team before you push.