Delete a large file from the commit history in a Git repository

Keeping a Git repository small matters because every clone transfers the full history, not just the latest files. Occasionally, a large file is committed by mistake, and deleting it in a later commit does not help: its contents remain in the history. Below, we discuss several methods to delete a large file from the commit history in a Git repository, so that your repository remains lean and manageable.
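
Before choosing a method, it helps to know which files are actually taking up space. The following is a minimal sketch that uses only core Git commands; the function name largest_blobs and its optional count argument are illustrative, not part of Git.

```shell
# List the largest blobs reachable from any ref, largest first.
# Output columns: size in bytes, then the path the blob was committed under.
# "largest_blobs" and its count argument are hypothetical names for this sketch.
largest_blobs() {
    count="${1:-10}"   # how many entries to show (defaults to 10)
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        sed -n 's/^blob //p' |
        sort -rn |
        head -n "$count"
}
```

Running largest_blobs 5 inside a repository prints the five biggest files ever committed, including ones that no longer exist in the latest commit.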

Prerequisites

Before removing large files from your commit history, you should have a basic understanding of Git commands, a Git client installed, and a backup of your repository to prevent any data loss.
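
One possible way to take the backup mentioned above is a bare mirror clone, which copies every branch, tag, and other ref. This is only a sketch: the function name backup_repo and both of its arguments are illustrative.

```shell
# Create a full backup of a repository as a bare mirror clone.
# "backup_repo", "src", and "dest" are hypothetical names for this sketch.
backup_repo() {
    src="$1"    # path or URL of the repository to back up
    dest="$2"   # destination directory, e.g. my-repo-backup.git
    git clone --mirror "$src" "$dest"
}
```

For example, backup_repo . ../my-repo-backup.git run from inside the repository writes the backup next to it. If the rewrite goes wrong, the original history can be recovered by cloning from that backup directory.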

Using Git Filter-Repo

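git-filter-repo is a versatile tool for rewriting history. The Git documentation recommends it in place of the older git-filter-branch, and it also covers the typical use cases of BFG. It’s faster and simpler to use. By default it expects to run in a fresh clone, and after rewriting it removes the origin remote so that the new history is not pushed by accident.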

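        # Install git-filter-repo if necessary (Debian/Ubuntu package shown;
        # "pip install git-filter-repo" is an alternative on other systems)
        sudo apt-get install git-filter-repo
        # Remove the file from the entire history (run inside a fresh clone)
        git filter-repo --invert-paths --path FILE-TO-REMOVE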
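
After a rewrite, it is worth confirming that no commit still references the file. The check below is a small sketch built on git log; the function name path_in_history is illustrative, and FILE-TO-REMOVE is the same placeholder used in the commands above.

```shell
# Succeeds (exit status 0) if any commit reachable from any ref still touches the path.
# "path_in_history" is a hypothetical helper name for this sketch.
path_in_history() {
    path="$1"
    [ -n "$(git log --all --format=%H -- "$path")" ]
}
```

A possible usage is: if path_in_history FILE-TO-REMOVE; then echo "still in history"; fi. No output means the file is gone from every branch and tag in the local repository.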

Using Git Filter-Branch

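The git-filter-branch command rewrites branches and can remove unwanted files from a repository’s commit history. It ships with Git, but the Git documentation now discourages it as slow and easy to misuse and recommends git-filter-repo instead, so treat it as a fallback for when installing another tool is not an option.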

        # Remove file from the entire history
        git filter-branch --force --index-filter \
        'git rm --cached --ignore-unmatch FILE-TO-REMOVE' \
        --prune-empty --tag-name-filter cat -- --all
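
Note that git-filter-branch keeps the pre-rewrite commits reachable through backup refs under refs/original, so the repository does not shrink until those refs and the reflog entries are removed. The following is a sketch of that cleanup using core Git commands; the function name purge_old_history is illustrative. It makes the rewrite irreversible locally, so run it only once the result has been checked.

```shell
# Drop filter-branch's backup refs, expire reflogs, and prune unreachable objects.
# Run inside the rewritten repository.
# "purge_old_history" is a hypothetical helper name for this sketch.
purge_old_history() {
    # Delete every ref that filter-branch saved under refs/original
    git for-each-ref --format='delete %(refname)' refs/original |
        git update-ref --stdin
    # Forget reflog entries that still point at the old commits
    git reflog expire --expire=now --all
    # Remove the now-unreachable objects from disk
    git gc --prune=now --aggressive
}
```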

Using BFG Repo-Cleaner

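The BFG Repo-Cleaner is a tool designed for cleaning up Git repositories, and it is especially good at removing large files from the commit history. Its --delete-files option matches on file name rather than on a full path. By default, BFG leaves the contents of the latest commit untouched, so delete the file in an ordinary commit before running it.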

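        # Requires Java; download the latest BFG jar and save it as bfg.jar
        # BFG works on a bare mirror clone (REPO-URL is a placeholder for your repository URL)
        git clone --mirror REPO-URL my-repo.git
        java -jar bfg.jar --delete-files FILE-TO-REMOVE my-repo.git
        # Expire old references and prune the objects BFG made unreachable
        cd my-repo.git && git reflog expire --expire=now --all && git gc --prune=now --aggressive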

Using Git LFS to Migrate Large Files

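While not a deletion method, Git Large File Storage (LFS) lets you keep large files out of the regular object store. The git lfs migrate import command rewrites history so that matching files are replaced by small pointer files, with the actual contents stored in LFS, which makes the repository lighter and more manageable. In the example, "*.ext" is a placeholder for your own file pattern.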

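        # Set up Git LFS hooks and filters (the git-lfs package must already be installed)
        git lfs install
        # Rewrite every ref so that files matching the pattern are stored in LFS
        git lfs migrate import --include="*.ext" --everything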
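
LFS tracking is recorded in .gitattributes as a line such as "*.ext filter=lfs diff=lfs merge=lfs -text". To check whether a given path would go through LFS, one option is to ask Git which filter applies to it. This is a sketch with an illustrative function name, is_lfs_tracked; it only inspects attributes and does not require Git LFS itself.

```shell
# Succeeds (exit status 0) if Git attributes route the given path through the LFS filter.
# "is_lfs_tracked" is a hypothetical helper name for this sketch.
is_lfs_tracked() {
    # git check-attr prints "<path>: filter: <value>"; keep only the value
    [ "$(git check-attr filter -- "$1" | sed 's/.*: filter: //')" = "lfs" ]
}
```

For example, is_lfs_tracked video.ext && echo "handled by LFS" reports whether a hypothetical file video.ext matches a tracked pattern.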

Pushing Changes to Remote Repository

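After rewriting the history and removing large files, you need to force-push the changes to the remote repository, because the rewritten commits no longer descend from the ones on the remote. Be cautious: this replaces the history for all repository collaborators. If you used git-filter-repo, re-add the remote first with git remote add origin, since the tool removes it.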

        git push origin --force --all
        git push origin --force --tags
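
Once the rewritten history is on the remote, collaborators should either clone the repository again or reset their local branches to the new history instead of merging. The sketch below shows the reset route; the function name resync_branch is illustrative, and it assumes the remote is called origin. It discards local commits and uncommitted changes on that branch, so anything worth keeping should be saved elsewhere first.

```shell
# Replace the local copy of a branch with the rewritten remote version.
# WARNING: discards local commits and uncommitted changes on that branch.
# "resync_branch" is a hypothetical helper name; the remote "origin" is assumed.
resync_branch() {
    branch="$1"   # branch to resynchronize, e.g. main
    git fetch origin &&
        git checkout "$branch" &&
        git reset --hard "origin/$branch"
}
```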

Summary

Deleting large files from the commit history in a Git repository helps maintain the repository’s efficiency and speed. Several tools are available, such as git-filter-repo, git-filter-branch, BFG Repo-Cleaner, and Git LFS, each with its own use cases. After cleanup, a force push is necessary to update the remote repository. Because this affects every collaborator’s local copy, communicate the change clearly with your team before you push.
