I had a WordPress site that was being tracked with Git, but I had neglected to set up my .gitignore file to ignore the uploads folder that holds all uploaded images, which resulted in a large repository.
I rectified this by adding the folder to my .gitignore file and removing the previously tracked files from Git's index (the cache).
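As a rough sketch, that clean-up might look like the following. The function name untrack_folder is hypothetical, and the commit message is just a placeholder; the folder path is passed in as an argument.

```shell
# untrack_folder DIR: stop tracking DIR but leave its files on disk.
# (Hypothetical helper name, for illustration only.)
untrack_folder() {
  # Ignore the folder from now on
  printf '%s/\n' "$1" >> .gitignore
  # Remove it from the index only (--cached), recursively (-r)
  git rm -r --cached --quiet "$1"
  # Record both changes in a new commit
  git add .gitignore
  git commit --quiet -m "Stop tracking $1"
}
```

For this site it would be called as `untrack_folder wp-content/uploads` from the root of the repository. The images stay in the working directory; only Git stops following them.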
However, this does not remove the files or folders from Git’s history, so my repository was still very large.
To clear the history I had to complete the following steps:
You should always back up your repository before editing its history.
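One way to take that backup is a mirror clone, which copies every branch and tag into a separate bare repository. This is only a sketch: backup_repo is a hypothetical helper name, and both paths are whatever you choose.

```shell
# backup_repo SRC DEST: make a full bare copy of the repository at SRC.
# (Hypothetical helper name, for illustration only.)
backup_repo() {
  # --mirror copies all refs (branches, tags, and others), not just the current branch
  git clone --quiet --mirror "$1" "$2"
}
```

For example, `backup_repo . ../site-backup.git` run from the repository root (the destination name here is made up). If the rewrite goes wrong, the untouched history is still in the backup.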
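First, you can list every object in your history, sorted by path, by running:
git rev-list --objects --all | sort -k 2 > large-file-list.txt
This creates a large-file-list.txt file in the directory where you ran the command. It lists every object in the history by path but does not show sizes, so you have to spot the large files and folders from their paths.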
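To rank files by size instead, something like the following sketch may help. It feeds the same rev-list output through git cat-file to look up each object's size. The function name list_largest_blobs and the default of 20 results are assumptions made for illustration.

```shell
# list_largest_blobs [N]: print "size path" for the N largest blobs in history.
# (Hypothetical helper name; N defaults to 20.)
list_largest_blobs() {
  git rev-list --objects --all |
    # For each object print its type, its size in bytes, and the path from rev-list
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    # Keep file contents (blobs) only, dropping commits and trees
    awk '$1 == "blob"' |
    # Strip the type column, leaving "size path"
    cut -d' ' -f2- |
    # Largest first
    sort -n -r |
    head -n "${1:-20}"
}
```

Running `list_largest_blobs 10` from the repository root prints the ten largest files ever committed, including ones that have since been deleted from the working tree.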
Once you have identified the file or folder you want to remove you can run:
git filter-branch --force --index-filter 'git rm --cached -r --ignore-unmatch wp-content/uploads' --prune-empty --tag-name-filter cat -- --all
A full explanation can be found on help.github.com.
The only change I made was adding the -r flag, as I was getting this error:
fatal: not removing 'wp-content/uploads' recursively without -r
index filter failed: git rm --cached --ignore-unmatch wp-content/uploads
This article highlighted the solution:
“…all I needed to do was add the -r to the rm command part (as below). I guess this is because bin is a directory rather than a file, so it must be removed recursively as otherwise the files it contains (or the information about them) would be orphaned.”
Once you are happy that these files are no longer needed, you can force all objects in your local repository to be dereferenced and garbage collected, and then overwrite your remote repository, by following steps 4 to 7 in this guide: https://help.github.com/articles/remove-sensitive-data/
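For reference, the local clean-up usually comes down to something like the sketch below. It is not copied from the guide, so check it against the steps there; purge_unreachable is a hypothetical helper name.

```shell
# purge_unreachable: drop filter-branch's backup refs and prune the old objects.
# (Hypothetical helper name, for illustration only.)
purge_unreachable() {
  # Delete the refs/original/* backups that filter-branch leaves behind
  git for-each-ref --format='delete %(refname)' refs/original |
    git update-ref --stdin
  # Expire every reflog entry so the old commits are no longer referenced
  git reflog expire --expire=now --all
  # Garbage-collect and prune the now-unreachable objects immediately
  git gc --quiet --prune=now
}
```

After running `purge_unreachable`, the remote can be overwritten with `git push origin --force --all` followed by `git push origin --force --tags`, assuming the remote is named origin. Anyone else with a clone will need to re-clone or rebase onto the rewritten history.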