Be careful deleting files around git

Saturday 2 May 2015

This is close to ten years old. Be careful.

In a Python project, it’s common to have a clean-up step that deletes all the .pyc files, like this:

$ find . -name '*.pyc' -delete

This works great, but there’s a slight chance of a problem: Git records information about branches in files within the .git directory. These files have the same name as the branch.

Try this:

$ git checkout -b cleanup-all-.pyc

This makes a branch called “cleanup-all-.pyc”. After making a commit, I will have files named .git/refs/heads/cleanup-all-.pyc and .git/logs/refs/heads/cleanup-all-.pyc. Now if I run my find command, it will delete those files inside the .git directory, and my branch will be lost.
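To see the match without risking a real repository, here is a minimal sketch that builds a scratch directory mimicking the layout described above. The files under .git are empty stand-ins rather than real refs, and the helper names `make_scratch_tree` and `list_matches` are made up for this illustration. It uses -print instead of -delete, so nothing is removed.

```shell
#!/bin/sh
# Sketch only: no real repository is involved. The files under .git are
# empty stand-ins with the same names Git would use for the branch.

make_scratch_tree() {
    # $1: directory to populate (hypothetical helper for this demo)
    mkdir -p "$1/pkg" "$1/.git/refs/heads" "$1/.git/logs/refs/heads"
    : > "$1/pkg/module.pyc"                         # where a real bytecode file would be
    : > "$1/.git/refs/heads/cleanup-all-.pyc"       # stand-in for the branch ref
    : > "$1/.git/logs/refs/heads/cleanup-all-.pyc"  # stand-in for the branch log
}

list_matches() {
    # Print everything the clean-up pattern matches, without deleting anything.
    (cd "$1" && find . -name '*.pyc' -print | LC_ALL=C sort)
}

scratch=$(mktemp -d)
make_scratch_tree "$scratch"
list_matches "$scratch"
rm -rf "$scratch"
```

All three paths are listed, including the two under .git: find only looks at the name, so it cannot tell a branch file from compiled Python.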

One way to fix it is to tell find not to delete the file if it’s found in the .git directory:

$ find . -name '*.pyc' -not -path './.git/*' -delete

A better way is:

$ find . -name '.git' -prune -o -name '*.pyc' -exec rm {} \;

The first command still examines every file in .git, though it won’t delete any .pyc files it finds there. The second command skips the entire .git directory and doesn’t waste time examining it.
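A preview before deleting is a cheap safeguard. The following is a sketch of the pruning approach wrapped in two functions; the names `list_pyc_outside_git` and `clean_pyc_outside_git` are invented for this example, and the `-exec rm -f {} +` form is a variation on the command above that batches paths into fewer rm calls.

```shell
#!/bin/sh
# Sketch: preview, then delete, .pyc files while pruning .git.

list_pyc_outside_git() {
    # Dry run. The explicit -print matters: with no action in the expression,
    # find would print the pruned .git directory as well.
    (cd "$1" && find . -name .git -prune -o -name '*.pyc' -print)
}

clean_pyc_outside_git() {
    # Same expression, with rm in place of -print. "{} +" passes many
    # paths to each rm invocation instead of running rm once per file.
    (cd "$1" && find . -name .git -prune -o -name '*.pyc' -exec rm -f {} +)
}

# Demo in a scratch directory with a stand-in for the branch file.
scratch=$(mktemp -d)
mkdir -p "$scratch/pkg" "$scratch/.git/refs/heads"
: > "$scratch/pkg/module.pyc"
: > "$scratch/.git/refs/heads/cleanup-all-.pyc"

list_pyc_outside_git "$scratch"    # preview what would be removed
clean_pyc_outside_git "$scratch"   # remove it
ls "$scratch/.git/refs/heads"      # the stand-in branch file is still there
rm -rf "$scratch"
```

Running the listing function first and reading its output is the point: if anything under .git shows up, the expression needs adjusting before rm gets involved.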

UPDATE: I originally had -delete in that latter command, but find doesn’t like -prune and -delete together: -delete turns on -depth, and -prune does nothing when -depth is in effect. It seems simplistic and unfortunate, but there it is.


Comments

[gravatar]
I always use
git clean -dxf
-- dangerous but effective.
[gravatar]
I always use "invoke clean --all". ☺
[gravatar]
Roger Lipscombe 11:38 AM on 3 May 2015
Don't use dots when naming branches? Always make sure your branches are pushed to origin (so you can always just pull the branch again)? Don't have long-lived branches (so that if you do accidentally lose one, it's no big deal)?
[gravatar]
Always dry run the "find" command to see the files it locates, and adjust to exclude unwanteds.
[gravatar]
Don't you get an error on the last command because you're using -prune and -delete together?
$ find . -name '.git' -prune -o -name '*.pyc' -delete
find: The -delete action atomatically turns on -depth, but -prune does nothing when -depth is in effect.  If you want to carry on anyway, just explicitly use the -depth option.
$ echo $?
1
$ find --version
find (GNU findutils) 4.4.2
[gravatar]
@ryne thanks, I guess I mis-tested that command! I've replaced the -delete with -exec rm {} \;

I've learned a lot about find with this post, some of it I don't agree with, but I've learned it... :)
[gravatar]
Even if you deleted those files, you wouldn't delete any real data. Those files are only references recording which hash that branch was on.
You can probably use `git reflog` to find out where it was pointing, do something like `git branch branchname abc123hashhere`. Not tested, but think this works..
[gravatar]
$ echo "# ignore python compiled files:
> *.py[cod]
> " >> .gitignore; git commit -m "ignore python compiled files"

;)

On a more serious note, why would you commit Python binaries to your repo?
[gravatar]
@Mick T: I don't commit .pyc files to my repo, and this is not about .pyc files in the working tree. The problem has to do with *.pyc files in the .git directory. They are not compiled Python files at all, they are branches.
