Friday, October 29, 2010

Finding Files On Your Hard Drive
With the Linux Find Command

 
I love the Linux find command. The
find command is used to find files.

Here are some of my favorite things about
the find command:

  1. You can use it to find a file by name
  2. You can use wildcards with it
  3. It automatically descends into folders
    underneath the current folder
  4. It prints out the path to every file
    it finds

The ability of the find command to descend
into directories (folders) is often described
as recursive descent. Each layer of directories
found under the current layer of directories is
another layer of recursion. Walking a directory
tree recursively is a well-known technique
used by many programmers.
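
Here's a small sketch of that descent in action. The
directory and file names are made up for the demo, and
I use the -type f test (which limits the output to
regular files) only to keep the listing short:

```shell
# Build a throwaway directory tree (all names here are hypothetical).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/one/two/three"
touch "$demo_dir/top.txt" "$demo_dir/one/two/three/deep.txt"

# Start at the top of the tree. find visits every directory below it,
# so the file three levels down is printed along with the one at the top.
found=$(cd "$demo_dir" && find . -type f -print | LC_ALL=C sort)
echo "$found"

# Remove the throwaway tree.
rm -rf "$demo_dir"
```

Nothing in the command mentions one, two, or three.
The find command discovers those folders by itself.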

Basically, the find command consists of
4 parts:

  1. The name find
  2. Where to start looking
  3. What files to look for
  4. What to do when you find files

Here's an example:

find . -name abc -print

Here are the 4 parts of the above
find command:

  1. The name of the command is find
  2. Dot is the name of the current directory
  3. We are looking for a file called abc
  4. Once a file called abc is found, the
    find command will print the path to it

Here's what happens in plain English:

We start looking in the current directory
(dot or period) for a file called abc.
We will uncover all possible sub-directories
of the current directory. Any files found
that are called abc will be printed.
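
To see this for yourself, here's a sketch you can run
in a scratch directory. The docs and old folders and
the xyz file are names I made up for the example:

```shell
# Create a scratch tree with two files called abc at different depths
# and one file with a different name (all names are hypothetical).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/docs/old"
touch "$demo_dir/abc" "$demo_dir/docs/old/abc" "$demo_dir/docs/xyz"

# Look for abc starting at the current directory (dot).
matches=$(cd "$demo_dir" && find . -name abc -print | LC_ALL=C sort)
echo "$matches"
# ./abc
# ./docs/old/abc

# Remove the scratch tree.
rm -rf "$demo_dir"
```

Both files called abc are printed with their paths.
The file called xyz is skipped.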

Here's the only thing that is tricky about
the find command: It shares wildcards with
the shell. This can be trickier than it
sounds. Let's say I wish to find a file
that starts with the letters abc.
I might type the following command:

find . -name abc* -print

This will probably work. As long as
there are no files in the current
directory that start with abc,
all will be well.

However, let's say we have a file called
abcdef in the current directory.
We are now in trouble. We are in trouble
because the shell is going to do file-name
expansion prior to executing
the find command.

Here's what we type:

find . -name abc* -print

Here's how the shell interprets what we
typed:

find . -name abcdef -print

Do you see the problem? The find command
never sees the asterisk. By the time find
runs, the shell has already replaced abc*
with abcdef, so find looks only for files
named exactly abcdef.
Big difference!
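
One way to watch the shell do this is to put echo in
front of the command, so the shell prints the words it
would have handed to find. This is only a sketch. The
sub folder and the abcxyz file are names I made up to
show what gets missed:

```shell
# Scratch directory holding abcdef at the top and abcxyz one level down
# (the sub and abcxyz names are hypothetical).
demo_dir=$(mktemp -d)
mkdir "$demo_dir/sub"
touch "$demo_dir/abcdef" "$demo_dir/sub/abcxyz"

# echo prints the command line after the shell has expanded abc*.
expanded=$(cd "$demo_dir" && echo find . -name abc* -print)
echo "$expanded"
# find . -name abcdef -print

# Running the unquoted command finds abcdef but misses sub/abcxyz.
unquoted=$(cd "$demo_dir" && find . -name abc* -print)
echo "$unquoted"
# ./abcdef

# Remove the scratch directory.
rm -rf "$demo_dir"
```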

Of course, there is a way around this and that
is to remove the special meaning of the asterisk
with a backslash. Here's what this would look like:

find . -name abc\* -print

In practice, though, using backslashes
on a command line is clumsy. Most
people use double quotes instead. Here's what double
quotes look like:

find . -name "abc*" -print

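The double quotes remove the special meaning
of the asterisk and the other filename expansion
characters. (A few characters, such as the dollar
sign, keep their meaning inside double quotes.
Single quotes turn those off as well.)
Now we can rest easy and know that our filename
expansion characters will reach the
find command untouched.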
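
Here's a sketch that tries both the double quotes and
the backslash against a scratch directory. As before,
sub and abcxyz are made-up names:

```shell
# Scratch directory holding abcdef at the top and abcxyz one level down
# (the sub and abcxyz names are hypothetical).
demo_dir=$(mktemp -d)
mkdir "$demo_dir/sub"
touch "$demo_dir/abcdef" "$demo_dir/sub/abcxyz"

# Double quotes: the shell leaves abc* alone and find does the matching.
quoted=$(cd "$demo_dir" && find . -name "abc*" -print | LC_ALL=C sort)
echo "$quoted"
# ./abcdef
# ./sub/abcxyz

# A backslash in front of the asterisk gives the same result.
escaped=$(cd "$demo_dir" && find . -name abc\* -print | LC_ALL=C sort)

# Remove the scratch directory.
rm -rf "$demo_dir"
```

This time the file in the sub folder is found too,
because find itself gets to interpret the asterisk.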

The double quotes are a wonderful habit to get
into. Basically, you can use the double quotes
regardless of whether you are using filename
expansion characters or not. Let's say, for
example, we are looking for a file called
abc.

Here's how I might apply the double quotes:

find . -name "abc" -print

In this case, the double quotes do not
matter. Since there are no filename
expansion characters, the double
quotes serve no purpose.
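
If you'd like to check that for yourself, here's a
quick sketch that runs the search both ways in a
scratch directory and compares the results:

```shell
# Scratch directory holding a single file called abc.
demo_dir=$(mktemp -d)
touch "$demo_dir/abc"

# Run the same search without and with double quotes.
plain=$(cd "$demo_dir" && find . -name abc -print)
with_quotes=$(cd "$demo_dir" && find . -name "abc" -print)

# Remove the scratch directory.
rm -rf "$demo_dir"

# Both searches report the same path.
[ "$plain" = "$with_quotes" ] && echo "same result: $plain"
# same result: ./abc
```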

Here's why I use double quotes anyway:

If you always use double quotes, you
never need to rethink the find command.
It just works. Rather than think about
whether double quotes are needed,
just use them. They don't cost anything
other than 2 keystrokes.

This is more valuable than it might
appear. When you are in the heat of
battle and you are trying to solve
a problem, considering whether or not
to use double quotes is a mental
distraction.

Rather than suffer the distraction, just
use the double quotes. It's not hard to
figure out whether or not you need double
quotes, but why think about it at all?

Ed Abbott
