My Favorite Unix Commands Under Linux: October 2011

The title for this article is something
of a misnomer. There is no specific
command that will undelete a file.

This evening, I successfully
undeleted a file. Here's the web
page that helped me to do this:

Linux or UNIX Recover deleted files – undelete files

I followed these 2 basic steps:

1. Remember a unique string within
   the deleted file so that you can
   undelete it without confusing it
   with other files
2. Grep the entire device searching
   for the pattern you think uniquely
   identifies the file you wish to
   undelete

I started with these 2 simple steps
because I wanted to see if the technique
mentioned in the above link was going to
undelete my file.

Here's the command I tried:

grep "Enter the smaller of line 7 or line 9" /dev/hda2 >goofy

I used the df command to come up
with the name of the device to grep on:

df
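If you give df a path, it reports only the filesystem that holds that
path, which saves picking the right row out of the full listing. Here is
a minimal sketch, assuming the deleted file lived under the current
directory. The variable name dev is my own choice for illustration.

```shell
# Show only the filesystem that holds the current directory.
# -P asks for the portable POSIX layout, so each filesystem stays on one line.
df -P .

# Keep just the device name (first column of the second line).
dev=$(df -P . | awk 'NR==2 {print $1}')
echo "Device to search: $dev"
```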

Doing a grep on a device like this fascinates
me. What basis is there for treating an entire
device (partition) as a file? Does this mean
that all the blocks on that device can, in
effect, be seen by the grep command? It would
seem so.
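The short answer is that Unix exposes a partition as a special file under
/dev, and reading that file returns the raw bytes of the partition one
after another. The sketch below imitates this with an ordinary file, so
it needs no root access and touches no real device. The file contents are
made up for illustration. Note the -a option: without it, GNU grep may
only report that a binary file matches instead of printing the line.

```shell
# Stand-in for a partition: an ordinary file with binary padding around some text.
# (A real device such as /dev/hda2 usually needs root to read.)
img=$(mktemp)
{
  head -c 4096 /dev/zero
  printf '\nEnter the smaller of line 7 or line 9\n'
  head -c 4096 /dev/zero
} > "$img"

# -a makes grep treat the binary data as text instead of reporting
# only that a binary file matches.
found=$(grep -a "smaller of line 7" "$img")
echo "$found"
rm -f "$img"
```

On a real system the same grep would be pointed at the device name that
df reports, rather than at a temporary file.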

I watched TV and ate supper while this command
ran. It takes a long, long time! Just realize
that when you are doing this, you are grepping
your entire hard drive partition. For me this
is many gigabytes of data.

I chose the name goofy as my output file,
probably because I was not sure whether or not
this was going to work.

One of the things that threw me at first is
choosing the pattern to search on. I made
the rather silly assumption that the pattern
was the name of the file.

Of course, that's precisely what got erased
when I mistakenly typed the following command:

rm MyFileName

For whatever reason, I had the silly notion that
I should be searching on MyFileName rather
than the contents of MyFileName. Big
difference!

This silly notion was more or less a fleeting
thought. I only mention it because if I had it,
other people might have it too.

Once my thinking became clear and I realized that
I needed to remember a pattern that was inside the
file, I was well on my way.

Fortunately the file I wished to undelete was an
sc spreadsheet file. These files are quite
primitive and are purely ASCII text. The contents
of these files are easily manipulated by any text
editor.

I'd been working on a tax document when I foolishly
erased the spreadsheet file. I was copying a
Qualified Dividends and Capital Gain Tax Worksheet
for 2010 into my spreadsheet.

I use sc spreadsheet calculator for this purpose
because it is so simple and basic. If you are familiar
with the vi editor, it is very similar to vi
in how you operate it. All the commands are keystrokes
and your hands never have to leave the keyboard to fully
navigate and manipulate your spreadsheet.

The sc spreadsheet calculator is a great tool if
you are doing something primitive and simple but is a very
poor tool if you need to present your data to other people.
It is a poor tool for presenting data because it is
totally lacking in the things that make a spreadsheet
presentable, such as different sized fonts.

I write about sc here:

sc Issues and Answers

I did eventually get all my data back. I used the following command
to do so:

grep -C 3000 "Enter the smaller of line 7 or line 9" /dev/hda2 >goofy

As you can see, this is pretty much the same grep command
that I ran on my hard drive partition called /dev/hda2 earlier
in the evening. The key difference is that now I asked for 3,000
lines of context.

Running the command twice turned out to be a good choice. The first time
I ran it, I wanted to see how many lines had the pattern I was looking for.
If it had been ten or more lines, I might have tried a different pattern.

It turns out that only 5 lines had this pattern. Since sc spreadsheet
calculator effectively has line numbers (to identify the cells) I could
see that the line number was changing as my spreadsheet grew.

Each time I saved the file, it ended up as an artifact on my hard drive
partition. In theory, the 5 times that the pattern appears on a line
could be the 5 times that I saved the file as I worked on it.
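A quick way to get just the number of matching lines is grep's -c option.
The sketch below is only an illustration: it fakes a partition with an
ordinary file holding three saved copies of the pattern, separated by
binary padding. The "saved version" lines are invented for the example.

```shell
# Stand-in for the partition: three "saved versions" of a file,
# separated by binary padding.
img=$(mktemp)
for version in 1 2 3; do
  head -c 512 /dev/zero
  printf '\nsaved version %s\nEnter the smaller of line 7 or line 9\n' "$version"
done > "$img"

# -c prints the number of matching lines instead of the lines themselves.
count=$(grep -a -c "Enter the smaller of line 7 or line 9" "$img")
echo "Copies found: $count"
rm -f "$img"
```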

This leads me to think that these basic characteristics would make a
pattern a good search pattern:

1. A good pattern is a unique pattern
2. If the pattern is not unique, it
   will hopefully be mostly unique to the file
   you inadvertently erased
3. If you have edited the file and saved it
   to your hard drive many times, the pattern
   will hopefully have been introduced into the
   file later rather than earlier
4. A pattern that has been saved many times
   (by saving the file) could possibly show up
   too many times on your hard drive

I'm not sure how accurate my ideas are on pattern
selection. However, I'm led to believe that
they are at least a little bit accurate, as the
pattern I chose showed up on more than one
line number.

In one case it was line 587 and in another case it
was line 589. I don't remember the exact numbers,
so treat these as illustrative, but I do remember
that the line numbers were very close to each
other.

This suggests to me that the line numbers grew as
the spreadsheet grew. Knowing which line number
to focus on was easy. The line number that was
the biggest had to be the one that was most recent.

After discovering that the pattern I had chosen
appeared on my hard drive partition 5 times, I
next tried the same grep command with
3,000 lines of context.

My choice of 3,000 lines of context turned out
to be a bit much. I had assumed that grep would
pare the number of lines down to the maximum number
of lines contained in the deleted file. This turned
out to be another mistaken notion.

The mistake I now think I made is in believing that
Linux stores deleted files with a beginning, a middle,
and an end. From the results I got, I'd say that this
is not the case.

It appears that the only limit on the start and end
of the file is the partition itself. In other words,
the entire partition is effectively one big file. That
is to say, the partition is one singular file that goes
on and on.

Therefore, if you ask for 3,000 lines like I did, you
get 3,000 lines.

I now understand why the above webpage so carefully
defines the context by using a grep -A -B.
Using these two options, you can precisely define the
start and end of a deleted text file if you happen to
know how long the file is.

Had I known better, I would have defined the length of
the deleted file more precisely with grep -A -B.
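To show what the two options do, here is a small sketch on a made-up
nine-line file. The row and junk lines are hypothetical, and the counts
of 2 are chosen only to fit this sample. With a real deleted file you
would pick the numbers from what you know about its length and where the
pattern sits in it.

```shell
# Hypothetical 9-line file standing in for the raw partition data.
sample=$(mktemp)
printf '%s\n' junk1 junk2 row1 row2 \
  "Enter the smaller of line 7 or line 9" \
  row4 row5 junk3 junk4 > "$sample"

# -B 2 keeps two lines before the match, -A 2 keeps two lines after it.
recovered=$(grep -a -B 2 -A 2 "Enter the smaller of line 7 or line 9" "$sample")
echo "$recovered"
rm -f "$sample"
```

Unlike -C, which takes one number for both directions, -B and -A let the
before and after counts differ.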

I was able to recover my deleted file completely. The
operation was a success!

Recovery of the file was done with 2 basic steps using
the Vim editor:

1. Trim the file to its proper length
   by getting rid of extraneous lines at
   the beginning and end of the file
2. Fix any corruption that has occurred
   in the middle of the file

Amazingly my deleted file was only slightly
corrupt. It had been corrupted by null characters.
That is to say, it appeared to have been overwritten
in one place by characters that are zero (absolute
zero) on the ASCII table. By zero, I mean all eight
bits of the character byte turned off.
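I cleaned this up by hand in Vim. As an alternative, not what I actually
did, the tr command can strip null characters from a recovered file in
one pass. The sketch below uses a made-up fragment. Keep in mind that
stripping the nulls only removes the debris: whatever text they overwrote
is still gone and has to come from somewhere else.

```shell
# Hypothetical recovered fragment with a run of NUL bytes in the middle.
damaged=$(mktemp)
printf 'row1\nro\000\000\000w2\nrow3\n' > "$damaged"

# tr -d '\000' deletes every NUL byte and passes everything else through.
cleaned=$(tr -d '\000' < "$damaged")
echo "$cleaned"
rm -f "$damaged"
```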

Amazingly, only one line of my deleted file was
corrupt. It just so happened that I had not been
editing that line or any line before it. Therefore
I had a backup copy of that one line because I had
a backup copy of the deleted file with an earlier
timestamp.

In other words, I had not deleted all copies of
the file, just the version of the file that
represented the last 45 minutes of my work.

Recovering the one corrupted line was easy. I
simply relied on an earlier version of the file
which had never been deleted.

Was it worth it? It probably was not worth
saving the 45 minutes of work. I may have
spent close to 45 minutes of my time recovering
the file.

However, as an intellectual exercise it was
worth it. Now I know how to recover a deleted
file under Linux.

The big lesson, however, is don't be careless
when deleting files. Had I thought about it
15 more seconds, I would not have deleted the
file.

The life lesson is that rushing gets you nothing
or next to nothing.

Ed Abbott

Monday, October 10, 2011

The Linux Undelete Command