Tuesday, November 1, 2011

The Cure For Argument List Too Long

This is one of those things that drives me
crazy. Why does Linux have an artificial limit
on the size of an argument list? So many things
in Linux are dynamic. Why not shell argument
lists too?

Here's someone who proposes 4 different ways to
solve this problem:

Argument List Too Long

His last solution, solution #4, puts the whole
problem in a nutshell. Apparently this is a
limitation that is defined when you compile your
Linux kernel.

Since most of us use precompiled kernels, we're stuck
unless we wish to recompile the kernel.

The variable that controls this is really a C
Language constant (not a true variable). It
is defined for the C Language preprocessor like
this:

#define MAX_ARG_PAGES 32
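As an aside, you don't have to read kernel source to see what
the limit works out to on a running system. The getconf
command reports it in bytes (with 32 pages of 4 KB each, the
classic figure is 128 KB, though newer kernels are more generous):

```shell
# Ask the running system for the maximum combined size (in bytes)
# of the argument list plus environment passed to a new program:
getconf ARG_MAX
```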

Perhaps there is a philosophical reason for keeping
things the way they are. Perhaps long command lines
that cause you to over-run the maximum allotment for
an argument list are, in themselves, bad ideas.

An overly long command line could well be a bad idea.

For this reason, I feel that the author of the above
article is probably right. Even though using the
find command takes a long time, it does, by its very
nature, allow you to process an unlimited number of files.

At least, I assume it does. Since the find command
finds one file at a time and then spawns one process
at a time to handle that file, it minimizes the
footprint (memory used) at the expense of speed (the
find command will take a long long time to do what it
does).
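Here's a tiny self-contained sketch of the find approach. The
directory and file names are made up just for the demo:

```shell
# Create a pile of files in a temporary directory:
dir=$(mktemp -d)
for i in $(seq 1 500); do touch "$dir/file$i.log"; done

# find hands each matching file to rm one at a time, so no single
# argument list for all 500 names is ever built:
find "$dir" -name '*.log' -exec rm {} \;

# Count what's left (should be nothing):
ls "$dir" | wc -l
```

A glob like `rm "$dir"/*.log` would have asked the shell to build
one giant argument list; find sidesteps that entirely.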

Perhaps things are the way they are because someone has
put a lot of thought into it and has decided not to change
things. This is often the case.

Ed Abbott

Monday, October 10, 2011

The Linux Undelete Command

The title for this article is somewhat
of a misnomer. There is no specific
command that will undelete a file.

This evening, I successfully
undeleted a file. Here's the web
page that helped me to do this:

Linux or UNIX Recover deleted files – undelete files

I followed these 2 basic steps:

  1. Remember a unique string within
    the deleted file so that you can
    undelete it without confusing it
    with other files
  2. Grep the entire device searching
    for the pattern you think uniquely
    identifies the file you wish to
    undelete

I started with these 2 simple steps
because I wanted to see if the technique
mentioned in the above link was going to
undelete my file.

Here's the command I tried:

grep "Enter the smaller of line 7 or line 9" /dev/hda2 >goofy

I used the df command to come up
with the name of the device to grep on:

df

Doing a grep on a device like this fascinates
me. What basis is there for treating an entire
device (partition) as a file? Does this mean
that all the blocks on that device can, in
effect, be seen by the grep command? It would
seem so.

I watched TV and ate supper while this command
ran. It takes a long long time! Just realize
that when you are doing this, you are grepping
your entire hard drive partition. For me this
is many gigabytes of data.

I chose the name goofy as my output file,
probably because I was not sure whether or not
this was going to work.

One of the things that threw me at first is
choosing the pattern to search on. I made
the rather silly assumption that the pattern
was the name of the file.

Of course, that's precisely what got erased
when I mistakenly typed the following command:

rm MyFileName

For whatever reason, I had the silly notion that
I should be searching on MyFileName rather
than the contents of MyFileName. Big
difference!

This silly notion was more or less a fleeting
thought. I only mention it because since I had this
silly notion, other people might have the same
silly notion.

Once my thinking became clear and I realized that
I needed to remember a pattern that was inside the
file, I was well on my way.

Fortunately the file I wished to undelete was an
sc spreadsheet file. These files are quite
primitive and are purely ASCII text. The contents
of these files are easily manipulated by any text
editor.

I'd been working on a tax document when I foolishly
erased the spreadsheet file. I was copying a
Qualified Dividends and Capital Gain Tax Worksheet
for 2010 into my spreadsheet.

I use sc spreadsheet calculator for this purpose
because it is so simple and basic. If you are familiar
with the vi editor, it is very similar to vi
in how you operate it. All the commands are keystrokes
and your hands never have to leave the keyboard to fully
navigate and manipulate your spreadsheet.

The sc spreadsheet calculator is a great tool if
you are doing something primitive and simple but is a very
poor tool if you need to present your data to other people.
It is a poor tool for presenting data because it is
totally lacking in the things that make a spreadsheet
presentable, such as different sized fonts.

I write about sc here:

sc Issues and Answers

I did eventually get all my data back. I used the following command
to do so:

grep -C 3000 "Enter the smaller of line 7 or line 9" /dev/hda2 >goofy

As you can see, this is pretty much the same grep command
that I ran on my hard drive partition called /dev/hda2 earlier
in the evening. The key difference is that now I asked for 3,000
lines of context.

Running the command twice turned out to be a good choice. The first time
I ran it, I wanted to see how many lines had the pattern I was looking for.
If it had been ten or more lines, I might have tried a different pattern.

It turns out that only 5 lines had this pattern. Since the
sc spreadsheet calculator effectively has line numbers (to
identify the cells), I could see that the line number was
changing as my spreadsheet grew.

Each time I saved the file, it ended up as an artifact on my hard drive
partition. In theory, the 5 times that the pattern appears on a line
could be the 5 times that I saved the file as I worked on it.

This leads me to think that these basic characteristics would make a
pattern a good search pattern:

  1. A good pattern is a unique pattern
  2. If the pattern is not unique, it
    will hopefully be mostly unique to the file
    you inadvertently erased
  3. If you have edited the file and saved it
    to your hard drive many times, the pattern
    will hopefully have been introduced into the
    file later rather than earlier
  4. A pattern that has been saved many times
    (by saving the file) could possibly show up
    too many times on your hard drive

I'm not sure how accurate my ideas are on pattern
selection. However, I'm led to believe that
they are at least a little bit accurate, as the
pattern I chose showed up on more than one
line number.

In one case it was line 587 and in another case it
was line 589. I don't remember precisely. These
are hypothetical line numbers but I do remember
that the line numbers were very close to each
other.

This suggests to me that the line numbers grew as
the spreadsheet grew. Knowing which line number
to focus on was easy. The line number that was
the biggest had to be the one that was most recent.

After discovering that the pattern I had chosen
appeared on my hard drive partition 5 times, I
next tried the same grep command with
3,000 lines of context.

My choice of 3,000 lines of context turned out
to be a bit much. I had assumed that grep would
pare the number of lines down to the maximum number
of lines contained in the deleted file. This turned
out to be another mistaken notion.

The mistake I now think I made is in believing that
Linux stores deleted files with a beginning, a middle,
and an end. From the results I got, I'd say that this
is not the case.

It appears that the only limit on the start and end
of the file is the partition itself. In other words,
the entire partition is effectively one big file. That
is to say, the partition is one singular file that goes
on and on.

Therefore, if you ask for 3,000 lines like I did, you
get 3,000 lines.

I now understand why the above webpage so carefully
defines the context by using a grep -A -B.
Using these two options, you can precisely define the
start and end of a deleted text file if you happen to
know how long the file is.

Had I known better, I would have defined the length of
the deleted file more precisely with grep -A -B.
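Here's a small self-contained demo of how -B and -A bracket a
match on an ordinary file; a raw partition behaves the same way,
just much larger. The file name and its contents are invented
purely for illustration:

```shell
# Build a stand-in for raw partition data: some surrounding junk
# with a few recoverable lines in the middle:
printf 'junk\njunk\nSTART\nkeep me\nunique marker line\nkeep too\nEND\njunk\n' > rawdata

# -B 2 grabs 2 lines before the match, -A 2 grabs 2 lines after,
# so the output is exactly the 5 lines we care about:
grep -B 2 -A 2 "unique marker line" rawdata
```

If you know roughly how long the deleted file is, tuning the -B
and -A numbers lets you fence it off without dragging in thousands
of unrelated lines.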

I was able to recover my deleted file completely. The
operation was a success!

Recovery of the file was done with 2 basic steps using
the Vim editor:

  1. Trim the file to its proper length
    by getting rid of extraneous lines at
    the beginning and end of the file
  2. Fix any corruption that has occurred
    in the middle of the file

Amazingly my deleted file was only slightly
corrupt. It had been corrupted by null characters.
That is to say, it appeared to have been overwritten
in one place by characters that are zero (absolute
zero) on the ASCII table. By zero, I mean all eight
bits of the character byte turned off.
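The same two steps can be sketched in plain shell on made-up
data. The real cleanup was done interactively in Vim, and the
file name and line range below are hypothetical:

```shell
# Made-up stand-in for the recovered grep output; the \0 plants
# a null (all-bits-zero) byte in the middle of a good line:
printf 'junk before\ngood line 1\ngood\0line 2\njunk after\n' > goofy

# Step 1: trim to the lines that actually belong to the file
# (here, lines 2 through 3).
# Step 2: strip any embedded null bytes.
sed -n '2,3p' goofy | tr -d '\000' > recovered

cat recovered
```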

Amazingly, only one line of my deleted file was
corrupt. It just so happened that I had not been
editing that line or any line before it. Therefore
I had a backup copy of that one line because I had
a backup copy of the deleted file with an earlier
timestamp.

In other words, I had not deleted all copies of
the file, just the version of the file that
represented the last 45 minutes of my work.

Recovering the one corrupted line was easy. I
simply relied on an earlier version of the file
which had never been deleted.

Was it worth it? It probably was not worth
saving the 45 minutes of work. I may have
spent close to 45 minutes of my time recovering
the file.

However, as an intellectual exercise it was
worth it. Now I know how to recover a deleted
file under Linux.

The big lesson, however, is don't be careless
when deleting files. Had I thought about it
15 more seconds, I would not have deleted the
file.

The life lesson is that rushing gets you nothing
or next to nothing.

Ed Abbott

Friday, July 15, 2011

Linux Version Information

I can't believe there is no general
command that tells you what version of Linux
you are running. I suppose this is because
Linux is based on Unix and when Unix started,
there was no other version of Unix. If there
is only one version of Unix in existence, there's
not all that much to find out about.

Now times have changed. There are many versions
of Unix out there. Or more to the point, there
are many versions of Linux out there.

Here's an article that describes where to get
Linux version Information for your particular
distribution:

Determining Linux Version

Here are 4 simple steps that should
give you Linux version information:

  1. cd /etc
  2. ls *release
  3. Observe the file
    name that suggests your
    version of Linux
  4. cat filename

I suppose that these steps are simple
enough that no one has bothered to
umbrella all of this under one command.

Ok. I just tried the above steps and
they did not work. It seems that the
word release is not as universal
as the above web page would suggest. It's
probably outdated.

I'll try the above steps with the
word version instead:

  1. cd /etc
  2. ls *version
  3. Observe the file
    name that suggests your
    version of Linux
  4. cat filename

Ok. For my Debian release, the
word version works. Here's
what my screen looks like:

myprompt:/etc$ ls *version
debian_version
myprompt:/etc$ cat debian_version 
5.0.8
myprompt:/etc$ 
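Both attempts can be rolled into a single one-liner that prints
whichever file name pattern happens to exist on your system:

```shell
# Print any release or version files that exist under /etc;
# errors from the pattern that matches nothing are discarded.
cat /etc/*version /etc/*release 2>/dev/null || true
```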

I'm using the word version
in a loose way here. I really mean
the release in combination with the
distribution. However, that's too
clumsy to say over and over again.

There's a lot of fiction in brevity.

Ed Abbott

Tuesday, January 18, 2011

Httrack Backs Up Your Website

There's a command that you can install
under Linux. It is called Httrack. It
will back up your website, or any other
website for that matter.

Httrack is called an offline browser.
It only backs up files that are available
to a web browser. For this reason, it is
not a great tool for formal website backups.
It's intended more for casual backups.

In many cases, though, Httrack can
be quite useful. Say, for example,
you've been asked to back up a website
by the copyright owner for that website.
For whatever reason, the copyright owner
is unable to get an FTP password
for the website. This can happen if the password is held by someone who is
hostile to the copyright owner.

In this case, the parts of the website
that require no special programming can
be backed up with Httrack. Many websites
are like this. They have no backend
programming that needs to be backed up.
These websites consist of nothing but
purely informational web pages.

The great advantage of Httrack is
its simplicity. Here's the command
to back up a simple website:

httrack http://www.mywebsite.com/

It's a wonderful wonderful thing to
have tools that scale to the size of
the problem. If you only want a very
very simple backup, why not use a very
very simple tool?

Of course, Httrack has many command
line options too. With these command
line options, you can extend the
capabilities of Httrack.

Ed Abbott

Tuesday, January 4, 2011

The Linux su Command

I have always assumed that the
su command stood for switch user.
After doing a little reading,
I find that the actual meaning is
substitute user. That's pretty
close to the same thing.

I've made many many false assumptions
regarding acronyms in the past. For years
I thought the pwd command stood
for print working directory. Then
I read it stands for present working directory.
Recently I read that it really is
print working directory. Go figure.

Maybe it is not so important what these
commands stand for as what they do.

The su command indeed lets you
switch from one user to another. Most
often, I find myself switching to the
root user.

Here are 4 different ways to switch to
the root user:

  1. su
  2. su -
  3. su root
  4. su - root

The first way is very simple. You
switch to root, but many variables,
which are exported into your new
shell environment, remain the same.

Here's an example of something that
will stay the same if I do as in the
first example, where I type the su
command without the hyphen:

---
me=ed
export me
echo $me
ed
su
Password:
echo $me
ed
---

Notice that I've set a
variable called me
to my first name, ed. Next,
I export the variable. I find
the variable still has my
first name in it after I've
logged in as root.

Let's try the same thing with
the su - command. This time, the
exported variable does not survive the switch:

---
me=ed
export me
echo $me
ed
su -
Password:
echo $me

---

See the difference? When I type
echo $me after su -, I end
up with an empty variable and an
empty line. The difference is that I
typed su with a hyphen in
one case and without a hyphen
in the other.

The hyphen basically says, behave
as if I logged in as root. The
absence of the hyphen says, I
wish to be root but I'd like to
retain as much of my old environment
as possible.
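You can see the same inheritance difference without switching
users at all. This is only an analogy (env -i is not what su
does internally), but it shows how a child shell behaves with
and without the parent's exported environment:

```shell
# Set and export a variable in the current shell:
me=ed
export me

# A child shell started normally inherits exported variables,
# much as plain "su" carries your old environment along:
sh -c 'echo "inherited: $me"'

# env -i starts the child with an empty environment, loosely
# simulating the clean slate that "su -" (a login shell) gives:
env -i sh -c 'echo "clean: $me"'
```

The first child prints the variable; the second prints nothing
after the colon, because its environment started out empty.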


The 3rd and 4th choices shown above
are the same except that the default
root user is explicitly stated. Of
course, you can explicitly state a
user other than root if you wish.

I switch user to root on average once
a day or so. It is something I frequently
need to do.

If you use Linux on a single-user computer,
like I do, it makes sense to switch user
to root often. There's no one here but
myself to administer my system so that's
the way it needs to be.

The lesson of the su command seems
to be to only give yourself as much authority
and power as you need to get the job done. To
overuse or over-reach your authority can lead
to unintended consequences, especially if you
are being careless.

That's the beauty of Unix. You need only give
yourself the authority you need at that moment
and no more.

One more thing I should mention about using
su with a hyphen. If you use the hyphen,
then the login shell for the user that you are
switching to is run. Running the login shell
mimics the user you've switched to almost perfectly.

I'd summarize it this way:

  • su without the hyphen mimics the
    permissions of the new user but retains
    much of the environment of the old user
  • su with the hyphen mimics the
    permissions and the environment of
    the user you have newly switched
    to as completely as possible

I suppose it is the difference between
fully immersing yourself in something
and retaining a little bit of your old
self.

In life, some tasks require very little
of you and so the tendency is to multi-task
and be more than one person in the same
time frame. Other tasks require your full
attention and your full immersion.

The command with the hyphen, su -,
is the full immersion version.

Ed Abbott