Thursday, March 6, 2014

Unraveling the Mysteries of the Linux Column Command

I'm trying to figure out the Linux column command this morning. It does not work the way I think it should.

Here are the options to the column command as given on the man page:

column [-ntx] [-c columns] [-s sep] [file ...]

One of the mysteries of the column command for me is that the -c switch does not seem to work. Perhaps this is because my Linux is out of date. I'm still on Debian Lenny and today is March 6, 2014. This man page for column seems to suggest that the -c option works:

Column Man Page From October 2010

I love the description of the -c option. Being able to format a table to a specific number of characters is exactly what I want to be able to do! Unfortunately, I've never ever been able to make it work. The -c option does not seem to alter my output in any way no matter what I try.

Fortunately both the -s option and the -t option seem to work just fine under Debian. These options make the formatting of a table on a specified delimiter automatic.

The -t option seems to be the column command's most powerful feature. With -t, each line of input is treated as a table row. Thus, each non-blank line of input becomes a new row in your table. For example, 80 lines of input would translate into an 80-line table.

The best part of the -t option? It figures out the width of the columns for you. In other words, it shrink wraps. In this respect it is much like the <table> tag of HTML. All tabular data is shrink wrapped to fit as needed.
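Here is a minimal sketch of -s and -t working together. The input rows are made up for illustration:

```shell
# -s names the input delimiter (a comma here); -t asks column
# to build a table, padding each column to its widest entry.
printf 'name,role\nalice,admin\nbob,user\n' | column -s, -t
```

Each input line becomes one table row, and the column widths shrink wrap to fit the data.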

Here's a web page that describes the -n option. This option is necessary to create empty cells in your table:

Why does the column command not align columns properly?

Without the -n option, multiple delimiters consolidate to become just one delimiter. Of course, in some cases, this is the desired behavior.

However, in a CSV file, two commas in a row often indicate an empty field. With the -n option present, the integrity of the empty field is respected and an empty table cell is created.

I suspect that over time, the behavior of the column command will become more and more standardized. The above link concerning aligning columns properly seems to suggest that the -n option is not universally available. It does not seem to be working under Slackware 9.0 (very old!).

If I could get the -c option to work, I'd be in heaven!

Ed Abbott

Friday, July 5, 2013

When Did I Last Logout?

I sometimes wish to know when I last logged off my Linux laptop. This command lets me know my log in and log out history:

last | more

The last command is all about the last time you logged in. Since its output is in reverse chronological order, it helps to pipe it to the more command so that the entries you want do not scroll off the screen before you can read them.

The last command is one of many commands that makes Linux so convenient once you learn how to use it.

Ed Abbott

Tuesday, November 1, 2011

The Cure For Argument List Too Long

This is one of those things that drives me
crazy. Why does Linux have an artificial limit
on the size of an argument list? So many things
in Linux are dynamic. Why not shell argument
lists too?

Here's someone who proposes 4 different ways to
solve this problem:

Argument List Too Long

His last solution, solution #4, puts the whole
problem in a nutshell. Apparently this is a
limitation that is defined when you compile your
linux kernel.

Since most of us use precompiled kernels, we're stuck
unless we wish to recompile the kernel.

The variable that controls this is really a C
Language constant (not a true variable). It
is defined for the C Language preprocessor like
this:

#define MAX_ARG_PAGES 32

Perhaps there is a philosophical reason for keeping
things the way they are. Perhaps long command lines
that cause you to over-run the maximum allotment for
an argument list are, in themselves, bad ideas.

An overly long command line could well be a bad idea.

For this reason, I feel that the author of the above
article is probably right. Even though using the
find command takes a long time, it does, by its very
nature, allow you to process an unlimited number of files.

At least, I assume it does. Since the find command
finds one file at a time and then spawns one
process at a time to handle that file, it
minimizes the footprint (memory used) at the cost
of speed (the find command will take a long, long
time to do what it does).
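The limit itself is easy to inspect with getconf, and the find-based workaround can be sketched generically (the directory name and the *.log pattern here are hypothetical):

```shell
# Print the current argument-list limit in bytes. On modern
# kernels this is derived from the stack size rather than the
# old MAX_ARG_PAGES constant.
getconf ARG_MAX

# Rather than "rm *.log", which expands every name onto a single
# command line, let find hand the names to rm in batches:
#
#   find /some/dir -name '*.log' -exec rm {} +
#
# The "+" terminator groups many files into each rm invocation,
# so no single command line ever exceeds the limit.
```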

Perhaps things are the way they are because someone has
put a lot of thought into it and has decided not to change
things. This is often the case.

Ed Abbott

Monday, October 10, 2011

The Linux Undelete Command

The title for this article is somewhat
of a misnomer. There is no specific
command that will undelete a file.

This evening, I successfully
undeleted a file. Here's the web
page that helped me to do this:

Linux or UNIX Recover deleted files – undelete files

I followed these 2 basic steps:

  1. Remember a unique string within
    the deleted file so that you can
    undelete it without confusing it
    with other files
  2. Grep the entire device searching
    for the pattern you think uniquely
    identifies the file you wish to
    undelete

I started with these 2 simple steps
because I wanted to see if the technique
mentioned in the above link was going to
undelete my file.

Here's the command I tried:

grep "Enter the smaller of line 7 or line 9" /dev/hda2 >goofy

I used the df command to come up
with the name of the device to grep on:

df

Doing a grep on a device like this fascinates
me. What basis is there for treating an entire
device (partition) as a file? Does this mean
that all the blocks on that device can, in
effect, be seen by the grep command? It would
seem so.

I watched TV and ate supper while this command
ran. It takes a long, long time! Just realize
that when you are doing this, you are grepping
your entire hard drive partition. For me this
is many gigabytes of data.

I chose the name goofy as my output file,
probably because I was not sure whether or not
this was going to work.

One of the things that threw me at first is
choosing the pattern to search on. I made
the rather silly assumption that the pattern
was the name of the file.

Of course, that's precisely what got erased
when I mistakenly typed the following command:

rm MyFileName

For whatever reason, I had the silly notion that
I should be searching on MyFileName rather
than the contents of MyFileName. Big
difference!

This silly notion was more or less a fleeting
thought. I only mention it because since I had this
silly notion, other people might have the same
silly notion.

Once my thinking became clear and I realized that
I needed to remember a pattern that was inside the
file, I was well on my way.

Fortunately the file I wished to undelete was an
sc spreadsheet file. These files are quite
primitive and are purely ASCII text. The contents
of these files are easily manipulated by any text
editor.

I'd been working on a tax document when I foolishly
erased the spreadsheet file. I was copying a
Qualified Dividends and Capital Gain Tax Worksheet
for 2010 into my spreadsheet.

I use sc spreadsheet calculator for this purpose
because it is so simple and basic. If you are familiar
with the vi editor, it is very similar to vi
in how you operate it. All the commands are keystrokes
and your hands never have to leave the keyboard to fully
navigate and manipulate your spreadsheet.

The sc spreadsheet calculator is a great tool if
you are doing something primitive and simple but is a very
poor tool if you need to present your data to other people.
It is a poor tool for presenting data because it is
totally lacking in the things that make a spreadsheet
presentational, such as different sized fonts.

I write about sc here:

sc Issues and Answers

I did eventually get all my data back. I used the following command
to do so:

grep -C 3000 "Enter the smaller of line 7 or line 9" /dev/hda2 >goofy

As you can see, this is pretty much the same grep command
that I ran on my hard drive partition called /dev/hda2 earlier
in the evening. The key difference is that now I asked for 3,000
lines of context.

Running the command twice turned out to be a good choice. The first time
I ran it, I wanted to see how many lines had the pattern I was looking for.
If it had been ten or more lines, I might have tried a different pattern.

It turns out that only 5 lines had this pattern. Since the sc
spreadsheet calculator effectively has line numbers (to identify
the cells), I could see that the line number was changing as my
spreadsheet grew.

Each time I saved the file, it ended up as an artifact on my hard drive
partition. In theory, the 5 times that the pattern appears on a line
could be the 5 times that I saved the file as I worked on it.

This leads me to think that these basic characteristics would make a
pattern a good search pattern:

  1. A good pattern is a unique pattern
  2. If the pattern is not unique, it
    will hopefully be mostly unique to the file
    you inadvertently erased
  3. If you have edited the file and saved it
    to your hard drive many times, the pattern
    will hopefully have been introduced into the
    file later rather than earlier
  4. A pattern that has been saved many times
    (by saving the file) could possibly show up
    too many times on your hard drive

I'm not sure how accurate my ideas are on pattern
selection. However, I'm led to believe that
they are at least a little bit accurate, as the
pattern I chose showed up on more than one
line number.

In one case it was line 587 and in another case it
was line 589. I don't remember precisely. These
are hypothetical line numbers, but I do remember
that the line numbers were very close to each
other.

This suggests to me that the line numbers grew as
the spreadsheet grew. Knowing which line number
to focus on was easy. The line number that was
the biggest had to be the one that was most recent.

After discovering that the pattern I had chosen
appeared on my hard drive partition 5 times, I
next tried the same grep command with
3,000 lines of context.

My choice of 3,000 lines of context turned out
to be a bit much. I had assumed that grep would
pare the number of lines down to the maximum number
of lines contained in the deleted file. This turned
out to be another mistaken notion.

The mistake I now think I made is in believing that
Linux stores deleted files with a beginning, a middle,
and an end. From the results I got, I'd say that this
is not the case.

It appears that the only limit on the start and end
of the file is the partition itself. In other words,
the entire partition is effectively one big file. That
is to say, the partition is one singular file that goes
on and on.

Therefore, if you ask for 3,000 lines like I did, you
get 3,000 lines.

I now understand why the above webpage so carefully
defines the context by using a grep -A -B.
Using these two options, you can precisely define the
start and end of a deleted text file if you happen to
know how long the file is.

Had I known better, I would have defined the length of
the deleted file more precisely with grep -A -B.
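The -A/-B idea is easy to rehearse on an ordinary file before pointing it at a raw device. The pattern and line counts below are made up for illustration:

```shell
# Build a 20-line file with a known marker on line 10, then pull
# the marker plus 2 lines before it and 4 lines after it.
seq 1 20 | sed 's/^10$/UNIQUE-PATTERN/' > sample.txt
grep -B 2 -A 4 'UNIQUE-PATTERN' sample.txt
# prints 7 lines: 8, 9, the marker, then 11 through 14
rm sample.txt
```

If you happen to know roughly how many lines precede and follow your pattern in the deleted file, -B and -A carve out just that window.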

I was able to recover my deleted file completely. The
operation was a success!

Recovery of the file was done with 2 basic steps using
the Vim editor:

  1. Trim the file to its proper length
    by getting rid of extraneous lines at
    the beginning and end of the file
  2. Fix any corruption that has occurred
    in the middle of the file

Amazingly my deleted file was only slightly
corrupt. It had been corrupted by null characters.
That is to say, it appeared to have been overwritten
in one place by characters that are zero (absolute
zero) on the ASCII table. By zero, I mean all eight
bits of the character byte turned off.

Amazingly, only one line of my deleted file was
corrupt. It just so happened that I had not been
editing that line or any line before it. Therefore
I had a backup copy of that one line because I had
a backup copy of the deleted file with an earlier
timestamp.

In other words, I had not deleted all copies of
the file, just the version of the file that
represented the last 45 minutes of my work.

Recovering the one corrupted line was easy. I
simply relied on an earlier version of the file
which had never been deleted.

Was it worth it? It probably was not worth
saving the 45 minutes of work. I may have
spent close to 45 minutes of my time recovering
the file.

However, as an intellectual exercise it was
worth it. Now I know how to recover a deleted
file under Linux.

The big lesson, however, is don't be careless
when deleting files. Had I thought about it
15 more seconds, I would not have deleted the
file.

The life lesson is that rushing gets you nothing
or next to nothing.

Ed Abbott

Friday, July 15, 2011

Linux Version Information

I can't believe there is no general
command that tells you what version of Linux
you are running. I suppose this is because
Linux is based on Unix and when Unix started,
there was no other version of Unix. If there
is only one version of Unix in existence, there's
not all that much to find out about.

Now times have changed. There are many versions
of Unix out there. Or more to the point, there
are many versions of Linux out there.

Here's an article that describes where to get
Linux version information for your particular
distribution:

Determining Linux Version

Here are 4 simple steps that should
give you Linux version information:

  1. cd /etc
  2. ls *release
  3. Observe the file
    name that suggests your
    version of Linux
  4. cat filename

I suppose that these steps are simple
enough that no one has bothered to
umbrella all of this under one command.

Ok. I just tried the above steps and
they did not work. It seems that the
word release is not as universal
as the above web page would suggest. It's
probably outdated.

I'll try the above steps with the
word version instead:

  1. cd /etc
  2. ls *version
  3. Observe the file
    name that suggests your
    version of Linux
  4. cat filename

Ok. For my Debian release, the
word version works. Here's
what my screen looks like:

myprompt:/etc$ ls *version
debian_version
myprompt:/etc$ cat debian_version 
5.0.8
myprompt:/etc$ 
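Here's a sketch that tries both naming conventions at once. On newer distributions, /etc/os-release is the standardized place to look, so I check it first (that file did not exist yet on older systems like my Lenny box):

```shell
# Print whichever version/release files this system happens
# to have, labeling each one.
for f in /etc/os-release /etc/*version /etc/*release; do
    if [ -f "$f" ]; then
        echo "== $f"
        cat "$f"
    fi
done
```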

I'm using the word version
in a loose way here. I really mean
the release in combination with the
distribution. However, that's too
clumsy to say over and over again.

There's a lot of fiction in brevity.

Ed Abbott

Tuesday, January 18, 2011

Httrack Backs Up Your Website

There's a command that you can install
under Linux. It is called Httrack. It
will backup your website, or any other
website for that matter.

Httrack is called an offline browser.
It only backs up files that are available
to a web browser. For this reason, it is
not a great tool for formal website backups.
It's intended more for casual backups.

In many cases, though, Httrack can
be quite useful. Say, for example,
you've been asked to backup a website
by the copyright owner for that website.
For whatever reason, the copyright owner
is unable to get an FTP password
for the website. This can happen if the
password is held by someone who is
hostile to the copyright owner.

In this case, the parts of the website
that require no special programming can
be backed up with Httrack. Many websites
are like this. They have no backend
programming that needs to be backed up.
These websites consist of nothing but
purely informational web pages.

The great advantage of Httrack is
its simplicity. Here's the command
to back up a simple website:

httrack http://www.mywebsite.com/

It's a wonderful wonderful thing to
have tools that scale to the size of
the problem. If you only want a very
very simple backup, why not use a very
very simple tool?

Of course, Httrack has many command
line options too. With these command
line options, you can extend the
capabilities of Httrack.

Ed Abbott

Tuesday, January 4, 2011

The Linux su Command

I have always assumed that the
su command stood for switch
user. After doing a little reading,
I find that the actual meaning is
substitute user. That's pretty
close to the same thing.

I've made many many false assumptions
regarding acronyms in the past. For years
I thought the pwd command stood
for print working directory. Then
I read it stands for present working directory.
Recently I read that it really is
print working directory. Go figure.

Maybe it is not so important what these
commands stand for as what they do.

The su command indeed lets you
switch from one user to another. Most
often, I find myself switching to the
root user.

Here are 4 different ways to switch to
the root user:

  1. su
  2. su -
  3. su root
  4. su - root

The first way is very simple. You
switch to root, but many variables,
which are exported into your new
shell environment, remain the same.
Here's an example of something that
will stay the same if I do as in the
first example where I type the su
command without the hyphen:

---
me=ed
export me
echo $me
ed
su
Password:
echo $me
ed
---

Notice that I've set a
variable called me
to my first name, ed. Next,
I export the variable. I find
the variable still has my
first name in it after I've
logged in as root.

Let's try the same thing with
the su - command. This
time, the variable does not survive:

---
me=ed
export me
echo $me
ed
su -
Password:
echo $me

---

See the difference? If I type
echo $me, I end up with
an empty variable and an empty
line. The difference is that I
typed su with a hyphen in
one case and without a hyphen
in the other.

The hyphen basically says, behave
as if I logged in as root. The
absence of the hyphen says, I
wish to be root but I'd like to
retain as much of my old environment
as possible.


The 3rd and 4th choices shown above
are the same except that the default
root user is explicitly stated. Of
course, you can explicitly state a
user other than root if you wish.

I switch user to root on average once
a day or so. It is something I frequently
need to do.

If you use Linux on a single-user computer,
like I do, it makes sense to switch user
to root often. There's no one here but
myself to administer my system so that's
the way it needs to be.

The lesson of the su command seems
to be to only give yourself as much authority
and power as you need to get the job done. To
overuse or over-reach your authority can lead
to unintended consequences, especially if you
are being careless.

That's the beauty of Unix. You need only give
yourself the authority you need at that moment
and no more.

One more thing I should mention about using
su with a hyphen. If you use the hyphen,
then the login shell for the user that you are
switching to is run. Running the login shell
mimics the user you've switched to almost perfectly.

I'd summarize it this way:

  • su without the hyphen mimics the
    permissions of the new user but retains
    much of the environment of the old user
  • su with the hyphen mimics the
    permissions and the environment of
    the user you have newly switched
    to as completely as possible
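The environment half of that distinction can be demonstrated without root at all, by comparing a child shell that inherits the environment with one started through env -i, which scrubs it. This is only an analogy to su versus su -, not su itself:

```shell
me=ed
export me

# A plain child shell inherits exported variables,
# much like plain su:
sh -c 'echo "inherited: $me"'
# prints: inherited: ed

# env -i starts the child with an empty environment,
# much like su -:
env -i sh -c 'echo "scrubbed: $me"'
# prints just "scrubbed:" with nothing after it
```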

I suppose it is the difference between
fully immersing yourself in something
and retaining a little bit of your old
self.

In life, some tasks require very little
of you and so the tendency is to multi-task
and be more than one person in the same
time frame. Other tasks require your full
attention and your full immersion.

The command with the hyphen, su -,
is the full immersion version.

Ed Abbott

Friday, October 29, 2010

Finding Files On Your Hard Drive
With the Linux Find Command

 
I love the Linux find command. The
find command is used to find files.

Here are some of my favorite things about
the find command:

  1. You can use it to find a file by name
  2. You can use wildcards with it
  3. It automatically descends into folders
    underneath the current folder
  4. It prints out the path to every file
    it finds

The ability of the find command to descend
into directories (folders) is known as recursive
descent. Each layer of directories found under
the current layer of directories is another layer
of recursion. Recursive descent is a well-known
computer algorithm used by many programmers.

Basically, the find command consists of
4 parts:

  1. The name find
  2. Where to start looking
  3. What files to look for
  4. What to do when you find files

Here's an example:

find . -name abc -print

Here's the 4 parts of the above
find command:

  1. The name of the command is find
  2. Dot is the name of the current directory
  3. We are looking for a file called abc
  4. Once a file called abc is found, the
    find command will print the path to it

Here's what happens in plain English:

We start looking in the current directory
(dot or period) for a file called abc.
We will uncover all possible sub-directories
of the current directory. Any files found
that are called abc will be printed.

Here's the only thing that is tricky about
the find command: It shares wildcards with
the shell. This can be trickier than it
sounds. Let's say I wish to find a file
that starts with the letters abc.
I might type the following command:

find . -name abc* -print

This will probably work. As long as
there are no files in the current
directory that start with an abc,
all will be well.

However, let's say we have a file called
abcdef in the current directory.
We are now in trouble. We are in trouble
because the shell is going to do file-name
expansion prior to executing
the find command.

Here's what we type:
find . -name abc* -print

Here's how the shell interprets what we
typed:

find . -name abcdef -print

Do you see the problem? The find command
never sees the asterisk. By the time find
runs, the shell has already expanded abc*
to abcdef. Big difference!

Of course, there is a way around this and that
is to remove the special meaning of the asterisk
with a backslash. Here's what this would look like:

find . -name abc\* -print

In actual practice, though, using
backslashes on a command line is very clumsy. Most
people use double quotes instead. Here's what double
quotes look like:

find . -name "abc*" -print

The double quotes stop the shell from
expanding the asterisk and any other
filename expansion characters. Now we
can rest easy and know that our filename
expansion characters will reach the
find command untouched.
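Here's the pitfall end to end in a scratch directory, so the expansion is reproducible:

```shell
mkdir scratch && cd scratch
mkdir sub
touch abcdef sub/abc sub/abcxyz

# Unquoted: the shell expands abc* to abcdef before find runs,
# so find searches every directory for the literal name abcdef.
find . -name abc* -print
# finds only ./abcdef

# Quoted: find itself receives abc* and matches everywhere.
find . -name "abc*" -print
# finds ./abcdef, ./sub/abc, and ./sub/abcxyz
```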

The double quotes are a wonderful habit to get
into. Basically, you can use the double quotes
regardless of whether you are using filename
expansion characters or not. Let's say, for
example, we are looking for a file called
abc.

Here's how I might apply the double quotes:

find . -name "abc" -print

In this case, the double quotes do not
matter. Since there are no filename
expansion characters, the double
quotes serve no purpose.

Here's why I use double quotes anyway:

If you always use double quotes, you
never need to rethink the find command.
It just works no matter what. Rather
than think whether double quotes are needed,
just use them. They don't cost anything
other than 2 keystrokes.

This is more valuable than it might
appear. When you are in the heat of
battle and you are trying to solve
a problem, considering whether or not
to use double quotes is a mental
distraction.

Rather than suffer the distraction, just
use the double quotes. It's not hard to
figure out whether or not you need double
quotes, but why think about it at all?

Ed Abbott

Tuesday, October 12, 2010

The diff Command Under Linux

 
The diff command under Linux
is one of my favorite Unix commands.
It's an ancient command that's still very
useful in a modern world. I call any command
that dates back to the 1970's ancient.

I was recently asked by someone over the
phone how to find the changes made to a
website by a web developer. How do you
find their changes if you have a complete
copy of the website before the changes and
a complete copy of the website after the
changes?

I told him that you need 3 things to do
this:

  1. A complete copy of the website
    before and after
  2. The Unix ls -lt command
  3. The diff command

Start by looking at the complete copy
of the new website. Start in the topmost
directory (folder) of the website and do
this command:

ls -lt

This will give you a list of both files
and directories sorted in timestamp
order. Directories recently modified
need to be investigated further. Files
recently modified need to be noted.

In any case, both files and directories
of recent vintage will rise to the top
of the ls -lt listing.
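The timestamp sort is easy to see with two files stamped a few months apart (the filenames are made up):

```shell
mkdir site && cd site
touch -t 202001010000 index.html       # stamped January 2020
touch -t 202006010000 contactus.html   # stamped June 2020

ls -lt
# contactus.html is listed first: newest files rise to the top
```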

Using this list, you can easily find things
that have been modified after the web developer
(who made changes) took over.

If a file, make a note. If a directory, look
further.

Keep looking into directories that have been
modified since the new web developer started
working on the site. Once you've found all
the files that have been modified after a certain
date, you are done with ls -lt.

This will take less time than it might seem as
web developers typically only modify a few files
on each occasion that they work on a site. For
example, if the web developer only worked on the
Contact Us page, this may be the only file
that was modified. This being the case, you will
find the file relatively quickly.

Next, use the diff command to figure out
what changed on the Contact Us page.

Here's how you might use the diff command
hypothetically:

diff ../old/contactus.html ../new/contactus.html >temp

I've fictionalized the directories where the
old and new Contact Us pages would be
found. Undoubtedly, you will have to do a bit
more typing than I did in my hypothetical example
above to get a diff on the two files.

Notice that I've placed the difference between
the two files in a file called temp. This
is a temporary file that has all the changes.

If the changes are not too extensive, the file
called temp will be quite short. It could
be something as simple as a new phone number or
a new business address.
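A self-contained sketch of the comparison, with one-line stand-ins for the old and new pages:

```shell
mkdir old new
printf 'Call us at 555-1111\n' > old/contactus.html
printf 'Call us at 555-2222\n' > new/contactus.html

# diff exits nonzero when the files differ, so the || true
# keeps scripts running under "set -e" from stopping here.
diff old/contactus.html new/contactus.html > temp || true

cat temp
# temp holds just the changed line, marked with < (old) and > (new)
```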

A Contact Us page consists of contact
information so the changes to it would not
necessarily be anything more than a slight
update.

How long would it take me to find all of this
out? Discounting the time it takes me to obtain
copies of the two websites, I'd say maybe
5 minutes.

Here's the steps I would take in that 5 minutes:

  1. Find the most recent timestamp in the old
    copy of the website. In other words, do a
    ls -lt on the old topmost directory. Be
    sure to discount things like server logs and
    other things that are automatically updated
  2. Use the timestamp discovered at the old site
    to determine what is new at the new site
  3. Do the steps given above to discover what
    files are newer than the timestamp discovered
    on the old copy of the website

That, in a nutshell, would be how I would discover
work done recently by a web developer. Here
are some basic principles that are at work here:

  1. In life you generally need a reference point
    if you are to get anywhere. In this case the
    reference is the file last worked on on the old
    copy of the website
  2. In life, it is helpful to know how far you've
    come since you last referenced where you were. The
    technique of using ls -lt to progressively
    descend directories looking for recent file changes
    to the new copy of the website does this. It tells
    you how things have progressed since the last
    checkpoint
  3. It helps to have a basis of comparison. The
    diff command gives you a wonderful way to
    compare two files looking for changes

Because of their primitive nature, I don't know
of anything that supersedes the old Unix command-line
commands. I've never ever discovered anything that
is quite like them in flexibility, scope, and power.

Of course, it takes a little bit of creativity to
combine and use these commands effectively. If there
is a downside, that would be it. You cannot be half
asleep and use Unix commands effectively. You have
to be a person who does not mind exercising a little
creativity. If you enjoy being creative, Unix command-line
commands may be for you.

One more thing: I've oversimplified things somewhat
to help you find files on a small website. On a
large website you may have problems with the
techniques outlined above.

While the parent directory of a file will reflect the
most recently modified file in the parent directory, the
grandparent directory will not necessarily reflect this
same timestamp. In other words, the topmost directory
of a website does not necessarily reflect the most recently
modified file on the website. That's one problem.

Another problem is if you have to go digging through many
many files and directories. If this is the case, the
technique outlined above could prove difficult or impossible.

In the case of added complexity, you may wish to change
technique somewhat. The following web page describes how
to use the find command to find a file of recent
vintage:

Using the -newer option of the find command

Even though the find command is more efficient
if you absolutely need to know the most recently modified
files on a website, the ls -lt command is still
useful. The ls -lt command is a much quicker and simpler
way to survey the general situation and get a general
take on how recently the website has been modified.

There's another principle at work here and that's scaling
your solutions to your problems. In life you generally
don't want to use a large-scale solution to solve a small
problem. You typically would not use a backhoe to dig
a hole for a fence post.

So, depending on the scale of what you are looking at,
either find -newer or ls -lt may be the
ideal tool to pick up and use in your situation.

Ed Abbott

Sunday, September 26, 2010

Feeding Standard Input
to The bc Command

 
I'm always learning something
new.

For years I've used the
bc command in interactive
mode. bc stands for
basic calculator.

Here's a nice blog post that
introduces bc:

Unix Basic Calculator

Interactive mode is fine. That's
where you sit there and type things
like this:

1 + 1 + 1

After you hit enter, you
get this answer back:

3

That's great if you are doing
a simple calculation. But what
about lots and lots of numbers?
How do you deal with them?

This morning, I was working with
the vim editor. I was pulling
some numbers out of a web page. I
wanted to add all these numbers
together like this:

24 + 7 + 9 + 27

It was a lot more numbers than
I'm showing you here. There were
about 52 numbers embedded in a web
page that I wanted to add together.
That's way too many numbers to add
together interactively. What if I
made a typo?

Also, why should I have to type the
numbers at all? That's what vim
is for, right? In this case, I used vim
to format the numbers for me and then
put plus signs (+) in between each number.

After I was finished editing the web page
with vim, I was left with a single line
that consisted of nothing but numbers separated
by plus signs (+).

Vim is my all-time favorite programmer's
editor:

My Favorite vim Commands

With vim, I used regular expressions
and vim commands to pare my saved
web page down to just 52 numbers on one line
with a plus sign (+) separating each number.

The next step? I was wondering about that.
I figured there must be a way to use bc
in batch mode. By batch mode, I mean
a way to get bc to run a bunch of commands
that are stored in a text file.

Turns out there is a way. I discovered
it when I read the bc man page. It's
simple. Just put the math operations in
a text file and feed these numbers to bc.

Here's what I typed:

bc <numbers.txt

It worked like a charm! My 52 numbers
were all added together. This is a very
powerful feature that I'm sure I'll be
making great use of in the future.

The simplicity of this approach is that
you feed your batch file into bc
via standard input and you get your
answer on standard output. In other
words, bc acts as a Unix filter.

Here's the input and output together:

bc <numbers.txt
256


I love Linux for this reason. So many
primitive commands that you can do such
great things with. Linux is a great time-saver.
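Here's the whole round trip as a
self-contained sketch. The file name
sums.txt and the four numbers are
made up for the example:

```shell
# Build a batch file of bc operations,
# then feed it to bc on standard input.
printf '24 + 7 + 9 + 27\n' > sums.txt
bc < sums.txt
# prints 67
```

Since the answer arrives on standard
output, it can be piped onward to
other commands like any Unix filter.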

Ed Abbott

Saturday, May 22, 2010

The Unix-Linux Identify Command

 
Another great command discovered!

The identify command, part of the
ImageMagick suite, enables me to
discover the width and height
of an image without having to go
into an image viewer.

This is very handy if, say, I'm in
the Vim editor and I need to know
the dimensions of an image without
leaving the editor.

Here's what I might type in Vim:

:r !identify goofy.jpg

The above command pulls into Vim
the width and height of the
image called goofy.jpg.
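The same command works outside of Vim,
of course. This sketch assumes ImageMagick
is installed; its -format option can trim
the report down to just the dimensions
(goofy.jpg is a stand-in file name):

```shell
# Print only the width and height of the image,
# using ImageMagick's identify -format escapes.
identify -format '%wx%h\n' goofy.jpg
```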

This is another example of how a
primitive command can be put to
great use when combined with other
tools.

The old cliche is a cliche because
it is so true: the whole is greater
than the sum of its parts.

Ed Abbott

Monday, March 22, 2010

Spell Check Web Pages Easily With Ispell

 
I use a command-line utility
called ispell to spell
check my text files under Linux.

Here's an example of what I mean:

ispell readme.txt

The above command will go through
the readme.txt file looking
for misspelled words. It will
offer suggestions as to how the
word might be spelled correctly.
Should you choose to accept the
suggested corrected spelling, it
will correct the spelling of the
word for you with a single keypress.

How about web pages that you are
working on locally on your hard
drive? How do you spell check
these?

Of course, you could upload the web
page to the web server and then
spell check it on the web. That's
one way of doing things.

However, what if you want to spellcheck
locally? What if you'd like to spellcheck
your text that is marked up with HTML
without having to go to the web?

Here's a sample command that demonstrates
what I do in this case:

ispell -h index.html

In the above example, index.html
is on my hard drive in the current
directory. I add the -h option
to let ispell know that I want
it to ignore HTML markup and only
spellcheck the body of text itself.

This is very very handy if you work in
a simple text editor but wish to
spellcheck without having to go to
the web to find a spellcheck application.
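If you want a non-interactive check,
classic ispell also has a -l mode that
reads standard input and simply lists
the misspelled words. A small sketch:

```shell
# List misspellings without entering
# the interactive correction screen.
echo 'teh quick brown fox' | ispell -l
```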

A favorite feature of mine is ispell's
ability to respond on one keypress. One
keypress gets you many things.

One of my favorite keypresses is the
letter i. The letter i allows you
to add a word to your own personal
dictionary. Here's where your personal
dictionary is stored in a hidden
file under your home directory:

~/.ispell_default

Once a word has been stored in the
.ispell_default file under your
home directory, it becomes a regular
word that ispell now considers to
be correctly spelled. It will even
suggest that word from time to time
should you come up with a misspelling
that is an approximation of the correct
spelling.

Words from ispell's built-in dictionary
and words from your personal dictionary
are both first-class citizens. Words
from both sources are likely to be suggested
as possible correct spellings.

How does ispell suggest words? It's
one single keypress all over again.
ispell might suggest 10 different
spellings. The suggestions will appear
as keypresses 0 through 9.

Let's say ispell suggests 36 different
spellings. In that case, the choices
will range from 00 through 35. Thus
two keypresses will be required to make
a choice.

Normally, though, only one keypress is
needed. Ten suggestions or less is typical.
More than ten suggestions is the exception.

Ed Abbott

Monday, March 1, 2010

Linux cp Command

 
One of my favorite commands under
Linux is the cp command. In
its most primitive form, you use
it to copy a file like this:

cp strawberry raspberry

With the cp command, you
don't make extra work
for yourself. Instead
of recreating a file, you
copy it.

In the above example, I copied
strawberry to raspberry.
After I've done this, I should have
two identical files.

Note that the copy command is not
limited to files. Here's how you
copy a directory and all its
contents:

cp -R banana apple

In the above example, banana
is the original directory. The
new directory is apple.

Copy always goes in this direction:

cp old new

You always copy left to right. Another
way of saying this is that you always
copy an old file to a new file, the
old file being on the left, the new
file being on the right.

Back to cp -R. Here's the example
I gave above:

cp -R banana apple

In this example, I recursively copy
a directory called banana into a new
directory called apple. I started
with banana but I ended up with both
banana and apple.

Note that it is not just the directory that
is copied. It is also all the contents of
the directory, including other files and
directories to any level of depth.

If you are used to using the term folder,
think of a directory as a folder. Folders
and directories are the same thing.

I use the term directory because the
cp command is used on the command line.
On the command line, folders are directories.

Here's one more example of the cp command:

cp -Rp pear peach

In this example, the directory pear is
being copied to the new directory called peach.
However, there is an additional nuance here. The
directory peach may be new but its timestamp
is old. That's because the -p option asks
copy to preserve both permissions and timestamps.

Had we left off the -p option, peach
would have a different timestamp. The timestamp
would be the moment peach came into existence.

Also, peach could potentially have a different
owner as well. Without the -p option, ownership
defaults to the person who typed and executed the
cp command. With -p, cp tries to preserve the
original ownership too, which generally requires
running as root.

Again, the -p option preserves both permissions
and timestamps.
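Here's the -Rp example as a
self-contained sketch. The directory
names and the file inside are
made up for the demonstration:

```shell
# Create a directory with a file in it,
# then copy the whole tree with cp -Rp.
mkdir -p pear
echo 'hello' > pear/note.txt
cp -Rp pear peach
# peach/note.txt now exists with the same
# contents, permissions and timestamp.
ls -l peach/note.txt
```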

Knowing the cp command can save you
much time and energy. This is especially
true if you know the many different ways in
which it can be used.

Ed Abbott

Thursday, January 14, 2010

The Unix ls Command Sorted by Date

In the past, I've written
about using the sort
command
to sort long
listings by date.

There's an easier way, I've
just learned.

The ls command has a
-t option. You
can use this to sort files by
timestamp.

It looks like this:

ls -t

If you wish to confirm that
the files are really being
sorted by date, you could
turn it into a long listing:

ls -lt

A long listing will not tell
you whether files that are more
than 6 months old are in
precise date order, because for
older files it shows the date
without the minute and second.

Here's yet another way to do
it:

ls -t --full-time

The above command will give a
long listing, but with the
modification time in hours,
minutes and seconds included.

Of course, it is the -t that
sorts the listing in timestamp
order.
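A quick way to see -t in action,
using two made-up file names:

```shell
# Create two files a moment apart;
# ls -t lists the newest one first.
touch older.txt
sleep 1
touch newer.txt
ls -t older.txt newer.txt
# newer.txt comes out on top
```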

I believe that the --full-time
option is of recent vintage. I don't
think we ever had it in the old days.

In fact, the only ls that I know
of that has a --full-time option
is GNU ls.

However, since I use Linux currently,
this is not a problem for me. I suspect
that the ls command under Linux is,
in most cases, GNU ls.

What's the lesson in all of this?
Orderliness and Godliness are related.
Commonplace things take on a Godly nature
when they are put into good order.

Just the other day, I used ls -t
on a directory of sub-directories. I was
trying to find the directory I had most
recently worked on. I could not find it.

I was so confused. What could possibly be
wrong? I thought maybe the ls -t was
broken.

Turned out I was logged into Linux as
another user and had forgotten that fact.
I was in an account other than my
normal user account.

As it turns out, both user accounts have directory
trees that mirror each other in directory names but
not in file content. To all appearances, then, the
directory and its sub-directories looked exactly
like the ones I meant to be working in.

The confusion I felt over the ls -t not
working cleared up later when I realized I was in
the wrong file hierarchy entirely.

This is what I mean by orderliness and Godliness being
related. A clear mental vision and a clear spiritual
vision often are the result of living an orderly life.

While the ls -t cannot order my entire life,
it can take the chaos out of a small corner of it.

Ed Abbott

Wednesday, January 6, 2010

The Unix du Command Under Linux

One of my favorite Linux commands
is the du command.

Here's an article:

The du Command

Why do I like the
du command?
Because it tells me
if I'm being a disk
hog. That's why.

Also, it helps me
to identify disk hog
directories.

As mentioned in the article,
there's a command that will
give you a summary on a specific
directory. Here it is:

du -hs

Note that you have to have permissions
on all the directories below this current
directory or you end up getting a lot of
goofy error messages.

Perhaps the quickest and easiest way to
get around these error messages is
to log in as root.

Note that the above command will only
give you one line of output.

Here's an input and output example:

$ du -hs mary
363M mary
$

Note that the final dollar sign is
my prompt coming back.

In this case, I'm being told that
the directory called mary
has used up 363 megabytes of
storage.
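Here's a self-contained version of
that input and output. The directory
name and file are invented for the
demonstration:

```shell
# Make a directory, put a small file in it,
# then ask du for a one-line summary.
mkdir -p mary_demo
head -c 1024 /dev/zero > mary_demo/file.bin
du -hs mary_demo
```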

Ed Abbott

Sunday, January 3, 2010

The Unix df Command Under Linux

One of the commands I use the
most frequently under Linux is
the df command.

df stands for disk
free. Basically, the
df command tells you
how much of your hard drive
has been used up.

This is very useful.

With df, you know whether
you have 50 percent of your hard
drive left for additional storage
or only 10 percent left.

Big difference.

Again, df means disk
free. This is exactly what
df tells you: how much
of the hard drive is free for
you to add other things to it.

Here's how to use the df
command:


df

Simple, isn't it?

Here's some input and output:

$ df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda3            123079960  83666812  33161052  72% /
tmpfs                   518136         0    518136   0% /lib/init/rw
udev                     10240       672      9568   7% /dev
tmpfs                   518136         0    518136   0% /dev/shm
$

The final dollar sign is my prompt
coming back.

You can make the df command
quite a bit more useful by adding
the -h option to it.

The -h means human
readable. With -h, df
reports usage in megabytes,
gigabytes, and other units of
measure that are easy to digest.

Here's what df looks like with
-h present:

df -h

Here's a sample input and output for
df with the -h present:

$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda3             118G   80G   32G  72% /
tmpfs                 506M     0  506M   0% /lib/init/rw
udev                   10M  672K  9.4M   7% /dev
tmpfs                 506M     0  506M   0% /dev/shm
$

Again, the final dollar sign
is my prompt coming back.
The initial dollar sign is the
prompt as it appears before
I've typed anything.

Note that df gives me 6
columns of information.

Perhaps the most important columns
are the third and fourth: the
Used and Avail columns.

These tell me, in megabytes or
gigabytes, how much space I have
used and how much I have left
on my hard drive.

In my case, I have used up 80
gigabytes of storage and have
only 32 gigabytes left.

To see this, look at the first line
of output, the one for /dev/hda3,
and ignore the other lines.

Like many Unix commands, df
gives you more information than you
want initially.

Likely as not, the information you
will want from df is all on
the first line. At least, that's
true in this case.
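Since df writes ordinary text to
standard output, the usual tools can
pare it down further. This sketch pulls
just the Avail figure for the root
filesystem:

```shell
# NR==2 skips df's header line;
# $4 is the Avail column.
df -h / | awk 'NR==2 {print $4}'
```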

Ed Abbott

Monday, December 21, 2009

The Unix Grep Command Under Linux

I love the grep command.
It is so useful!

grep can be used to find
useful information inside of
a file.

For example, if I'm looking for
a name in my address book, I might
use grep. That's one way
to do it.

Here. Let me look for my own name
as an example:

grep -i Abbott AddressBook

The above grep command looks
for every occurrence of the word
Abbott in my AddressBook
file.

I've asked grep to ignore
case sensitivity. That is to say,
I want to find Abbott, which
is capitalized, or abbott, which
is uncapitalized.

The following grep option ensures
that case sensitivity is ignored:

-i

What does grep do? It prints
all lines that have Abbott in
them on my screen.

Here's the command again:

grep -i Abbott AddressBook

Here's the output:

@Abbott, Ed

That's one line of output.

If I want more, I can ask
the grep command to
print more.

For example, let's say I
want 3 lines of context.

Here's how I would specify
3 lines of context:

grep -i -C 3 Abbott AddressBook

The above command should give
me 7 lines. Note that the option
is a capital C; a lowercase -c
counts matching lines instead.

  • Three lines before
  • The line with Abbott in it
  • Three lines after

This could be handy as my name in my
address book could immediately be
followed by my phone number.
Sometimes, context is helpful.

My examples here are somewhat
contrived. In reality, I use
the grep command for other things.

For example, I keep a journal
on my laptop computer. If I want
to look up a specific topic, such
as basketball, which may span
more than one journal, I can do it
this way:

grep -i -C 3 basketball *

This simple command will look at
every journal in the current directory
and pick out 7 lines of context for
each mention of basketball.

This is very helpful if you are trying
to find information fast.

The -C option of grep
is actually part of a family of options.

This family gives context. Here's the
family:

-A  after context
-B  before context
-C  before and after context

Note that in this family of context options,
each member of the family is mnemonic for
something.

Here's the same table expressed as a memory
aid:


-A  after
-B  before
-C  context

The context options, as I call them,
always print context lines in addition
to the line specified.

I'll give an example.

Let's say I have a file called Numbers
that consists of the following ten lines
of text:

ONE
TWO
THREE
FOUR
FIVE
SIX
SEVEN
EIGHT
NINE
TEN

OK. Now let's say I
type the following command:

grep SIX Numbers

The grep command will
go looking for the pattern
SIX in the file. When
it finds it, we get the following
input/output:

$ grep SIX Numbers
SIX
$

The final $ is just my
command line prompt being returned
to me.

Notice that grep defaults
to printing just one line. That's
important! This helps you to understand
the behavior of the context options.

All context options print lines that
are in addition to the one line that
grep defaults to.

Here's a chart of how many lines each
context option will print:


-A 3  4 lines total
-B 3  4 lines total
-C 3  7 lines total

Here's a chart that describes what each
option does:

-A 3  print the line plus 3 lines after the line
-B 3  print 3 lines before the line plus the line
-C 3  print 3 lines before the line, the line, and 3 lines after the line
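The Numbers file above makes the
context options easy to try. A
minimal sketch:

```shell
# Recreate the Numbers file, then grab SIX
# with one line of context on each side.
printf 'ONE\nTWO\nTHREE\nFOUR\nFIVE\nSIX\nSEVEN\nEIGHT\nNINE\nTEN\n' > Numbers
grep -C 1 SIX Numbers
# prints FIVE, SIX and SEVEN
```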


Ed Abbott

Tuesday, December 1, 2009

The Ping Command

Ping is very useful if you
wish to lookup an IP address
for a website. For example:

ping www.websiterepairguy.com

OK. I just did a ping on my
own website.

Here's one of the lines of output:

64 bytes from box458.bluehost.com
   (74.220.219.58): 
   icmp_seq=1 ttl=52 time=86.0 ms

Ping will keep pinging away and
produce many many lines of output
that looks like the above.

Here's how to limit ping to one
line of output only:

ping -c 1 www.websiterepairguy.com

OK. Now my website gets pinged once
and then ping quits. In most cases,
this is what I want. I only need to
ping once.

Here's the total output for pinging
my website just once:

PING websiterepairguy.com 
    (74.220.219.58) 56(84) 
    bytes of data.
    64 bytes from 
    box458.bluehost.com 
    (74.220.219.58): 
    icmp_seq=1 ttl=52 
    time=90.2 ms

--- websiterepairguy.com 
    ping statistics ---
    1 packets transmitted, 
    1 received, 
    0% packet loss, 
    time 0ms
    rtt min/avg/max/mdev = 
    90.219/90.219/90.219/0.000 ms

Now let's say I'm only interested in
one thing and that is the IP address
for my website.

In this case, I'm only interested in
one line of output, the line that reveals
my IP address.

If I pipe the ping command to the grep
command, it might look like this:

ping -c 1 www.websiterepairguy.com | 
     grep PING

OK. The grep command narrows the amount
of output I get to just one line and only one
line of output. Here's that one line:

PING websiterepairguy.com 
     (74.220.219.58) 56(84) bytes of data.


How does grep work? It is a line printer
but a very special line printer. It only prints
lines that match what you gave it.

Since I gave it a capital PING as a pattern
to match, it only prints lines that match this
pattern.

Now let's narrow the output further. Let's
go for my domain name and my IP address only.
Here's the command:

ping -c 1 www.websiterepairguy.com | 
     grep PING | cut -f2-3 -d " "

Note the addition of the cut command.
I'm cutting out fields two and three, using
the space character as my field delimiter.

Delimiters are what define fields. They are
field separator characters.

I've placed a space character between double
quotes to indicate that it is my delimiter. Here's
what it looked like when I did this:

-d " "


I only want the second and third field. Here's
how I specified this:

-f2-3


Here's the whole command again:

ping -c 1 www.websiterepairguy.com | 
      grep PING | cut -f2-3 -d " "


Here's the output to this command:

websiterepairguy.com (74.220.219.58)


OK. The only undesirable thing that remains
is the parentheses. Here's a tricky way
to get rid of them:

ping -c 1 www.websiterepairguy.com | 
     grep PING | cut -f2-3 -d " " | 
     sed s/[\(\)]//g


In this case, the syntax for the sed
command is not very pretty. Here's what the
sed looks like:

sed s/[\(\)]//g


The sed command is being asked to substitute
one thing for another. In this case, it is being
asked to replace the parentheses with nothing.

More simply, take the parentheses out, both left
and right, and replace them with nothing.

The parentheses appear with a backslash in front
to protect them from the shell, because the sed
script is not quoted. Quote the script and the
backslashes are unnecessary: sed 's/[()]//g'

The sed command is using its own substitution
operator. Here's how the sed command works:

sed s/before/after/g


before is replaced by after

Here's the whole command again:

ping -c 1 www.websiterepairguy.com | 
      grep PING | 
      cut -f2-3 -d " " | 
      sed s/[\(\)]//g


Here's the output:

websiterepairguy.com 74.220.219.58


OK. It seems like a lot of code, but
actually it is a quick way to get
things done, once you know what you
are doing.
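The cut and sed stages work on any
text, so the tail end of the pipeline
can be tried without touching the
network. The PING line below is canned
sample output, and the sed script is
quoted here, which spares us the
backslashes:

```shell
# Same extraction as above, minus the actual ping.
echo 'PING example.com (192.0.2.1) 56(84) bytes of data.' |
    cut -f2-3 -d ' ' | sed 's/[()]//g'
# prints: example.com 192.0.2.1
```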

Ed Abbott

Friday, November 20, 2009

The Unix ls Command

One of my all-time favorite
Unix commands is the ls command.

In its simplest form, ls looks
like this:

ls banana


The above lists a file named banana
if there is such a file.

Here's one that's more useful:

ls


This lists all the files in the current
directory.

OK. Here's something that could be even
more useful:

ls -l


This is the so-called long listing.
It lists several columns of information on
every file in the current directory.

Included in the long listing is a time-stamp
for the file.

Note that a directory is just simply a
folder. Folders are directories and
directories are folders.

People who work on the command line
call them directories. People who
use a GUI (Graphical User Interface)
call them folders.

OK. Here's one that lists hidden files:

ls -A


The -A option means almost all.
That is to say, list all files, both hidden
and non-hidden, except for the special
entries . and .. themselves.
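A quick demonstration, with a made-up
directory:

```shell
# .hidden starts with a dot, so plain ls
# skips it; ls -A shows it.
mkdir -p demo
touch demo/.hidden demo/visible
ls -A demo
# lists both .hidden and visible
```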

Here's one that does a long listing on
all files, including hidden files:

ls -lA


Here's one that will look for subdirectories
inside of directories. Not only does it list
the current directory, it also lists anything
that belongs to the current directory:

ls -R


The -R suggests that the command is
recursive. It recursively descends into
directories finding sub-directories under
those directories going just as deep and
as far as it can.

Here's one that does a recursive long listing
on everything from the current directory on
down:

ls -lAR


Here's one that potentially looks at thousands
of files, listing those most recently modified
last:

ls -lAR | sort -k6


Use the above command to find a needle in a
haystack. The needle? A recently modified
file that is buried somewhere in listings of
thousands of files.

I love Unix commands because they are so simple.
Yet, you can do very complex things with them
when you start to string them together.

Ed Abbott

Thursday, November 19, 2009

The Unix File Command

Another favorite Unix command
that I frequently use is the
file command.

The file command is great
because it allows you to easily
determine what kind of file you
have in front of you.

Here's how you use the file
command:

file mystery-file


The file command tells you
what kind of file mystery-file
is.

Often, you can tell a file by extension.

For example, .jpg is a file extension.
The file, photo.jpg, is one such file.

To those in-the-know, any file with a
.jpg extension is an image or
photograph.

But what if you don't know? It's hard to keep
up with all possible file types out there.

If you don't know, the Unix file command
can be very very handy.
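A quick sketch, using an invented
file name:

```shell
# file inspects the contents,
# not the file extension.
echo 'hello' > sample.txt
file sample.txt
# reports something like: sample.txt: ASCII text
```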

Ed Abbott

Thursday, November 12, 2009

The Unix Sort Utility

OK. This is a new blog.

Here's where I talk about some
of my favorite Unix commands under
Linux.

One of my favorites is the sort
utility. Often, I use this one
with a pipe.

Here's an example:

ls  -l  |  sort  -k6

This one sorts the output of a
long listing by date.

Therefore, if I'm only interested
in files that have been worked on
recently, I will find these at the
end of the listing.

Note that the -k option for
sort picks a whitespace
separated field as the one to do
the sort on.

That is to say, whitespace is what
separates potential sort keys.

Let's see. The long form of the
ls command is ls  -l.
The output of ls  -l has nine
whitespace-separated fields, with
the date spread across fields six
through eight: month, day, and time.

Each of those fields is a
potential sort key that you can
sort on.

A sort key is something to sort on.
A filename could be a sort key. A
date could be a sort key.

It is the sixth field that begins
the date. Therefore, -k6 means
sort starting at the sixth field.
Note that month names sort
alphabetically rather than
chronologically, so for a true
date sort, ls -t is more reliable.
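The -k option is easy to see in
isolation. A minimal sketch:

```shell
# Sort three lines on their second
# whitespace-separated field.
printf 'banana 2\napple 3\ncherry 1\n' | sort -k2
# prints: cherry 1, banana 2, apple 3
```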

Ok. Putting it all together, I'm taking
an ls command and sending
it to a sort command.

Putting the two commands, ls
and sort, together gives me
something greater than either command alone.

In many ways, that's what Unix is all about.
Putting commands together to make something
greater.

In this case, it is the pipe symbol, which lies
between the two commands, that allows me to put
them together to make something greater.

Make sense?

Here's a wonderful web page that describes in
detail many of the sort utility options:

More About Sort

Ed Abbott