Monday, December 21, 2009

The Unix Grep Command Under Linux

I love the grep command.
It is so useful!

grep can be used to find
useful information inside of
a file.

For example, if I'm looking for
a name in my address book, I might
use grep. That's one way
to do it.

Here. Let me look for my own name
as an example:

grep -i Abbott AddressBook

The above grep command looks
for every occurrence of the word
Abbott in my AddressBook
file.

I've asked grep to ignore
case sensitivity. That is to say,
I want to find Abbott, which
is capitalized, or abbott, which
is uncapitalized.

The following grep option ensures
that case sensitivity is ignored:

-i

What does grep do? It prints
all lines that have Abbott in
them on my screen.

Here's the command again:

grep -i Abbott AddressBook

Here's the output:

@Abbott, Ed

That's one line of output.
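
Here's a small sketch of what -i changes. The address book
below is made up for illustration (only the Abbott, Ed entry
comes from my real output), and it lives in a temporary
directory so nothing of yours gets touched:

```shell
# Build a small, made-up address book in a temporary directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/AddressBook" <<'EOF'
@Abbott, Ed
@Baker, Ann
@abbott, Sam
@Cole, Joe
EOF

# Without -i, only the capitalized spelling matches.
sensitive=$(grep Abbott "$tmpdir/AddressBook")

# With -i, both the capitalized and uncapitalized spellings match.
insensitive=$(grep -i Abbott "$tmpdir/AddressBook")

printf 'case-sensitive:\n%s\n\n' "$sensitive"
printf 'case-insensitive:\n%s\n' "$insensitive"

# Clean up the temporary directory.
rm -r "$tmpdir"
```

The second search picks up the lowercase entry that the
first one misses.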

If I want more, I can ask
the grep command to
print more.

For example, let's say I
want 3 lines of context.

Here's how I would specify
3 lines of context:

grep -i -C 3 Abbott AddressBook

Note the capital C. A lowercase -c
asks grep to count matching lines
instead of printing context.

The above command should give
me up to 7 lines (fewer if the match
is near the top or bottom of the file):

  • Three lines before
  • The line with Abbott in it
  • Three lines after

This could be handy as my name in my
address book could immediately be
followed by my phone number.
Sometimes, context is helpful.
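
Here's a sketch of the phone number idea. The names and the
555 numbers are invented, and I'm assuming a layout where each
name is followed by its phone number on the next line. It also
shows why the capital C matters: -C prints context, while a
lowercase -c only counts matching lines.

```shell
# A made-up address book: each name is followed by a phone number.
tmpdir=$(mktemp -d)
cat > "$tmpdir/AddressBook" <<'EOF'
@Aaron, Al
555-0100
@Abbott, Ed
555-0101
@Baker, Ann
555-0102
EOF

# -C 1 prints one line before and one line after the match.
context=$(grep -i -C 1 Abbott "$tmpdir/AddressBook")

# -c (lowercase) prints only the number of matching lines.
count=$(grep -i -c Abbott "$tmpdir/AddressBook")

printf 'with -C 1:\n%s\n\n' "$context"
printf 'with -c: %s\n' "$count"

rm -r "$tmpdir"
```

With one line of context, the phone number shows up right
under the name.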

My examples here are somewhat
contrived. In reality, I use
the grep command for other things.

For example, I keep a journal
on my laptop computer. If I want
to look up a specific topic, such
as basketball, which may span
more than one journal, I can do it
this way:

grep -i -C 3 basketball *

This simple command will look at
every journal in the current directory
and pick out 7 lines of context for
each mention of basketball.

This is very helpful if you are trying
to find information fast.
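
Here's a sketch of searching several journals at once. The
file names and journal entries are invented for the example,
and I use one line of context rather than three to keep the
output short. When grep searches more than one file, it puts
the file name in front of every line it prints.

```shell
# Three made-up journal files in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'woke up late' 'played basketball at the gym' 'made dinner' \
    > "$tmpdir/monday.txt"
printf '%s\n' 'rainy day' 'read a book' \
    > "$tmpdir/tuesday.txt"
printf '%s\n' 'watched Basketball on TV' 'went to bed early' \
    > "$tmpdir/wednesday.txt"

# Search every file in the directory, ignoring case,
# with one line of context around each match.
matches=$(cd "$tmpdir" && grep -i -C 1 basketball *)

printf '%s\n' "$matches"

rm -r "$tmpdir"
```

Matching lines carry the file name followed by a colon.
Context lines carry the file name followed by a hyphen.
The journal with no mention of basketball prints nothing.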

The -C option of grep
is actually part of a family of options.

This family gives context. Here's the
family:

-A    after context
-B    before context
-C    before and after context

Note that in this family of context options,
each member of the family is mnemonic for
something.

Here's the same table expressed as a memory
aid:


-A    after
-B    before
-C    context

The context options, as I call them,
always print context lines in addition
to the line specified.

I'll give an example.

Let's say I have a file called Numbers
that consists of the following ten lines
of text:

ONE
TWO
THREE
FOUR
FIVE
SIX
SEVEN
EIGHT
NINE
TEN

OK. Now let's say I
type the following command:

grep SIX Numbers

The grep command will
go looking for the pattern
SIX in the file. When
it finds it, we get the following
input/output:

$ grep SIX Numbers
SIX
$

The final $ is just my
command line prompt being returned
to me.

Notice that grep defaults
to printing just one line. That's
important! This helps you to understand
the behavior of the context options.

All context options print lines that
are in addition to the one line that
grep defaults to.

Here's a chart of how many lines each
context option will print:


-A 3    4 lines total
-B 3    4 lines total
-C 3    7 lines total

Here's a chart that describes what each
option does:

-A 3    print the line plus 3 lines after the line
-B 3    print 3 lines before the line plus the line
-C 3    print 3 lines before the line, the line, and 3 lines after the line
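
Here's a sketch that checks those line counts against the
Numbers file. The count_lines helper is my own invention for
this example. The last search looks for NINE to show what
happens when the file runs out before the context does.

```shell
# Recreate the ten-line Numbers file in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' ONE TWO THREE FOUR FIVE SIX SEVEN EIGHT NINE TEN \
    > "$tmpdir/Numbers"

# Hypothetical helper: count the lines in a block of text.
count_lines() { printf '%s\n' "$1" | grep -c ''; }

after=$(grep -A 3 SIX "$tmpdir/Numbers")    # SIX through NINE
before=$(grep -B 3 SIX "$tmpdir/Numbers")   # THREE through SIX
both=$(grep -C 3 SIX "$tmpdir/Numbers")     # THREE through NINE

# NINE is the second-to-last line, so only one line can follow it.
near_end=$(grep -A 3 NINE "$tmpdir/Numbers")

printf -- '-A 3 SIX:  %s lines\n' "$(count_lines "$after")"
printf -- '-B 3 SIX:  %s lines\n' "$(count_lines "$before")"
printf -- '-C 3 SIX:  %s lines\n' "$(count_lines "$both")"
printf -- '-A 3 NINE: %s lines\n' "$(count_lines "$near_end")"

rm -r "$tmpdir"
```

The first three counts come out to 4, 4 and 7. The search
for NINE prints only 2 lines, because grep never pads the
output with lines that aren't in the file.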


Ed Abbott

Tuesday, December 1, 2009

The Ping Command

Ping is very useful if you
wish to look up an IP address
for a website. For example:

ping www.websiterepairguy.com

OK. I just did a ping on my
own website.

Here's one of the lines of output:

64 bytes from box458.bluehost.com
   (74.220.219.58): 
   icmp_seq=1 ttl=52 time=86.0 ms

Ping will keep pinging away and
produce many, many lines of output
that look like the above.

Here's how to limit ping to one
line of output only:

ping -c 1 www.websiterepairguy.com

OK. Now my website gets pinged once
and then ping quits. In most cases,
this is what I want. I only need to
ping once.

Here's the total output for pinging
my website just once:

PING websiterepairguy.com 
    (74.220.219.58) 56(84) 
    bytes of data.
    64 bytes from 
    box458.bluehost.com 
    (74.220.219.58): 
    icmp_seq=1 ttl=52 
    time=90.2 ms

--- websiterepairguy.com 
    ping statistics ---
    1 packets transmitted, 
    1 received, 
    0% packet loss, 
    time 0ms
    rtt min/avg/max/mdev = 
    90.219/90.219/90.219/0.000 ms

Now let's say I'm only interested in
one thing and that is the IP address
for my website.

In this case, I'm only interested in
one line of output, the line that reveals
my IP address.

If I pipe the ping command to the grep
command, it might look like this:

ping -c 1 www.websiterepairguy.com | 
     grep PING

OK. The grep command narrows the
output down to just one line. Here's that one line:

PING websiterepairguy.com 
     (74.220.219.58) 56(84) bytes of data.


How does grep work? It is a line printer
but a very special line printer. It only prints
lines that match what you gave it.

Since I gave it a capital PING as a pattern
to match, it only prints lines that match this
pattern.
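
Here's a sketch you can try without touching the network.
It feeds grep a saved copy of the ping output shown earlier
(with my line wrapping undone), stored in a variable I've
named ping_output for this example.

```shell
# A saved copy of the ping output from earlier in this post.
ping_output='PING websiterepairguy.com (74.220.219.58) 56(84) bytes of data.
64 bytes from box458.bluehost.com (74.220.219.58): icmp_seq=1 ttl=52 time=90.2 ms

--- websiterepairguy.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 90.219/90.219/90.219/0.000 ms'

# Capital PING matches only the first line.
header=$(printf '%s\n' "$ping_output" | grep PING)

# With -i, the lowercase "ping statistics" line matches as well.
# The -c option counts matching lines instead of printing them.
loose_count=$(printf '%s\n' "$ping_output" | grep -i -c PING)

printf 'grep PING prints:\n%s\n\n' "$header"
printf 'grep -i PING would match %s lines\n' "$loose_count"
```

This is why the capital letters matter here. A
case-insensitive search would also pick up the statistics
line, and I'd get two lines instead of one.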

Now let's narrow the output further. Let's
go for my domain name and my IP address only.
Here's the command:

ping -c 1 www.websiterepairguy.com | 
     grep PING | cut -f2-3 -d " "

Note the addition of the cut command.
I'm cutting out fields two and three, using
the space character as my field delimiter.

Delimiters are what define fields. They are
field separator characters.

I've placed a space character between double
quotes to indicate that it is my delimiter. Here's
what it looked like when I did this:

-d " "


I only want the second and third field. Here's
how I specified this:

-f2-3


Here's the whole command again:

ping -c 1 www.websiterepairguy.com | 
      grep PING | cut -f2-3 -d " "


Here's the output to this command:

websiterepairguy.com (74.220.219.58)
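
Here's a sketch of how cut sees that line. It splits the
PING line on spaces and prints the first four fields one at
a time, then the two fields I actually want. The variable
names are just for this example.

```shell
# The PING header line from the ping output.
header='PING websiterepairguy.com (74.220.219.58) 56(84) bytes of data.'

# Print the first four space-delimited fields, one per line.
for n in 1 2 3 4; do
    field=$(printf '%s\n' "$header" | cut -f"$n" -d " ")
    printf 'field %s: %s\n' "$n" "$field"
done

# Fields two through three, the same selection as in the pipeline.
wanted=$(printf '%s\n' "$header" | cut -f2-3 -d " ")
printf 'fields 2-3: %s\n' "$wanted"
```

Field one is the word PING, field two is the domain name,
and field three is the IP address in parentheses. When cut
prints a range of fields, it puts the delimiter back between
them.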


OK. The only undesirable thing that remains
is the parentheses. Here's a tricky way
to get rid of these:

ping -c 1 www.websiterepairguy.com | 
     grep PING | cut -f2-3 -d " " | 
     sed s/[\(\)]//g


In this case, the syntax for the sed
command is not very pretty. Here's what the
sed looks like:

sed s/[\(\)]//g


The sed command is being asked to substitute
one thing for another. In this case, it is being
asked to substitute nothing for the parentheses.

More simply, take the parentheses out, both left
and right, and replace them with nothing.

The parentheses appear with a backslash in front
because without the backslash, the shell would give
the parentheses a special meaning.

The sed command is using its own substitution
operator. Here's how the sed command works:

sed s/before/after/g


before is replaced by after
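
Here's a sketch of the substitution on its own. I've put the
sed expression in single quotes here, which is an alternative
to the backslashes: inside single quotes the shell leaves the
parentheses alone, so none are needed. The variable names are
only for this example.

```shell
# Plain word substitution: every "before" becomes "after".
words=$(printf '%s\n' 'before and before' | sed 's/before/after/g')

line='websiterepairguy.com (74.220.219.58)'

# With the g flag, every parenthesis on the line is removed.
all_gone=$(printf '%s\n' "$line" | sed 's/[()]//g')

# Without the g flag, only the first match on the line is removed.
first_gone=$(printf '%s\n' "$line" | sed 's/[()]//')

printf '%s\n' "$words"
printf '%s\n' "$all_gone"
printf '%s\n' "$first_gone"
```

The square brackets mean "any one of these characters", so
the pattern matches either a left or a right parenthesis.
The trailing g is what makes sed keep going after the first
match on a line.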

Here's the whole command again:

ping -c 1 www.websiterepairguy.com | 
      grep PING | 
      cut -f2-3 -d " " | 
      sed s/[\(\)]//g


Here's the output:

websiterepairguy.com 74.220.219.58


OK. It seems like a lot of code, but
it is actually a quick way to get
things done once you know what you
are doing.
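
If you use the pipeline often, one option is to wrap it in a
shell function. This is only a sketch: the name host_and_ip
is made up, and it assumes ping prints a header line in the
same format as the Linux output shown in this post. Other
systems format that line differently.

```shell
# Hypothetical helper: reads ping output on standard input and
# prints the domain name and IP address from the PING header line.
host_and_ip() {
    grep PING | cut -f2-3 -d " " | sed 's/[()]//g'
}

# Try it on saved output, so no network access is needed.
sample='PING websiterepairguy.com (74.220.219.58) 56(84) bytes of data.
64 bytes from box458.bluehost.com (74.220.219.58): icmp_seq=1 ttl=52 time=90.2 ms'

result=$(printf '%s\n' "$sample" | host_and_ip)
printf '%s\n' "$result"

# Against a live host, it would be used like this:
#   ping -c 1 www.websiterepairguy.com | host_and_ip
```

The function does nothing new. It is the same grep, cut and
sed chain, just given a name.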

Ed Abbott