Sunday, April 27, 2008
4:40 AM

Using awk to extract lines in a text file

awk is not an obvious choice as a tool for strictly extracting rows from a text file. It is better known for its column/field manipulation capabilities. More obvious choices are sed and perl. You can see how sed does it in my earlier entry.

If you opt for awk, you can use its NR variable, which holds the number of input records read so far. With the default record separator, each record is one line, so NR is the current line number.
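
To see what NR holds as awk reads its input, you can print it next to each record. This is only a small sketch: the three-line input is made up and fed on stdin, and the variable name numbered is my own.

```shell
# Feed three made-up lines on stdin and prefix each with its record number (NR).
numbered=$(printf 'a\nb\nc\n' | awk '{print NR ": " $0}')
echo "$numbered"
```

Each output line starts with the value NR had when that line was read.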

Suppose the text file is somefile.txt:
$ cat > somefile.txt
Line 1
Line 2
Line 3
Line 4

To print a single line, say line 2:
$ awk 'NR==2' somefile.txt
Line 2


If the text file is huge, you can cheat by exiting the program on the first match, so awk does not read the rest of the file. Note that this exact command only works for a single line, since it exits right after the first match.
$ awk 'NR==2 {print;exit}' somefile.txt
Line 2


To extract a section of the text file, say lines 2 to 3:
$ awk 'NR==2,NR==3' somefile.txt
Line 2
Line 3
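
The early exit can also be applied to a range if awk exits after the last wanted line instead of the first. A minimal sketch, assuming the four-line sample file is recreated with printf so the snippet stands alone; first, last, and range_out are names I made up for illustration:

```shell
# Recreate the four-line sample file so this snippet runs on its own.
printf 'Line 1\nLine 2\nLine 3\nLine 4\n' > somefile.txt

# Print every line from "first" onwards, and exit right after line "last"
# has been printed, so the rest of the file is never read.
# first and last are illustrative awk variables passed in with -v.
range_out=$(awk -v first=2 -v last=3 'NR >= first; NR == last {exit}' somefile.txt)
echo "$range_out"
```

The first rule has no action, so it prints by default; the second rule runs after it on the same record, which is why line 3 is still printed before awk exits.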


A more interesting task is to extract every nth line from a text file. I showed previously how to do it using sed and perl.

Using awk, to print every second line counting from line 0 (first printed line is line 2):
$ awk '0 == NR % 2' somefile.txt
Line 2
Line 4


To print every second line counting from line 1 (first printed line is line 1):
$ awk '0 == (NR + 1) % 2' somefile.txt
Line 1
Line 3


% is the mod (i.e. remainder) operator.
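
The two commands above can be generalized to any step and starting line. The following is only a sketch under my own naming: every_nth is a hypothetical shell function, and n and start are awk variables I introduced with -v. It recreates the sample file so it can run by itself.

```shell
# Recreate the four-line sample file so this snippet runs on its own.
printf 'Line 1\nLine 2\nLine 3\nLine 4\n' > somefile.txt

# every_nth STEP START FILE (hypothetical helper)
# Prints line START, then every STEP-th line after it.
every_nth() {
    awk -v n="$1" -v start="$2" 'NR >= start && (NR - start) % n == 0' "$3"
}

every_nth 2 2 somefile.txt   # lines 2 and 4, like '0 == NR % 2'
every_nth 2 1 somefile.txt   # lines 1 and 3, like '0 == (NR + 1) % 2'
every_nth 3 1 somefile.txt   # lines 1 and 4
```

Subtracting start before taking the remainder shifts the count so that the starting line always has remainder 0.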
