Sunday, March 23, 2008
12:19 AM

Using sed to extract lines in a text file

If you write bash scripts a lot, you are bound to run into a situation where you want to extract some lines from a file. Yesterday, I needed to extract the first line of a file, say named somefile.txt.
$ cat somefile.txt
Line 1
Line 2
Line 3
Line 4


This specific task can be easily done with this:
$ head -1 somefile.txt
Line 1


For a more complicated task, like extract the second to third lines of a file. head is inadequate.

So, let's try extracting lines using sed: the stream editor.

My first attempt uses the p sed command (for print):
$ sed 1p somefile.txt
Line 1
Line 1
Line 2
Line 3
Line 4


Note that it prints the whole file, with the first line printed twice. Why? The default output behavior is to print every line of the input file stream. The explicit 1p command just tells it to print the first line .... again.

To fix it, you need to suppress the default output (using -n), making explicit prints the only way to print to default output.
$ sed -n 1p somefile.txt
Line 1


Alternatively, you can tell sed to delete all but the first line.

$ sed '1!d' somefile.txt
Line 1


'1!d' means if a line is not(!) the first line, delete.

Note that the single quotes are necessary. Otherwise, the !d will bring back the last command you executed that starts with the letter d.


To extract a range of lines, say lines 2 to 4, you can execute either of the following:

  • $ sed -n 2,4p somefile.txt
  • $ sed '2,4!d' somefile.txt


Note that the comma specifies a range (from the line before the comma to the line after).

What if the lines you want to extract are not in sequence, say lines 1 to 2, and line 4?
$ sed -n -e 1,2p -e 4p somefile.txt
Line 1
Line 2
Line 4


If you know some different ways to extract lines in a file, please share with us by filling out a comment.

P.S. Related articles from this blog:


StumbleUpon Toolbar

0 comments:

Post a Comment