Sort tricks

The UNIX sort command orders the lines of a text file according to criteria you specify. Let’s go through a few simple examples.

Imagine you have a text file called Data.txt structured as below:

Anna     24      Italy
Joe      41      Oregon
Kate     16      UK
Phil     35      Netherlands

To sort the rows by age (the second column), run sort -k 2 Data.txt, which produces this output:

Kate     16      UK
Anna     24      Italy
Phil     35      Netherlands
Joe      41      Oregon

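The -k option selects the sort key by field number, counting from the left. Written as -k 2, the key starts at the second field and runs to the end of the line, and it is compared as text, leading blanks included, so the result can depend on the locale and on how the columns are padded. For a column of numbers the safer form is -k 2,2n, which limits the key to the second field and compares it numerically.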
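
The difference between a text key and a numeric key shows up as soon as an age has a single digit. The sketch below uses made-up rows (Kate is 9 here, which is not in the original data) written to a temporary file, and forces the C locale so the text comparison is predictable; both choices are assumptions made for illustration.

```shell
# Hypothetical sample data in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'Anna 24 Italy' 'Joe 41 Oregon' 'Kate 9 UK' 'Phil 35 Netherlands' > "$tmpdir/Data.txt"

# Text comparison restricted to the second field: "9" sorts after "41".
# -b skips the blanks that precede the field.
text_sorted=$(LC_ALL=C sort -b -k 2,2 "$tmpdir/Data.txt")

# Numeric comparison restricted to the second field: 9 sorts before 24.
numeric_sorted=$(LC_ALL=C sort -k 2,2n "$tmpdir/Data.txt")

printf 'text key:\n%s\n\nnumeric key:\n%s\n' "$text_sorted" "$numeric_sorted"
rm -r "$tmpdir"
```

With the text key Kate ends up last; with the numeric key she comes first.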

In the same way, you can order the rows by name (the first column) with sort -k 1 Data.txt, and the result is:

Anna     24      Italy
Joe      41      Oregon
Kate     16      UK
Phil     35      Netherlands

You can also reverse the order by adding the -r option. Taking the first example, the command becomes sort -r -k 2 Data.txt and the output is:

Joe      41      Oregon
Phil     35      Netherlands
Anna     24      Italy
Kate     16      UK
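
When the key is numeric, the reversal can also be attached to the key itself. A minimal sketch, assuming the same four rows are separated by single spaces and fed on standard input rather than read from Data.txt: the n and r modifiers after 2,2 ask for a numeric, descending comparison on the second field only.

```shell
# Rows are piped in instead of being read from Data.txt (an assumption for this sketch).
oldest_first=$(printf '%s\n' 'Anna 24 Italy' 'Joe 41 Oregon' 'Kate 16 UK' 'Phil 35 Netherlands' |
    LC_ALL=C sort -k 2,2nr)

# Print the rows from the oldest person to the youngest.
printf '%s\n' "$oldest_first"
```

Keeping the modifiers on the key is handy when several -k options are combined, because each key can then have its own ordering.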

If you want to write the output to a file instead of standard output, use the -o option. The command then looks like this: sort -r -k 2 -o DataSorted.txt Data.txt.
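
A short sketch of the -o option in practice, assuming a scratch copy of the data (single-space separated) in a temporary directory. It also shows that -o may name the input file itself, which is a way to sort a file in place.

```shell
# Hypothetical scratch copy of the data.
tmpdir=$(mktemp -d)
printf '%s\n' 'Anna 24 Italy' 'Joe 41 Oregon' 'Kate 16 UK' 'Phil 35 Netherlands' > "$tmpdir/Data.txt"

# Write the sorted rows to a separate file; Data.txt is left untouched.
LC_ALL=C sort -r -k 2 -o "$tmpdir/DataSorted.txt" "$tmpdir/Data.txt"
sorted_copy=$(cat "$tmpdir/DataSorted.txt")

# Sort in place: the output file is the same as the input file.
LC_ALL=C sort -r -k 2 -o "$tmpdir/Data.txt" "$tmpdir/Data.txt"
sorted_in_place=$(cat "$tmpdir/Data.txt")

printf '%s\n' "$sorted_in_place"
rm -r "$tmpdir"
```

The in-place form would not work with a plain shell redirection such as sort Data.txt > Data.txt, because the shell empties the file before sort gets to read it.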

Now, what if the same text file has a heading?

NAME     AGE     COUNTRY
Anna     24      Italy
Joe      41      Oregon
Kate     16      UK
Phil     35      Netherlands

In this case you must exclude the heading from the sort operation. Let’s go back to the first example of this post, sorting by age.

A simple way to do this is to chain the following two commands:

head -n 1 Data.txt | cat > DataSorted.txt && more +2 Data.txt | sort -k 2 | cat >> DataSorted.txt

The first one extracts the first row (the heading) from Data.txt and writes it to a new file called DataSorted.txt.
The second one takes all the rows of Data.txt starting from the second one and passes them as input to sort. Finally, the sorted output is appended to DataSorted.txt.

The | operator is a pipe. It takes the output of a command and gives it as input to another one.

The && operator executes the second command (on the right side) only if the previous one succeeded.
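
Both operators can be tried in isolation. A minimal sketch with made-up input (the fruit names and the log variable are only for illustration): true always succeeds and false always fails, so only the first assignment after && takes place.

```shell
# Pipe: the output of printf becomes the input of sort.
piped=$(printf '%s\n' 'pear' 'apple' | LC_ALL=C sort)

# &&: the right-hand command runs only when the left-hand one exits with status 0.
log=""
true  && log="${log}after-true "
false && log="${log}after-false "

printf '%s\n%s\n' "$piped" "$log"
```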

The expected content of DataSorted.txt is:

NAME     AGE     COUNTRY
Kate     16      UK
Anna     24      Italy
Phil     35      Netherlands
Joe      41      Oregon

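As suggested by a friend, you can also use tail in place of more. This is the more portable choice, since tail -n +2 is standard while the +2 argument of more is not supported everywhere: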

head -n 1 Data.txt | cat > DataSorted.txt && tail -n +2 Data.txt | sort -k 2 | cat >> DataSorted.txt
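
The same idea can be written with a command group, so that the output file is opened once and the extra cat processes are not needed. This is a sketch rather than the command used in this post: the temporary directory and the single-space layout are assumptions, and the key is written as -k 2,2n to compare the ages numerically.

```shell
# Hypothetical scratch copy of the file with its heading.
tmpdir=$(mktemp -d)
printf '%s\n' 'NAME AGE COUNTRY' 'Anna 24 Italy' 'Joe 41 Oregon' 'Kate 16 UK' 'Phil 35 Netherlands' > "$tmpdir/Data.txt"

# Print the heading, then the sorted body; the braces send both to one file.
{
    head -n 1 "$tmpdir/Data.txt"
    tail -n +2 "$tmpdir/Data.txt" | LC_ALL=C sort -k 2,2n
} > "$tmpdir/DataSorted.txt"

with_header=$(cat "$tmpdir/DataSorted.txt")
printf '%s\n' "$with_header"
rm -r "$tmpdir"
```

Unlike the && chain, the group runs its second command even if head fails, so the && form remains the stricter of the two.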

For more details, see the man pages of the sort, cat, head, more and tail UNIX commands.
