The UNIX sort command lets you sort the rows of a text file according to criteria you specify. Let’s do a few simple examples.
Imagine you have a text file called Data.txt structured as below:
Anna 24 Italy
Joe 41 Oregon
Kate 16 UK
Phil 35 Netherlands
If you want to sort the rows by age (the second column), run sort -k 2 Data.txt, which produces this output:
Kate 16 UK
Anna 24 Italy
Phil 35 Netherlands
Joe 41 Oregon
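The -k 2 option tells sort to use a key that starts at the second field from the left and runs to the end of the line. The comparison is textual rather than numeric, which gives the right order here because every age has two digits. For numbers of different lengths, restrict the key to the second field and compare it numerically with -k 2,2n.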
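To make the difference between textual and numeric keys concrete, here is a minimal sketch. The rows are hypothetical (they are not the Data.txt used above) and were chosen so that the ages have different numbers of digits. The tmp variable is only an illustrative temporary file.

```shell
# Hypothetical sample rows with ages of one, two and three digits
tmp=$(mktemp)
printf '%s\n' 'Anna 24 Italy' 'Tom 9 Spain' 'Eve 102 France' > "$tmp"

# Textual key from field 2 to end of line: compared character by character,
# so 102 comes before 24, which comes before 9 (LC_ALL=C keeps this predictable)
textual=$(LC_ALL=C sort -k 2 "$tmp")
printf 'textual:\n%s\n' "$textual"

# Key limited to field 2 and compared as a number
numeric=$(sort -k 2,2n "$tmp")
printf 'numeric:\n%s\n' "$numeric"

rm -f "$tmp"
```

With the numeric key the rows come out as Tom (9), Anna (24), Eve (102), which is usually what you want for a column of ages.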
In the same way, you can order the rows by name (the first column) with sort -k 1 Data.txt, and the result is this:
Anna 24 Italy
Joe 41 Oregon
Kate 16 UK
Phil 35 Netherlands
You can also reverse the order by adding the -r option. So, if you consider the first example, the command becomes sort -r -k 2 Data.txt and the output will be:
Joe 41 Oregon
Phil 35 Netherlands
Anna 24 Italy
Kate 16 UK
If you want to write the output to another text file instead of the standard output, you can use the -o option. In this case the command is something like this: sort -r -k 2 Data.txt -o DataSorted.txt.
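As a small sketch of the -o option, the snippet below sorts a file in place. It assumes a throwaway copy of the sample data in a temporary directory (the work variable is only illustrative), and it puts -o before the input file, which the GNU documentation recommends for portable scripts.

```shell
# Throwaway copy of the sample data, so no real file is touched
work=$(mktemp -d)
printf '%s\n' 'Anna 24 Italy' 'Joe 41 Oregon' 'Kate 16 UK' \
  'Phil 35 Netherlands' > "$work/Data.txt"

# -o may name the input file itself: sort reads all input before writing
sort -r -k 2 -o "$work/Data.txt" "$work/Data.txt"

in_place=$(cat "$work/Data.txt")
printf '%s\n' "$in_place"
rm -rf "$work"
```

A plain redirection such as sort Data.txt > Data.txt would not work here, because the shell truncates the file before sort gets a chance to read it.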
Now, what if you have the same text file with a heading?
NAME AGE COUNTRY
Anna 24 Italy
Joe 41 Oregon
Kate 16 UK
Phil 35 Netherlands
In this case you must find a way to exclude the heading from the sort operation. Let’s consider the first example of this post.
A simple way to do this is to use the following two commands:
head -n 1 Data.txt | cat > DataSorted.txt && more +2 Data.txt | sort -k 2 | cat >> DataSorted.txt
The first one extracts the first row (the heading) from Data.txt and writes it to a new file called DataSorted.txt.
The second one takes all the rows of Data.txt starting from the second one and passes them as input to sort. Finally, the sorted output is appended to DataSorted.txt.
The | operator is a pipe. It takes the output of a command and gives it as input to another one.
The && operator executes the second command (on the right side) only if the previous one succeeded.
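One detail is worth a quick sketch: when the left side of && is a pipeline, "succeeded" refers to the last command in that pipeline. The snippet below assumes the shell's default settings (no pipefail option) and uses a hypothetical path, stored in the missing variable, that is assumed not to exist.

```shell
missing=/nonexistent/Data.txt   # hypothetical path, assumed not to exist

# head fails, but cat (the last command in the pipeline) succeeds
if head -n 1 "$missing" 2>/dev/null | cat > /dev/null; then
  piped=succeeded
else
  piped=failed
fi

# Without the pipe, the failure of head itself is what && would see
if head -n 1 "$missing" > /dev/null 2>&1; then
  direct=succeeded
else
  direct=failed
fi

echo "pipeline: $piped, head alone: $direct"
```

Under those assumptions the pipeline counts as successful even though head failed, so in a command like head ... | cat > file && ... the && only guards against a failure of cat or of the redirection.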
The expected result will be:
NAME AGE COUNTRY
Kate 16 UK
Anna 24 Italy
Phil 35 Netherlands
Joe 41 Oregon
As suggested by a friend, you can also use tail in place of more:
head -n 1 Data.txt | cat > DataSorted.txt && tail -n +2 Data.txt | sort -k 2 | cat >> DataSorted.txt
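A possible variation, shown here only as a sketch, groups the two steps with braces so that a single redirection covers both and the extra cat calls are not needed. The example builds its own copy of the sample file in a temporary directory (the work variable is illustrative) so that it can be run safely.

```shell
# Work in a throwaway directory so no real file is overwritten
work=$(mktemp -d)
printf '%s\n' 'NAME AGE COUNTRY' 'Anna 24 Italy' 'Joe 41 Oregon' \
  'Kate 16 UK' 'Phil 35 Netherlands' > "$work/Data.txt"

# Heading first, then the sorted body; one redirection covers both commands
{
  head -n 1 "$work/Data.txt"
  tail -n +2 "$work/Data.txt" | sort -k 2
} > "$work/DataSorted.txt"

sorted=$(cat "$work/DataSorted.txt")
printf '%s\n' "$sorted"
rm -rf "$work"
```

The output file is opened once and the heading stays on the first line, followed by the rows sorted by age.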
For more details see the man pages of the cat, head, more and tail UNIX commands.