Command line quick tips: wc, sort, sed and tr

Posted by mahesh1b on July 26, 2021

Image by Ryan Lerch (CC BY-SA 4.0)

Linux distributions are great to use, and they have some tricks up their sleeves that users may not be aware of. Let’s have a look at some command line utilities that come in handy when you prefer to stick with the terminal rather than use a GUI.

Working in a terminal is often a more efficient way to use the system. If you edit or otherwise work with text files in a terminal, these tools will make your life easier.

In this article, let’s have a look at the wc, sort, sed, and tr commands.

wc

wc is a utility whose name stands for “word count”. As the name suggests, it counts the lines, words, and bytes in a file.

Let’s see how it works:

$ wc filename
lines words bytes filename

The output shows the total number of newlines in the file, the total number of words, the total number of bytes, and the filename.

To get some specific output we have to use options:

-c To print the byte counts
-l To print the newline counts
-w To print the word counts
-m To print the character counts
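The difference between -c and -m only shows up when the text contains multibyte characters. The sketch below is an illustration rather than part of the original demo: it feeds wc the two bytes that encode “é” in UTF-8 (written as octal escapes, which is an assumption made here so the example does not depend on the file's encoding). The byte count is always 2, while the character count depends on the current locale and is 1 in a UTF-8 locale.

```shell
# "é" in UTF-8 is the two bytes \303\251 (octal escapes for printf).
# tr -d ' ' strips the padding some wc implementations add around the number.
bytes=$(printf '\303\251' | wc -c | tr -d ' ')
chars=$(printf '\303\251' | wc -m | tr -d ' ')

echo "bytes: $bytes"   # 2
echo "chars: $chars"   # locale-dependent: one character in a UTF-8 locale
```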

wc demo

Let’s see it in action:

Here we start with a text file, loremipsm.txt. First, we print out the file and then use wc on it.

$ cat loremipsm.txt
Linux is the best-known and most-used open source operating system.
As an operating system, Linux is software that sits underneath all of the other software on a computer,
receiving requests from those programs and replaying these requests to the computer's hardware.

$ wc loremipsm.txt
3 41 268 loremipsm.txt

Suppose I only want to see the byte count of the file:

$ wc -c loremipsm.txt
268 loremipsm.txt

For the newline count of the file:

$ wc -l loremipsm.txt
3 loremipsm.txt

To see the word count of the file:

$ wc -w loremipsm.txt
41 loremipsm.txt

Now only the character count of the file:

$ wc -m loremipsm.txt
268 loremipsm.txt
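When the counts are needed in a script, it can help to feed the file to wc on standard input, because wc then has no filename to print and only the number remains. The following is a minimal sketch under that assumption; it recreates the demo file in a temporary directory (the temporary path and variable names are made up for illustration).

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/loremipsm.txt" <<'EOF'
Linux is the best-known and most-used open source operating system.
As an operating system, Linux is software that sits underneath all of the other software on a computer,
receiving requests from those programs and replaying these requests to the computer's hardware.
EOF

# Reading from standard input: no filename in the output, just the count.
# tr -d ' ' strips the padding some wc implementations add.
lines=$(wc -l < "$tmpdir/loremipsm.txt" | tr -d ' ')
words=$(wc -w < "$tmpdir/loremipsm.txt" | tr -d ' ')
bytes=$(wc -c < "$tmpdir/loremipsm.txt" | tr -d ' ')

echo "$lines lines, $words words, $bytes bytes"   # 3 lines, 41 words, 268 bytes
rm -rf "$tmpdir"
```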

sort

The sort command is one of the most useful tools. It sorts the lines of a file, either by characters or by numeric value, in ascending or descending order. It can also shuffle the lines of a file into random order.

Using sort can be very simple. All we need to do is provide the name of the file.

$ sort filename

By default it sorts the data in alphabetical order. One thing to note is that the sort command just displays the sorted data. It does not overwrite the file.

Some useful options for sort:

-r To sort the lines in the file in reverse order
-R To shuffle the lines in the file into random order
-o To save the output in another file
-k To sort by a specific column (field)
-t To specify the field separator
-n To sort the data according to numerical value
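Since sort does not overwrite its input by default, a common question is how to sort a file in place. One approach, sketched here with a hypothetical fruits.txt file, is to give -o the same name as the input file. POSIX allows this because sort reads all of its input before writing, whereas redirecting with > to the same file would let the shell empty it first.

```shell
tmpdir=$(mktemp -d)
# fruits.txt is a made-up sample file for this illustration.
printf '%s\n' 'pear' 'apple' 'fig' > "$tmpdir/fruits.txt"

# Risky alternative (not run here): sort fruits.txt > fruits.txt
# The shell truncates the file before sort gets a chance to read it.

# With -o, the output file may safely be the same as the input file.
sort -o "$tmpdir/fruits.txt" "$tmpdir/fruits.txt"

result=$(cat "$tmpdir/fruits.txt")
echo "$result"
rm -rf "$tmpdir"
```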

sort demo

Let’s use sort in some short demos:

We have a file, list.txt, containing names and numeric values separated by commas.

First let’s print out the file and just do simple sorting.

$ cat list.txt
Cieran Wilks, 9
Adelina Rowland, 4
Hayden Mcfarlnd, 1
Ananya Lamb, 5
Shyam Head, 2
Lauryn Fuents, 8
Kristian Felix, 10
Ruden Dyer, 3
Greyson Meyers, 6
Luther Cooke, 7

$ sort list.txt
Adelina Rowland, 4
Ananya Lamb, 5
Cieran Wilks, 9
Greyson Meyers, 6
Hayden Mcfarlnd, 1
Kristian Felix, 10
Lauryn Fuents, 8
Luther Cooke, 7
Ruden Dyer, 3
Shyam Head, 2

Now sort the data in the reverse order.

$ sort -r list.txt
Shyam Head, 2
Ruden Dyer, 3
Luther Cooke, 7
Lauryn Fuents, 8
Kristian Felix, 10
Hayden Mcfarlnd, 1
Greyson Meyers, 6
Cieran Wilks, 9
Ananya Lamb, 5
Adelina Rowland, 4

Let’s shuffle the data.

$ sort -R list.txt
Cieran Wilks, 9
Greyson Meyers, 6
Adelina Rowland, 4
Kristian Felix, 10
Luther Cooke, 7
Ruden Dyer, 3
Lauryn Fuents, 8
Hayden Mcfarlnd, 1
Ananya Lamb, 5
Shyam Head, 2

Let’s make it more complex. This time we sort the data according to the second field, which is the numeric value, and save the output in another file using the -o option.

$ sort -n -k2 -t ',' -o sorted_list.txt list.txt
$ ls
list.txt  sorted_list.txt
$ cat sorted_list.txt
Hayden Mcfarlnd, 1
Shyam Head, 2
Ruden Dyer, 3
Adelina Rowland, 4
Ananya Lamb, 5
Greyson Meyers, 6
Luther Cooke, 7
Lauryn Fuents, 8
Cieran Wilks, 9
Kristian Felix, 10

Here we used -n to sort in numerical order, -k to specify the field to sort (2 in this case), -t to indicate the delimiter or field-separator (a comma), and -o to save the output in the file sorted_list.txt.
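The same options combine with -r and with other tools in a pipeline. As a sketch (not part of the original demo), the following prints the three entries with the highest numbers. It uses -k2,2 rather than -k2 to limit the sort key to the second field alone, since -k2 on its own means “from field 2 to the end of the line”. The temporary directory is an assumption made to keep the example self-contained.

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/list.txt" <<'EOF'
Cieran Wilks, 9
Adelina Rowland, 4
Hayden Mcfarlnd, 1
Ananya Lamb, 5
Shyam Head, 2
Lauryn Fuents, 8
Kristian Felix, 10
Ruden Dyer, 3
Greyson Meyers, 6
Luther Cooke, 7
EOF

# -t ',' : fields are separated by commas
# -k2,2  : the sort key is field 2 only
# -n -r  : compare numerically, highest first
# head   : keep the first three lines of the sorted output
top3=$(sort -t ',' -k2,2 -n -r "$tmpdir/list.txt" | head -n 3)
echo "$top3"
rm -rf "$tmpdir"
```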

sed

Sed is a stream editor that filters and transforms text. By default it writes the result to standard output and leaves the input file untouched, so we are changing only the output, not the file. We can also save the changes in a new file if needed. Sed comes with a lot of options that are useful in filtering or editing the data.

The syntax for sed is:

$ sed [OPTION] 'PATTERN' filename

Some of the options and commands used with sed:

-n : Option to suppress the automatic printing of each line
p : Command to print the current line
d : Command to delete the current line
q : Command to quit the sed script

sed demo

Let’s see sed in action. We start with the file data, whose fields indicate number, name, age, and operating system.

Printing the lines twice if they occur in a specific range of lines.

$ cat data
1    Vicky Grant      20   Linux
2    Nora Burton    19   Mac
3    Willis Castillo   21  Windows
4    Gilberto Mack 30   Windows
5    Aubrey Hayes  17   windows
6    Allan Snyder    21   mac
7    Freddie Dean   25   linux
8    Ralph Martin    19   linux
9    Mindy Howard  20   Mac

$ sed '3,7 p' data
1    Vicky Grant      20   Linux
2    Nora Burton    19   Mac
3    Willis Castillo   21  Windows
3    Willis Castillo   21  Windows
4    Gilberto Mack 30   Windows
4    Gilberto Mack 30   Windows
5    Aubrey Hayes  17   windows
5    Aubrey Hayes  17   windows
6    Allan Snyder    21   mac
6    Allan Snyder    21   mac
7    Freddie Dean   25   linux
7    Freddie Dean   25   linux
8    Ralph Martin    19   linux
9    Mindy Howard  20   Mac

Here the operation is specified in single quotes indicating lines 3 through 7 and using ‘p’ to print the pattern found. The default behavior of sed is to print every line after parsing it. This means lines 3 through 7 appear twice because of the ‘p’ instruction.

So how can you print only specific lines from the file? Use the ‘-n’ option to suppress the default printing, so that only the lines printed by ‘p’ appear in the output.

$ sed -n '3,7 p' data
3    Willis Castillo   21  Windows
4    Gilberto Mack 30   Windows
5    Aubrey Hayes  17   windows
6    Allan Snyder    21   mac
7    Freddie Dean   25   linux

Only lines 3 through 7 appear when ‘-n’ is used.

Omitting specific lines from the file. This uses the ‘d’ command to delete the lines from the output.

$ sed '3 d' data
1    Vicky Grant      20   Linux
2    Nora Burton    19   Mac
4    Gilberto Mack 30   Windows
5    Aubrey Hayes  17   windows
6    Allan Snyder    21   mac
7    Freddie Dean   25   linux
8    Ralph Martin    19   linux
9    Mindy Howard  20   Mac

$ sed '5,9 d' data
1    Vicky Grant      20   Linux
2    Nora Burton    19   Mac
3    Willis Castillo   21  Windows
4    Gilberto Mack 30   Windows
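The q command from the list of sed commands is not exercised in this demo. As a minimal sketch, ‘3 q’ prints lines 1 through 3 as usual and then quits without reading the rest of the file, which gives the same output as the ‘4,9 d’ style of deletion but stops early. The temporary directory below is an assumption to keep the example self-contained.

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/data" <<'EOF'
1    Vicky Grant      20   Linux
2    Nora Burton    19   Mac
3    Willis Castillo   21  Windows
4    Gilberto Mack 30   Windows
5    Aubrey Hayes  17   windows
6    Allan Snyder    21   mac
7    Freddie Dean   25   linux
8    Ralph Martin    19   linux
9    Mindy Howard  20   Mac
EOF

# Lines are printed by default; at line 3 sed prints it and then quits.
first3=$(sed '3 q' "$tmpdir/data")
echo "$first3"
rm -rf "$tmpdir"
```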

Searching for a specific keyword in the file.

$ sed -n '/linux/ p' data
7    Freddie Dean   25   linux
8    Ralph Martin    19   linux

$ sed -n '/linux/I p' data
1    Vicky Grant      20   Linux
7    Freddie Dean   25   linux
8    Ralph Martin    19   linux

In these examples the regular expression appears between ‘/ /’. If the file contains the same word in a different case, we add “I” after the pattern to make the search case insensitive (the “I” flag is a GNU sed extension). Recall that -n suppresses the default printing, so only the matching lines appear in the output.
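On a sed without the “I” flag, a bracket expression is one portable way to get a similar result for a known word. The sketch below is an alternative technique, not the article's method: ‘[Ll]inux’ matches both “Linux” and “linux”, though unlike “I” it would not match “LINUX”. The temporary directory is an assumption for self-containment.

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/data" <<'EOF'
1    Vicky Grant      20   Linux
2    Nora Burton    19   Mac
3    Willis Castillo   21  Windows
4    Gilberto Mack 30   Windows
5    Aubrey Hayes  17   windows
6    Allan Snyder    21   mac
7    Freddie Dean   25   linux
8    Ralph Martin    19   linux
9    Mindy Howard  20   Mac
EOF

# [Ll] matches either an uppercase or a lowercase first letter.
matches=$(sed -n '/[Ll]inux/ p' "$tmpdir/data")
echo "$matches"   # prints lines 1, 7 and 8 of the file
rm -rf "$tmpdir"
```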

Replacing the words in the file.

$ sed 's/linux/linus/' data
1    Vicky Grant      20   Linux
2    Nora Burton    19   Mac
3    Willis Castillo   21  Windows
4    Gilberto Mack 30   Windows
5    Aubrey Hayes  17   windows
6    Allan Snyder    21   mac
7    Freddie Dean   25   linus
8    Ralph Martin    19   linus
9    Mindy Howard  20   Mac

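Here ‘s/ / /’ is the substitute command. The pattern to search for appears between the first two ‘/’, and the replacement text appears between the second and third. By default only the first match on each line is replaced.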
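Two follow-ups to the substitution demo, sketched under the assumption of a made-up one-line input: adding the g flag replaces every match on a line instead of only the first, and redirecting the output with > saves the result in a new file while the original stays unchanged. The file names data and data_new in the temporary directory are illustrative only.

```shell
# Without g only the first match on the line is replaced; with g, all of them.
first_only=$(echo 'linux and linux' | sed 's/linux/linus/')
every=$(echo 'linux and linux' | sed 's/linux/linus/g')
echo "$first_only"   # linus and linux
echo "$every"        # linus and linus

# Saving the edited text in a new file; the input file is left as it was.
tmpdir=$(mktemp -d)
printf '%s\n' '7 Freddie Dean 25 linux' > "$tmpdir/data"
sed 's/linux/linus/' "$tmpdir/data" > "$tmpdir/data_new"
original=$(cat "$tmpdir/data")
changed=$(cat "$tmpdir/data_new")
echo "$original"   # 7 Freddie Dean 25 linux
echo "$changed"    # 7 Freddie Dean 25 linus
rm -rf "$tmpdir"
```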

tr

The tr command will translate or delete characters. It can transform the lowercase letters to uppercase or vice versa, eliminate repeating characters, and delete specific characters.

One quirk of tr is that it does not take file names as arguments the way wc, sort and sed do. It reads only from standard input, so we use “|” (the pipe symbol) to provide input to the tr command.

$ cat filename | tr [OPTION]

Some options used with tr:

-d : To delete the characters in the first set from the output
-s : To replace repeated characters with a single occurrence

tr demo

Now let’s use the tr command with the file letter to convert all the characters from lowercase to uppercase.

$ cat letter
Linux is too easy to learn,
And you should try it too.

$ cat letter | tr 'a-z' 'A-Z'
LINUX IS TOO EASY TO LEARN,
AND YOU SHOULD TRY IT TOO.

Here ‘a-z’ ‘A-Z’ denotes that we want to convert characters in the range from “a” to “z” from lowercase to uppercase.
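Because tr reads standard input, shell input redirection with < works as well as a pipe from cat. The following is a small sketch of that alternative form, recreating the letter file in a temporary directory (an assumption for self-containment).

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'Linux is too easy to learn,' 'And you should try it too.' > "$tmpdir/letter"

# The shell opens the file and connects it to tr's standard input.
upper=$(tr 'a-z' 'A-Z' < "$tmpdir/letter")
echo "$upper"
rm -rf "$tmpdir"
```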

Deleting the “o” character from the file.

$ cat letter | tr -d 'o'
Linux is t easy t learn,
And yu shuld try it t.

Squeezing the character “o” means that wherever “o” is repeated consecutively in a line, the run is replaced by a single “o”.

$ cat letter | tr -s 'o'
Linux is to easy to learn,
And you should try it to.
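Squeezing is perhaps most often applied to whitespace. As an illustrative sketch, the line below takes the first record of the data file from the sed demo and collapses each run of spaces into a single space.

```shell
# Each run of consecutive spaces becomes one space.
squeezed=$(printf '1    Vicky Grant      20   Linux\n' | tr -s ' ')
echo "$squeezed"   # 1 Vicky Grant 20 Linux
```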

Conclusion

This was a quick demonstration of the wc, sort, sed and tr commands. These commands make it easy to manipulate text files on the terminal in a quick and efficient way. You can use the man command to learn more about each of them.

26 Comments

William gupton

This is great..quick and to the point

July 26, 2021
James

Ooh, can I add my favourite use of sort?

sort | uniq -c | sort -n

For use on logfiles: use grep to select the lines you’re interested in, something like cut to select the resource you’re interested in (something like a user agent, a user, or a file), and then pipe that to sort | uniq -c | sort -n. Gives you a list of how often the resource turns up in the logfile, which gives you a pointer to where to start looking for further problems.

Worked example on a mail server with Exim installed:

grep -h " <= " /var/log/exim/main.log* | cut -d' ' -f5 | cut -d@ -f2 | sort | uniq -ic | sort -n

(all one line.)

Which domains email us the most often?

July 26, 2021
Daniel

What has happened that one needs to write a blog post to talk about basic Unix programs? Who are the targeted readers of this post?

This isn’t a good sign. If this is true that many Linux users don’t know about the shell, we are doing something wrong, the way they are introduced to the system needs to be rethinked.

July 26, 2021
- SigmaSquadron
  
  It’s a good thing, IMO. Linux has always been behind macOS and Windows in terms of GUI usage, and if there’s a number of users who solely use the GUI as their primary method of interfacing with the system, it shows that the desktop front of Linux evolved past the need to share space with the CLI.
  
  Besides, those users can always turn to the LFS manual to fully learn what makes Linux tick. It’s a good thing to learn new things on computing, but most of the time you just want a good GUI-only computer to get some work done.
  
  I guess the way Linux works makes this possible. Distros like Fedora Workstation will probably never give much thought to the CLI, while distros targeted for more advanced users, like Arch, Gentoo, LFS or Slackware will pay less attention to providing a good GUI for newcomers. It’s all about different audiences and the choice to use what you want — be that a GUI-only distro, a mix of both interfaces, or a CLI-only distro.
  
  July 26, 2021
- Darvond
  
  Remember, not everyone is a Unix Guru or Amiga veteran. Most are probably washing ashore from the rocky shoals of Mint, Ubuntu, Pop-OS and other such things which will try their damnedest to hide the “underbelly”.
  
  And given that most UNIX were coded with the constraints of punchcards and 300 baud terminals, there is a lot of “inside baseball”. especially given between aa and zz there are 676 potential combinations, and that’s not counting switches. Even something as simple as ls goes from -a to -z.
  
  Is cal Calculate? Nope. It’s calendar.
  
  Demystifying the command line is always good.
  
  July 27, 2021
- lucky thousand
  
  https://xkcd.com/1053/
  
  July 28, 2021
rex fury

Tricks UP their sleeves 🙂

July 26, 2021
rex fury

I love Linux from the command line! Screen for multiple consoles, Midnight Commander for files, No, I really don’t use Vim that much. 🙂

July 26, 2021
xenlo

there is a mistake in “wc demo” section, the last code box is the copy-paste of the previous one. It should be

wc -m

in place of

-w

July 26, 2021
- Gregory Bartholomew
  
  Thanks xenlo. I’ve made the correction.
  
  July 26, 2021
Ron Olson

wc has another awesome benefit; because of pipes, you can pipe output to wc which works great for finding out how many items are in a directory, e.g., “ls -l | wc -l”

July 26, 2021
james miller

Nice introduction to some handy commands. I am fairly sure that all the commands can be used as streams, with pipes, certainly sed and wc can.
I am currently using :
ls |wc -l to list how many files are in a directory, for example.

July 26, 2021
- y0umu
  
  isn’t it
  
  ls -1 | wc -l
  
  ?
  
  August 2, 2021
  - Gregory Bartholomew
    
    Some programs vary their output depending on whether their output is connected to an interactive terminal or to a pipe. ls is one of them. Try it:
    
    ls | cat
    
    August 2, 2021
Jasper Hartline

When doing %post processing in rpmbuild sessions to build an RPM package on Fedora I’ve used 6 command utilities in line with pipes to get what I need its so awesome the pipe is the first resort.

July 26, 2021
You Really Don't Want to Know

perl

July 26, 2021
Sampsonf

Thanks for sharing.

For sed, when it will modify the data file, when it will modify the output only?

July 27, 2021
Wolfgang Marx

Great Overview! Thanks, learned something again:)

July 27, 2021
edier88

Excellent article.
A full tutorial of awk and sed would be nice too.
Thank you!

July 27, 2021
- Jared G
  
  I second this! Giving a “full” tutorial within a single article is surely impossible, but there are some common use cases that would be helpful to cover.
  
  Whenever performing some rote task in my everyday administrative life, I ask myself, “Can I automate this somehow?” Often, I wind up writing a shell script to do just that. In my experience, this is one of the best ways to deeply learn your way around tools like these.
  
  August 3, 2021
Ok5ieGh3

The tr-example makes useless use of cat, instead of cat letter | tr one should use tr < letter

July 27, 2021
Nildo

Excelente postagem!
Thank you!

July 27, 2021
Phoenix

When demonstrating “tr” (and “sed”), I have the feeling that “cut” should have been mentioned alongside with it as it is as easy and useful as well. As with all mentioned programs, it works tremendously well capturing (piping) the output of the previous command and further refining the output.

Example:
cut -d":" -f1,3-4,6 /etc/passwd

“-d” uses delimiter (in this case “:”)
“-f” display only fields delimited by delimiter (here: fields 1, 3 to 4 and 6)

July 27, 2021
Leslie Satenstein, Montreal,Que,Canada

This article is going to be my cheat sheet. In a few well designed examples, I have learned how to make better use of, or in fact, consider using the above described command line tools.

Thank You !!

July 29, 2021
Ian

There is an error in the sed demo.
The example file (data) has ‘linux’ in the first line.
Part way through the demonstration, that line has changed to ‘Linux’ so that the use of case insensitive can be shown

August 18, 2021
- Gregory Bartholomew
  
  Thanks Ian. I’ve attempted to correct the example file so that the first line contains ‘Linux’ from the beginning.
  
  August 18, 2021