The Fedora distribution is a full-featured operating system with an excellent graphical desktop environment. You can point and click your way through just about any typical task easily. All of this ease of use, however, masks the powerful command line under the hood. This article is part of a series that shows you some common command line utilities. So let's drop into the shell and have a look at cut.
Often when you work in the command line, you are working with text files. Sometimes these files may be quite long. Reading them in their entirety, while feasible, can be time consuming and prone to errors. In this installment you’ll learn how to extract content from text files, and get the information you want from them.
It's important to recognize that there are many ways to accomplish similar command line tasks in Fedora. The Fedora repositories include entire language systems for parsing and working with text, for example, and there are multiple command line utilities available for just about any purpose conceivable in the shell. This article focuses on just a few of those utilities, using them to extract some information from a file and present it in a readable format.
Making the cut
To illustrate, use a standard, sizable file that exists on every system: /etc/passwd. As seen in a prior article in this series, you can execute the cat command to view an entire file:
$ cat /etc/passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
...
This file contains information on all accounts present on the system. It has a specific format: each line describes one account, with seven fields separated by colons. In order, those fields are the account name, a password placeholder, the user ID (UID), the group ID (GID), a comment (GECOS) field, the home directory, and the login shell.
Imagine that you want a simple list of all the account names on the system. If only you could cut out just the name value from each line. This is where the cut command comes in handy! This command treats its input one line at a time, and extracts a specific part of each line.
The cut command provides several options for selecting parts of a line, and this example needs two of them: -d, which specifies the delimiter to use, and -f, which specifies which field of the line to cut. The -d option lets you declare the delimiter that separates values in a line; in /etc/passwd, that delimiter is a colon (:). The -f option lets you choose which field value or values to extract. So for this example, the command is:
$ cut -d: -f1 /etc/passwd
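To see how the -f option behaves with predictable output, here is a small sketch using a sample file in the same passwd format (the file name sample.txt and its two lines are made up for illustration). Note that -f also accepts comma-separated lists and ranges of fields:

```shell
# Build a tiny sample in passwd format so the output is predictable.
printf 'root:x:0:0:root:/root:/bin/bash\nbin:x:1:1:bin:/bin:/sbin/nologin\n' > sample.txt

# Field 1 only: the account names.
cut -d: -f1 sample.txt
# root
# bin

# -f also takes lists and ranges, e.g. the name and shell together:
cut -d: -f1,7 sample.txt
# root:/bin/bash
# bin:/sbin/nologin

rm -f sample.txt
```

By default cut reuses the input delimiter between the selected fields; the --output-delimiter option changes that if you prefer another separator.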
That's great, it worked! But the output goes to standard output, which in a terminal session at least means the screen. What if you need the information for another task to be done later? It would be really nice to save the output of the cut command into a text file. Fortunately, the shell has a builtin operator for exactly that: output redirection (>).
$ cut -d: -f1 /etc/passwd > names.txt
This will place the output of cut into a file called names.txt and you can check the contents with cat:
$ cat names.txt
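One detail worth knowing about redirection: > creates the target file, or overwrites it if it already exists, while >> appends to it. A quick sketch (the file name demo.txt is just for illustration):

```shell
# > creates or truncates the file; >> appends to it.
printf 'one\n' > demo.txt
printf 'two\n' >> demo.txt
cat demo.txt
# one
# two

rm -f demo.txt
```

So if you rerun a cut command with > against the same file, the old contents are replaced rather than accumulated.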
With two commands and one shell operator, it was easy to identify the file's structure using cat, extract the desired field using cut, and redirect the extracted information to another file, saving it for later use.
Photo by Joel Mbugua on Unsplash.
Cut is great for what it does. You will quickly run into situations with multiple delimiters and inconsistent lengths that awk is better suited to handle.
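To illustrate that point, here is a minimal sketch: awk's -F option plays the same role as cut's -d for a fixed delimiter, and awk's default behavior handles the ragged whitespace that trips up cut. (The input strings below are made up for illustration.)

```shell
# Equivalent of "cut -d: -f1" using awk's -F delimiter option:
printf 'root:x:0:0:root:/root:/bin/bash\n' | awk -F: '{print $1}'
# root

# Where awk shines: by default it splits on runs of whitespace,
# including mixed spaces and tabs, which cut cannot do.
printf 'alpha   beta\tgamma\n' | awk '{print $2}'
# beta
```

With cut, a run of three spaces is three delimiters, producing empty fields; awk's default field splitting collapses them into one.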
Yeah, I was going to talk a bit about awk and gawk, since they are much better at parsing more complex files, but the idea (which was started by Paul Frields) is a series of articles about useful command line tools. So I would expect awk and gawk to appear in one at some point.
Looking forward to the awk series
I generally do cut -d: -f1 /etc/passwd | sort > names.txt
Because I like my list in alphabetical order.
That is the beauty of the command line tools available out of the box on a Linux system. They are flexible, and combined with pipes, redirects and other tools, they offer capabilities that go well beyond their standalone, do-one-thing-and-do-it-well roles.
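One sketch of that composability, building on the article's cut example: count how many accounts use each login shell (field 7 of the passwd format). The sample input below is made up so the result is predictable; on a real system you would feed /etc/passwd in directly.

```shell
# Extract the shell field, group identical lines, count them,
# and list the most common shells first.
printf 'root:x:0:0:root:/root:/bin/bash\nbin:x:1:1:bin:/bin:/sbin/nologin\ndaemon:x:2:2:daemon:/sbin:/sbin/nologin\n' |
  cut -d: -f7 | sort | uniq -c | sort -rn
# prints each shell prefixed by its count, most frequent first,
# e.g. "2 /sbin/nologin" then "1 /bin/bash"
```

Each stage does one small job: cut selects, sort groups, uniq -c counts, and sort -rn ranks. None of them alone could produce this report.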
Stuart D Gathman
I remember an SQL implementation a while back that compiled queries to shell scripts that did grep, cut, join, sort. The data was in plain text files. It was faster than database implementations for up to thousands of records. Obviously, it is not going to scale, but so many applications are within the golden range for plain text files.
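A tiny sketch of that idea: the join utility behaves like an SQL inner join on a key column, provided both files are sorted on that key. The two files and their contents below are hypothetical, just to show the shape:

```shell
# Two small colon-delimited "tables" keyed on a user id,
# both already sorted on the key (a requirement of join).
printf '1:alice\n2:bob\n' > users.txt
printf '1:admin\n2:staff\n' > roles.txt

# join matches rows on the first field of each file:
join -t: users.txt roles.txt
# 1:alice:admin
# 2:bob:staff

rm -f users.txt roles.txt
```

With grep as WHERE, cut as SELECT, sort as ORDER BY and join as JOIN, a surprising amount of query logic maps onto plain coreutils.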
You may want to write about Perl (I especially recommend Perl 6) instead of awk, or at least mention Perl in your awk article.
Nice article. I have used ‘cut’ for years, but learnt awk a couple of years ago and was blown away by its awesome power.