Command line quick tips: Cutting content out of files

The Fedora distribution is a full-featured operating system with an excellent graphical desktop environment. A user can point and click their way through just about any typical task. All of this ease of use masks the details of a powerful command line under the hood. This article is part of a series that shows you some common command line utilities. So let’s drop into the shell and have a look at cut.

Often when you work in the command line, you are working with text files. Sometimes these files may be quite long. Reading them in their entirety, while feasible, can be time consuming and prone to errors. In this installment you’ll learn how to extract content from text files, and get the information you want from them.

It’s important to recognize that there are many ways to accomplish similar command line tasks in Fedora. The Fedora repositories include, for example, entire language systems for parsing and working with text. There are also multiple command line utilities available for just about any purpose conceivable in the shell. This article focuses on just a few of those utilities, using them to extract some information from a file and present it in a readable format.

Making the cut

To illustrate, use a standard, sizable file on the system such as /etc/passwd. As seen in a prior article in this series, you can execute the cat command to view an entire file:

$ cat /etc/passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
...

This file contains information on all accounts present on the system. It has a specific format:

name:password:user-id:group-id:comment:home-directory:shell

Imagine that you simply want a list of all the account names on the system. If only you could cut out the name value from each line! This is where the cut command comes in handy. It treats its input one line at a time and extracts a specific part of each line.

The cut command provides several options for selecting parts of a line. This example needs two of them: -d, which declares the delimiter that separates values in a line, and -f, which chooses the field or fields to extract. In /etc/passwd a colon (:) separates the values, and the account name is the first field. So for this example the command entered would be:

$ cut -d: -f1 /etc/passwd
root
bin
daemon
adm
...
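The -f option is not limited to a single field. As a minimal sketch, the example below selects a list of fields (1,7) and a range of fields (1-3). It runs against a small temporary file rather than the real /etc/passwd so the result is predictable. The two sample lines and the variable names are assumptions made up for this illustration.

```shell
# Build a two-line stand-in for /etc/passwd (sample data for illustration only).
sample=$(mktemp)
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'bin:x:1:1:bin:/bin:/sbin/nologin' > "$sample"

# A comma-separated list picks individual fields: name (1) and shell (7).
# cut joins the selected fields with the same delimiter it split on.
names_and_shells=$(cut -d: -f1,7 "$sample")
echo "$names_and_shells"

# A dash picks a range: fields 1 through 3 (name, password, user-id).
first_three=$(cut -d: -f1-3 "$sample")
echo "$first_three"

# Clean up the temporary file.
rm -f "$sample"
```

With the sample data above, the first command prints each name followed by its shell (for example root:/bin/bash), and the second prints the first three fields of each line (for example root:x:0).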

That’s great, it worked! But the output goes to standard output, which in a terminal session means the screen. What if you need the information later for another task? It would be really nice to put the output of the cut command into a text file to save it. The shell has an easy built-in feature for such a task: the redirection operator (>).

$ cut -d: -f1 /etc/passwd > names.txt

This will place the output of cut into a file called names.txt and you can check the contents with cat:

$ cat names.txt
root
bin
daemon
adm
...
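One detail worth knowing about redirection: > replaces the contents of the target file if it already exists, while >> appends to it. The sketch below shows the difference by counting lines after each step. It works in a temporary directory with a two-line sample file, so the paths, sample data, and variable names are assumptions for illustration only.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)

# A two-line stand-in for /etc/passwd (sample data for illustration only).
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'bin:x:1:1:bin:/bin:/sbin/nologin' > "$workdir/sample.txt"

# > creates names.txt, or replaces its contents if the file already exists.
# Running the same command twice still leaves only two lines.
cut -d: -f1 "$workdir/sample.txt" > "$workdir/names.txt"
cut -d: -f1 "$workdir/sample.txt" > "$workdir/names.txt"
overwrite_count=$(wc -l < "$workdir/names.txt")

# >> appends to the end of the file instead of replacing it.
cut -d: -f1 "$workdir/sample.txt" >> "$workdir/names.txt"
append_count=$(wc -l < "$workdir/names.txt")

echo "lines after >: $overwrite_count"
echo "lines after >>: $append_count"

# Clean up the temporary directory.
rm -rf "$workdir"
```

Here the file holds two lines after the repeated > commands and four lines after the >> command.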

With two commands and one shell operator, it was easy to inspect a file using cat, extract information from it using cut, and redirect that information into another file for later use.


Photo by Joel Mbugua on Unsplash.


8 Comments

  1. Mark

    Cut is great for what it does. You will quickly run into situations with multiple delimiters and inconsistent lengths that awk is better suited to handle.

    • Yeah, I was going to talk a bit about awk and gawk since they are much better at parsing more complex files, but the idea (which was started by Paul Frields) is a series of articles about useful command line tools. So I would suspect awk and gawk to appear in one at some point.

  2. Leslie Satenstein

    I generally do cut -d: -f1 /etc/passwd | sort > names.txt

    Because I like my list in alphabetical order.

    • That is the beauty of the command line tools normally available out of the box on a Linux system. They are flexible and, combined with piping, redirects and other tools, offer capabilities that go beyond their standalone “do one thing and do it well” design.

      • I remember an SQL implementation a while back that compiled queries to shell scripts that did grep, cut, join, sort. The data was in plain text files. It was faster than database implementations for up to 1000s of records. Obviously, it is not going to scale – but so many applications are within the golden range for plain text files.

  3. Mark Senn

    You may want to write about Perl (I especially recommend Perl 6) instead of awk,
    or at least mention Perl in your awk article.

  4. rapra

    Nice article. I have used ‘cut’ for years, but learnt awk a couple of years ago and was blown away by its awesome power.
