Understanding ‘fmt’: the trivial document formatter

The ‘fmt’ command is a useful utility that reformats text files so the lines in each paragraph are roughly the same length.

Under Unix, everything is a text file. When Unix was first created at Bell Labs in the 1960s, one way the new system was put to work was formatting technical documents such as patent applications. Over time, Unix gained more command line utilities that worked on text files to create other kinds of documents.

But Unix wasn’t just a Bell Labs project. Later, Ken Thompson took a sabbatical from Bell and taught computer science classes at the University of California at Berkeley. While teaching these courses, Thompson introduced his students to Unix. The Berkeley students then began an effort to create new tools for Unix. This was the start of the Berkeley Software Distribution, more commonly known as “BSD.”

One new command that appeared in 3BSD, the third iteration of the BSD system, was fmt. This simple program had one goal: reformat text documents to make them easier to read. fmt read lines as they were given in the source files, and broke up lines that were too long so they would fit under a “threshold” or “goal” length. Shorter lines were glued together to approach the target line length.

Such a tool was immediately useful for writing text files without having to “reflow” a paragraph while writing it. We don’t think about this in modern word processors, which automatically “wrap” what we type to the next line. But in an era when text files were edited with plain text editors like ed or vi, often one line at a time, this “wrap” feature did not exist. A separate tool to reformat text files, to make each line in a paragraph about the same length, helped create professional-looking documents. fmt was also useful in writing emails, as a final step to reformat the body text to make it look better.

Using ‘fmt’ for text files

A common use for fmt is to reformat lines in a plain text file, such as an email message, so let’s start with a simple text example called t.txt:

From: Jim Hall
Subject: Testing the fmt command

The fmt command first appeared in 3BSD as a utility to help reformat email messages before
sending them. It aims to "even out" the lengths of lines
by breaking long lines between words and glue shorter lines together. The fmt command can also
preserve mail headers when asked,
and   tries   to   keep   any   long   spaces   between   words,
where it can.

Using fmt to “reflow” the text file wraps long lines and glues together shorter ones, resulting in a better-looking paragraph that is easier to read:

$ fmt t.txt
From: Jim Hall Subject: Testing the fmt command

The fmt command first appeared in 3BSD as a utility to help reformat
email messages before sending them. It aims to "even out" the lengths
of lines by breaking long lines between words and glue shorter lines
together. The fmt command can also preserve mail headers when asked,
and   tries   to   keep   any   long   spaces   between   words,
where it can.

The original BSD fmt had a neat feature where it could recognize mail header lines and print each on its own line. With BSD fmt, use the -m option to preserve formatting on the “From:” and “Subject:” lines:

$ fmt -m t.txt
From: Jim Hall
Subject: Testing the fmt command

The fmt command first appeared in 3BSD as a utility to help reformat
email messages before sending them. It aims to "even out" the lengths
of lines by breaking long lines between words and glue shorter lines
together. The fmt command can also preserve mail headers when asked,
and   tries   to   keep   any   long   spaces   between   words,
where it can.

The “target” or “goal” line length is 65 characters, with a maximum of 75 (ten characters longer than the “goal” length). To produce output with a different width, use the -w option, such as -w 55 to produce output at 55 characters wide:

$ fmt -m -w 55 t.txt
From: Jim Hall
Subject: Testing the fmt command

The fmt command first appeared in 3BSD as a utility to
help reformat email messages before sending them. It
aims to "even out" the lengths of lines by breaking
long lines between words and glue shorter lines
together. The fmt command can also preserve mail
headers when asked, and   tries   to   keep   any
long   spaces   between   words, where it can.

Note that the line with multiple spaces between words keeps the extra spacing in the output. The fmt command reads lines one at a time and only breaks them between words if the lines are too long. Any extra spaces in the middle of a line are preserved. If you don’t want the extra space, use the -s option to “squeeze” spaces together, resulting in a single space between words, and two spaces between sentences:

$ fmt -m -w 55 -s t.txt
From: Jim Hall
Subject: Testing the fmt command

The fmt command first appeared in 3BSD as a utility to
help reformat email messages before sending them.  It
aims to "even out" the lengths of lines by breaking
long lines between words and glue shorter lines
together.  The fmt command can also preserve mail
headers when asked, and tries to keep any long spaces
between words, where it can.

BSD fmt supports other options for more specialized tasks, such as -c to center lines and -p for indented paragraphs. Explore these options in the online manual using man fmt.

Using ‘fmt’ for nroff files

In the early days of Unix systems, the idea of a “desktop word processor” such as LibreOffice didn’t yet exist. Instead, authors formatted documents on Unix using nroff for plain text output, and troff for output destined for a phototypesetter. The nroff and troff document preparation systems used formatting instructions or requests as one- or two-letter commands starting with a “.” at the beginning of a line. For example, to start a new paragraph, an author might type:

.sp
.ti 4
This is a new paragraph. The "sp" instruction says to add one line
of blank space, and the "ti 4" request adds a temporary indent of
four spaces on the first line.

More typically, writers relied on macro packages that managed the hard work of formatting pages, creating paragraphs, adding bold and italic text, and other formatting. For example, to write a new indented paragraph using the -me macro package, an author would use the .pp macro. Bold text uses the .b macro, and italic text the .i macro.

Because writing documents in nroff or troff was an everyday task, fmt included an option that supported reformatting nroff source files without disturbing the formatting instructions. Let’s start with this sample nroff -me file, called t.me:

.pp
The
.b fmt
program   is   a   useful   utility   to
.i reformat
a text document.
It does this by breaking up long longs and gluing together shorter
ones.
The BSD version of
.b fmt
has a neat feature where it won't disturb nroff instructions,
which start with a "." at the beginning of a line.

To “reflow” the text so it doesn’t look so untidy, just run fmt normally:

$ fmt t.me
.pp
The
.b fmt
program   is   a   useful   utility   to
.i reformat
a text document.  It does this by breaking up long longs and gluing
together shorter ones.  The BSD version of
.b fmt
has a neat feature where it won't disturb nroff instructions, which
start with a "." at the beginning of a line.

You can use the standard options to set a target line length and squeeze extra spaces together. The default action in fmt will avoid reformatting lines that look like nroff formatting requests:

$ fmt -w 55 -s t.me
.pp
The
.b fmt
program is a useful utility to
.i reformat
a text document.  It does this by breaking up long
longs and gluing together shorter ones.  The BSD
version of
.b fmt
has a neat feature where it won't disturb nroff
instructions, which start with a "." at the beginning
of a line.

This might be a problem for other text files that aren’t actually nroff or troff source files, but happen to include lines with a dot as the first character in a line. To format these files like any other text file, you can disable nroff special handling with the -n option:

$ fmt -w 55 -s -n t.me
.pp The .b fmt program is a useful utility to .i
reformat a text document.  It does this by breaking up
long longs and gluing together shorter ones.  The BSD
version of .b fmt has a neat feature where it won't
disturb nroff instructions, which start with a "." at
the beginning of a line.

Differences in GNU ‘fmt’

The GNU Project aims to create Free Software workalikes to the original Unix commands. Because fmt is a standard tool on BSD systems, GNU provided the fmt command in the “coreutils” package.

However, the GNU fmt command works a little differently. The basic feature of “split long lines and glue together shorter lines” is there, but the options differ. Let’s use the same t.txt file as above:

From: Jim Hall
Subject: Testing the fmt command

The fmt command first appeared in 3BSD as a utility to help reformat email messages before
sending them. It aims to "even out" the lengths of lines
by breaking long lines between words and glue shorter lines together. The fmt command can also
preserve mail headers when asked,
and   tries   to   keep   any   long   spaces   between   words,
where it can.

For example, in GNU fmt the -s option only splits long lines; it does not “refill” them by gluing shorter lines together:

$ fmt -s t.txt 
From: Jim Hall
Subject: Testing the fmt command

The fmt command first appeared in 3BSD as a utility to help reformat
email messages before
sending them. It aims to "even out" the lengths of lines
by breaking long lines between words and glue shorter lines together. The
fmt command can also
preserve mail headers when asked,
and   tries   to   keep   any   long   spaces   between   words,
where it can.

The -w option works the same, however:

$ fmt -w 55 t.txt 
From: Jim Hall Subject: Testing the fmt command

The fmt command first appeared in 3BSD as a utility to
help reformat email messages before sending them. It
aims to "even out" the lengths of lines by breaking
long lines between words and glue shorter lines
together. The fmt command can also preserve mail
headers when asked, and   tries   to   keep   any
long   spaces   between   words, where it can.
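
GNU fmt has no counterpart to the BSD -m option, which is why the two header lines run together in the output above. One possible workaround is to copy the header through unchanged and send only the message body to fmt. This is a minimal sketch, assuming the file starts with a header block that ends at the first blank line; the function name fmt_mail is made up for this example:

```shell
# fmt_mail FILE: hypothetical helper, not part of fmt itself.
# Assumes FILE begins with mail headers followed by a blank line.
fmt_mail() {
    # Print everything up to and including the first blank line as-is.
    sed '/^$/q' "$1"
    # Delete that same span, then reformat whatever follows it.
    sed '1,/^$/d' "$1" | fmt
}
```

Running fmt_mail t.txt should keep “From:” and “Subject:” on separate lines while the paragraph below them is reflowed as usual.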

To get the same behavior as BSD fmt -s, you need to use the -u option with GNU fmt. This forces one space between words, and two after sentences:

$ fmt -u t.txt
From: Jim Hall Subject: Testing the fmt command

The fmt command first appeared in 3BSD as a utility to help reformat
email messages before sending them. It aims to "even out" the lengths
of lines by breaking long lines between words and glue shorter lines
together. The fmt command can also preserve mail headers when asked,
and tries to keep any long spaces between words, where it can.
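
Because the flag for squeezing spaces differs between the two implementations, a script that has to run on both can choose the flag at run time. The following is a sketch under one assumption: that only GNU fmt accepts --version and prints “GNU” in reply, while other versions reject the option. The name squeeze_fmt is hypothetical:

```shell
# squeeze_fmt [ARGS...]: hypothetical wrapper, not part of fmt itself.
# Assumption: only GNU fmt answers --version with text containing "GNU".
squeeze_fmt() {
    if fmt --version </dev/null 2>/dev/null | grep -q 'GNU'; then
        fmt -u "$@"    # GNU: uniform spacing
    else
        fmt -s "$@"    # BSD: squeeze spaces
    fi
}
```

You could then write squeeze_fmt -w 55 t.txt on either system, since both versions accept -w.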

However, GNU fmt has one handy feature not present in the BSD fmt program: you can specify a prefix string with the -p option. fmt then reformats only the lines that begin with that prefix, and reattaches the prefix to every line it prints. This is incredibly useful if you need to reformat an email message that you are replying to, and your email client has inserted > before each line:

$ sed -e 's/^/> /' t.txt
> From: Jim Hall
> Subject: Testing the fmt command
> 
> The fmt command first appeared in 3BSD as a utility to help reformat email messages before
> sending them. It aims to "even out" the lengths of lines
> by breaking long lines between words and glue shorter lines together. The fmt command can also
> preserve mail headers when asked,
> and   tries   to   keep   any   long   spaces   between   words,
> where it can.

Piping that quoted text through fmt with the -p option reflows it and keeps the > prefix:

$ sed -e 's/^/> /' t.txt | fmt -p '>'
> From: Jim Hall Subject: Testing the fmt command
>
> The fmt command first appeared in 3BSD as a utility to help reformat
> email messages before sending them. It aims to "even out" the lengths
> of lines by breaking long lines between words and glue shorter lines
> together. The fmt command can also preserve mail headers when asked, and
> tries   to   keep   any   long   spaces   between   words, where it can.
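
The same option can help with other prefixed text, such as comments in a shell script or a configuration file, because lines that do not start with the prefix are copied through unchanged. Here is a minimal sketch that assumes GNU fmt; the function name reflow_comments and the default width of 72 are choices made for this example:

```shell
# reflow_comments FILE [WIDTH]: hypothetical helper; assumes GNU fmt.
# Reflows lines that start with "#" and leaves all other lines alone.
reflow_comments() {
    fmt -p '#' -w "${2:-72}" "$1"
}
```

For example, reflow_comments script.sh 60 would rewrap the comments in a file named script.sh (a made-up name) to 60 columns. Review the result before you overwrite the original file.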

The trivial document formatter

The fmt command is a handy tool in the technical writer’s toolkit. It can make light work of “reflowing” paragraphs in a plain text file. I regularly use fmt to reformat Markdown files. This is an excellent use case, because regular paragraphs in Markdown files are just paragraphs. However, be careful if your Markdown file includes tables, code blocks, or other formatting that should not be rearranged.
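
One way to guard against that is to send only the prose to fmt and copy fenced code blocks through unchanged. The sketch below is one possible approach, not a feature of fmt: it uses awk to track whether the current line is inside a fence that begins with three backticks. The function name mdfmt and the default width of 72 are assumptions for this example, and tables or indented code are not detected:

```shell
# mdfmt FILE [WIDTH]: hypothetical helper, not part of fmt itself.
# Reflows prose with fmt but copies fenced code blocks through unchanged.
mdfmt() {
    awk -v width="${2:-72}" '
        BEGIN { cmd = "fmt -w " width }
        # "\140" is the octal escape for a backtick, so this test matches
        # a line that starts with a three-backtick code fence.
        index($0, "\140\140\140") == 1 {
            close(cmd)          # let fmt finish the prose collected so far
            in_code = !in_code  # a fence line opens or closes a code block
            print
            next
        }
        in_code { print; next } # inside a code block: copy the line as-is
        {
            fflush()            # write pending copied lines before fmt writes
            print | cmd         # prose: hand the line to fmt
        }
        END { close(cmd) }
    ' "$1"
}
```

Run it as mdfmt README.md 65 and compare the result with the original before replacing the file.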