The importance of semantics

It's important to keep semantics in mind for your technical writing, to help users and systems interpret meaning.

January 10, 2024

A classic conundrum in technical writing is how to set special terms about technology apart from normal everyday language, or whether it's even important to do that. When you write about a programming function called "read," for example, it's easy for a reader to get confused about what you mean, just as you may have been confused by my use of "read" (the function name) and "reader" (the theoretical person looking at your writing) in this very sentence.

To call attention to special tech terms, technical writers often use bold, italic, and code styles in documents. That does make terms look different from the everyday language of a document, but it's purely visual and not very descriptive. This is why many technical writers introduce semantics into their documentation.

What are semantics in technical writing?

The word "semantics" refers to the meaning of a word, but it's more than the kind of definition you'd find in a dictionary. When you write documentation and provide a classification for a specific term, you're defining the semantics of that word.

This sentence is not written with semantics in mind:

It's time to open a terminal.

To an employee at a train station or airport, that sentence is an instruction to invite travelers to a departure platform. To an electrician, the same sentence implies that a circuit needs to be extended. To a systems administrator, the sentence is an instruction to open an application.

There are probably other meanings, too. Here's the sentence written with semantics in mind:

It's time to open the terminal application.

And here's the sentence written in DocBook XML:

<para>
 It's time to open the <application>terminal</application> application.
</para>
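
Because the term is classified rather than merely styled, a program can ask the document what it means. Here's a minimal Python sketch, using only the standard library, that lists every application named in a piece of DocBook markup. The `applications` function name is my own invention for this example, not part of DocBook or any toolchain.

```python
import xml.etree.ElementTree as ET

# The paragraph from the article, used here as sample input.
PARA = """<para>
 It's time to open the <application>terminal</application> application.
</para>"""


def applications(xml_text):
    """Return the text of every <application> element, in document order."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter("application")]


if __name__ == "__main__":
    print(applications(PARA))
```

The same query against the plain sentence would have nothing to search for, because nothing in it says that "terminal" is software.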

Semantics for context

Context does a lot to help with semantics. Had my example sentence been preceded by text about computer networks, it would have been clear that I was referring to software. A paragraph discussing airport opening times, however, would have hinted that the sentence was actually about airline departures. That's usually good enough for a human reader, but it's not optimal for computer systems.

When you use a markup language like DocBook or HTML5, or a lightweight markup format like AsciiDoc or Markdown, you gain markers that make context explicit. For example, suppose you're writing a tutorial requiring your reader to enter commands into a computer and to verify that the output is correct. The difference between input and output may or may not be obvious to a human, but to a computer parsing your documentation there's no difference between input that looks like this:

kubectl get namespace

And output that looks like this:

NAME              STATUS   AGE
default           Active   19d
myapp             Active   1d

With a good markup language, however, you can specify what each element is for the computer:

<programlisting>
kubectl get namespace
</programlisting>
<screen><computeroutput>
NAME              STATUS   AGE
default           Active   19d
myapp             Active   1d
</computeroutput></screen>

That's a lot of metadata that your reader never has to see, but that can be invaluable to a computer.
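
As a rough illustration of what a computer can do with that metadata, here's a minimal Python sketch that separates the command from its expected output. It assumes the two listings above are wrapped in a `<section>` element so the parser has a single root. Both that wrapper and the `split_listing` function name are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: the two listings from the article,
# wrapped in a <section> so the parser has a single root element.
DOC = """<section>
<programlisting>
kubectl get namespace
</programlisting>
<screen><computeroutput>
NAME              STATUS   AGE
default           Active   19d
myapp             Active   1d
</computeroutput></screen>
</section>"""


def split_listing(xml_text):
    """Return (commands, outputs) as lists of stripped strings."""
    root = ET.fromstring(xml_text)
    # Text marked as a program listing is treated as input to run.
    commands = [el.text.strip() for el in root.iter("programlisting")]
    # Text marked as computer output is treated as the expected result.
    outputs = [el.text.strip() for el in root.iter("computeroutput")]
    return commands, outputs


if __name__ == "__main__":
    commands, outputs = split_listing(DOC)
    print("Run:", commands)
    print("Expect:", outputs)
```

A documentation test harness could, for instance, run each command and compare the result with the recorded output. That's only possible because the markup says which is which.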

Semantics for clarity

A graphical user interface (GUI) is the default for most modern computer systems, but as any tech support worker knows, GUIs can be surprisingly difficult to describe. An innocent statement like "Close the window" can confuse a new user, because each user's computer can be in a near-infinite number of states. If a user has opened more than one window, even by mistake, during a given task, then a seemingly obvious instruction can become unexpectedly vague.

Semantic markup helps solve this problem by providing the context a computer needs to be able to apply visual (and auditory, for screen readers) styles to different interface elements described in documentation. Here's a description of a task without semantic markup:

In the Install Wizard, navigate to Install and click Proceed.
Click the OK button to confirm.

Here's the same text with semantic markup:

<procedure><title>Installation</title>
 <step>
   <para>
     In the <guilabel>Install Wizard</guilabel>, navigate to
     <menuchoice>
       <guimenu>Install</guimenu>
       <guimenuitem>Proceed</guimenuitem>
     </menuchoice>.
   </para>
 </step>
 <step>
   <para>
     Click the <guibutton>OK</guibutton> button to confirm.
   </para>
 </step>
</procedure>

It reads almost like code, but that's because it is. With the information added by semantic markup, this text can be displayed using a unique style to differentiate a window label from a menu, button, icon, link, and so on.
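
To make that concrete, here's a minimal Python sketch of how a renderer might turn GUI elements into styled HTML. The `STYLES` mapping, its CSS class names, and the `render` function are all invented for this example. A real DocBook toolchain uses its own stylesheets.

```python
import xml.etree.ElementTree as ET
from html import escape

# Assumed style mapping: each GUI element gets its own CSS class.
# The class names are invented for this example.
STYLES = {
    "guilabel": "window-label",
    "guimenu": "menu",
    "guimenuitem": "menu-item",
    "guibutton": "button",
}


def render(element):
    """Render an element's mixed content as HTML, styling known GUI tags."""
    parts = [escape(element.text or "")]
    for child in element:
        inner = render(child)
        css = STYLES.get(child.tag)
        # Known GUI elements become styled spans; others pass through unstyled.
        parts.append(f'<span class="{css}">{inner}</span>' if css else inner)
        # Text that follows the child's closing tag belongs to the parent.
        parts.append(escape(child.tail or ""))
    return "".join(parts)


if __name__ == "__main__":
    step = ET.fromstring(
        "<para>Click the <guibutton>OK</guibutton> button to confirm.</para>"
    )
    print(render(step))
```

Because the style lives in one mapping rather than in the text, changing how every button looks (or sounds, in a screen reader) means editing a single line.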

How to use semantic markup

I've used DocBook to demonstrate semantic markup in this article. It's one of the most verbose and highly specific markup languages I know, but it's not the only markup language to provide semantic data. DocBook is designed for technical writing about computers, but there are other XML schemata that might be better suited to your industry.

Other markup languages, like HTML5, Markdown, and AsciiDoc, offer a sometimes sensible middle ground. Sacrificing specificity for simplicity means you must reduce several kinds of metadata to a single markup tag. For instance, you might declare that bold text denotes buttons, window labels, and tab names, while italic denotes menu names and menu items, and a monospace code font denotes the names of applications and code listings. You replace explicitness with strictly consistent style, which is better than nothing at all.
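
One way to keep such a convention strictly consistent is to write it down as data. Here's a minimal Python sketch of the style guide described above, emitting Markdown. The role names and the `mark` function are hypothetical, made up for this example.

```python
# Hypothetical style guide: each semantic role maps to one Markdown style.
# The role names are invented for this example.
STYLE_GUIDE = {
    "button": "bold",
    "window_label": "bold",
    "tab_name": "bold",
    "menu_name": "italic",
    "menu_item": "italic",
    "application": "code",
    "code_listing": "code",
}

# Markdown delimiters for each style.
DELIMITERS = {"bold": "**", "italic": "*", "code": "`"}


def mark(role, text):
    """Wrap text in the Markdown delimiter assigned to its semantic role."""
    delimiter = DELIMITERS[STYLE_GUIDE[role]]
    return f"{delimiter}{text}{delimiter}"


if __name__ == "__main__":
    print(f"In {mark('menu_name', 'File')}, choose {mark('menu_item', 'Save')},")
    print(f"then click {mark('button', 'OK')}.")
```

Notice the trade-off: once the text is rendered, a button and a window label are both just bold, so the specific meaning survives only in your style guide, not in the document itself.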

Semantics for the future

Technology marches ever onward, and the truth is that we don't know what computing will look like even a year from now. Nobody ever thought to invent a semantic tag for a telephone number because nobody imagined that you'd ever need to click on one to pass it to a telephone app. When cell phones became commonplace, it suddenly mattered that a phone number on a website was a phone number, and not just a string of digits.

It's impossible to predict all the kinds of semantics we'll need in future technologies. However, it's important to keep semantics in mind as you write, because you never know what technology might be able to make use of your metadata.

Seth Kenlon is a Unix and Linux geek, open source enthusiast, and tabletop gamer. Between gigs in the film industry and the tech industry (not necessarily exclusive of one another) he likes to design games and hack on Java or Lua code (also not necessarily exclusive of one another).