Getting started with DITA OT

DITA is a powerful technical writing technology that makes it easier to reuse content in new ways.

March 13, 2024

If you find yourself writing new documents by copying and pasting from other documents, that's an excellent opportunity to use DITA for documentation. DITA stands for the Darwin Information Typing Architecture, an open document standard maintained by the OASIS DITA Technical Committee.

DITA files use XML as markup. That means each file has a single parent data block (the root element) that contains all of the content. Technical writers can use three main types of DITA files:

DITA Concept, which describes a thing or process.
DITA Task, to list the steps to complete a task.
DITA Reference, to provide the key facts about something.

These documents rarely exist by themselves. Authors use DITA to build documentation from individual topics, each of which covers one subject. Let's say you need to write documentation about a new Linux distribution; you might write a Concept file that describes the Linux distribution, a Task file that lists the steps to install Linux on a computer, and a Reference file that lists the system requirements, such as how much memory or free disk space you need on your computer. Each of these DITA files is a topic that encapsulates one piece of information. You can combine them into a finished document using a DITA Map file.
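
As a rough sketch of how a DITA Map ties topics together, the map below references one Concept, one Task, and one Reference topic. The file names (about.dita, install.dita, requirements.dita) and the map title are assumptions made up for this illustration; substitute the names of your own topic files.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE map PUBLIC "-//OASIS//DTD DITA Map//EN" "map.dtd">
<map>
    <!-- Hypothetical title and file names, for illustration only -->
    <title>Linux Distribution Guide</title>
    <!-- Each topicref points to one topic file; the order here sets the order in the output -->
    <topicref href="about.dita" type="concept"/>
    <topicref href="install.dita" type="task"/>
    <topicref href="requirements.dita" type="reference"/>
</map>
```

Because the map only points to the topics, the same topic file can appear in several maps without being copied.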

A sample DITA Concept file

Let's explore the basics of writing a DITA Concept file. DITA files are just plain text files, using XML as markup. The DITA standard requires that DITA files reference a DTD in the document type declaration (DOCTYPE) at the start of the file. After the declaration, DITA files have a parent data block that is named after the topic type, such as <concept> for DITA Concept files. This parent block contains three main data blocks: a title, a short description, and some kind of body. For example, DITA Concept files use <conbody> for the body.

A sample DITA Concept file to describe a made-up Linux distribution called "TW Linux" might look like this:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<concept id="twlinux">
    <title>TW Linux</title>
    <shortdesc>TW Linux is a Linux distribution for technical writers.</shortdesc>
    <conbody>
        <p>TW Linux is a live Linux distribution designed for technical writers, and comes packed
            with tools and technologies for technical writing. TW Linux has a friendly and simple
            graphical user interface that makes it easy to start writing in DocBook, Markdown,
            LaTeX, DITA, HTML, and more using a variety of powerful desktop tools.</p>
    </conbody>
</concept>

The XML declaration on the first line is not strictly required for DITA files, but it is recommended.

All DITA files must have a unique identifier listed as the id= attribute of the parent data block. Inside this parent block, the DITA file contains <title> with the document's title, <shortdesc> to provide a one-line description of the document, and <conbody> with the contents of the article. The <p> tag starts a new paragraph within the body.
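
The other topic types follow the same pattern of title, short description, and body. As a minimal sketch, a Task file uses <task> as the parent block and <taskbody> for the body, with each step wrapped in <step> and <cmd>. The id value and the wording of the steps below are hypothetical and only show the structure; they are not real installation instructions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE task PUBLIC "-//OASIS//DTD DITA Task//EN" "task.dtd">
<!-- Hypothetical example: the id and the step text are illustrative only -->
<task id="twlinux_install">
    <title>Installing TW Linux</title>
    <shortdesc>Install TW Linux on a computer.</shortdesc>
    <taskbody>
        <steps>
            <!-- Each step holds one command or action in a cmd element -->
            <step>
                <cmd>Boot the computer from the TW Linux installation media.</cmd>
            </step>
            <step>
                <cmd>Follow the prompts in the installer.</cmd>
            </step>
            <step>
                <cmd>Restart the computer when the installer finishes.</cmd>
            </step>
        </steps>
    </taskbody>
</task>
```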

Transforming the file with DITA OT

You don't need to purchase an expensive tool to process DITA files; you can use the open source DITA Open Toolkit ("DITA OT"). DITA OT is a command line processor written in Java, so you will also need to set the JAVA_HOME environment variable to point to a working Java installation on your system. On my system, I have a copy of Java installed in /opt/java/jre-17.0.8, so I set my JAVA_HOME like this:

export JAVA_HOME=/opt/java/jre-17.0.8

DITA OT provides a command line tool called dita that transforms DITA files into different kinds of output. In its most basic usage, you need to specify the input file and the output format, such as PDF or HTML:

$ dita -i file -f format

Or:

$ dita --input=file --format=format

DITA OT supports several output types, including html5, markdown, and pdf. Use the transtypes subcommand to get a list of all supported transformations:

$ dita transtypes
xhtml
htmlhelp
pdf
pdf2
eclipsehelp
html5
dita
markdown
markdown_github
markdown_gitbook

For example, if you saved the sample DITA Concept file as about.dita, you can process it into a PDF file like this:

$ dita -i about.dita -f pdf

If the DITA file doesn't contain any errors, DITA OT will create a PDF file in a new directory called out. To set a different output directory, use the -o option. For example, this command generates the output in the current directory (the . directory), where you ran the command:

$ dita -i about.dita -f pdf -o .

[Figure: screenshot of the DITA Concept file transformed to PDF. Caption: The transformation creates a PDF file]

DITA as a powerful documentation tool

DITA can be a powerful tool for technical writers. It excels at writing tasks where you need to reuse and remix content to create new types of documents. Because it uses XML as markup, DITA provides a flexible format for a variety of documents.

Jim Hall is an open source software advocate and technical writer. At work, Jim is CEO of Hallmentum, an IT executive consulting company that provides hands-on IT Leadership training, workshops, and coaching.