HTML5 is a format for a text file, which makes pretty stuff happen to the text when viewed in an HTML5 browser. For a complete explanation, see the full picture.

In this article I want to talk about the stuff that enriches the semantics of the text itself. I express my ideas in HTML5 and enrich the text to enhance and add meaning to my message. These elements directly enhance the HTML5 document experience at the smallest level. Some of these elements are not visibly distinct from one another by default, so I use CSS to magnify the distinction on my site or particular documents where appropriate.

I can have many expressive elements per document. As a best practice, within expressive elements I do not nest sections (nor paragraphs), lists (nor tables), silk (nor scripts), nor inputs (nor outputs). But nesting expressive elements within one another is often semantically appropriate, so I do it freely.

Terminology elements

Terminology elements set apart some particular kinds of terminology from the rest of the text.

a
An anchor a element allows me to add whole documents as a context clue or reference to enhance or change subjects. Setting the rel attribute can further enhance the expressive semantics of this element. Typically shown in underline or another color.
abbr
Its contents represent an abbreviation or acronym for an expanded term.
b
Its contents are special for utilitarian reasons but not to show importance, urgency, voice, or other emotion. Typically shown in bold font. Especially for showing key words (of a document abstract, in naming of products, actionable words in a text adventure, etc.).
dfn
Its contents represent the defining instance of a term. Typically shown in italic font style.
u
Its contents represent an unarticulated non-textual annotation, such as flagging a misspelled word. Typically shown as underline.
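Here is a minimal sketch of the terminology elements working together. The URL, the product name, and the sentences are made up for illustration.

```html
<p>
  <!-- a: links a whole document as a reference; rel describes the relationship -->
  See the <a href="https://example.com/glossary" rel="help">glossary</a>.
  <!-- dfn marks the defining instance; abbr carries the expansion in title -->
  <dfn><abbr title="Cascading Style Sheets">CSS</abbr></dfn> is the language
  that styles a document.
  <!-- b: a product name set apart without implying importance -->
  Try the new <b>BreadBox 3000</b> today.
  <!-- u: a non-textual annotation, here flagging a misspelling -->
  I baked a <u>lofe</u> of bread.
</p>
```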

Quotational elements

Quotational elements mark text regarding other creative works.

blockquote
Its contents represent a quotation from another source. A sectioning root element.
cite
Its contents represent the name of a creative work. Typically shown in italic font style. To quote a creative work use the q quote element instead.
q
Its contents represent a quotation shown with quotes and inline with surrounding text by default.
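A short sketch of the three quotational elements side by side. The book title, the quoted text, and the URL are hypothetical.

```html
<!-- blockquote: a block-level quotation; the cite attribute holds the source URL -->
<blockquote cite="https://example.com/bread-book">
  <p>Bread is the staff of life.</p>
</blockquote>

<!-- cite element: the title of the work; q: an inline quotation.
     The browser supplies the quotation marks for q, so none are typed here. -->
<p>
  In <cite>The Hypothetical Bread Book</cite> the author writes that
  <q cite="https://example.com/bread-book">a loaf should rest before slicing</q>.
</p>
```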

Computer elements

Computer elements set apart particular kinds of text as being involved in particular computer-related activities; namely coding, input, text editing, and output.

code
Its contents represent a fragment of computer code. Typically shown in monospace font. To be shown as code, HTML markup must be escaped.
kbd
Its contents represent user input typically from a keyboard, although representation for other input devices is allowed too. Typically shown in monospace font. Some special cases to consider:
  • A kbd within another kbd represents a single unit of input, such as the shift key.
  • A kbd within a samp represents input text being echoed from the system.
pre
Its contents represent preformatted text, where whitespace (including tabs and new lines) is preserved as part of the formatting. A new line immediately after the opening tag is ignored. The structure of the element follows typographic conventions. Wrapping a code element within a pre element is a way to directly copy code (non-xml) with minimal escaping.
samp
Its contents represent output from a program. Typically shown in monospace font. Some special cases to consider:
  • A samp within a kbd represents input based on system output, such as invoking a menu item.
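The nesting cases above can be sketched as follows. The key names, menu labels, and echoed text are only illustrative.

```html
<!-- kbd inside kbd: each inner kbd is one key of the combined input -->
<p>Save with <kbd><kbd>Ctrl</kbd>+<kbd>S</kbd></kbd>.</p>

<!-- samp inside kbd: input based on system output, such as a menu item -->
<p>Choose <kbd><samp>File</samp></kbd>, then <kbd><samp>Save</samp></kbd>.</p>

<!-- kbd inside samp: the system echoing what the user typed -->
<p><samp>You typed: <kbd>hello</kbd></samp></p>

<!-- code inside pre: whitespace is preserved; markup characters are escaped -->
<pre><code>&lt;p&gt;Hello,
   world&lt;/p&gt;</code></pre>
```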

Notational elements

Notational elements represent technical (usually mathematical) notation.

math
Its contents represent MathML. For example:
a = 1 5 5 a + b 2
sub
Its contents represent a subscript.
sup
Its contents represent a superscript.
var
Its contents represent a variable. Typically shown in italic font style.

I am frequently frazzled by the need to express simple math as I would on paper. Take simple fractions, for example. MathML, MathJax and other libraries are beautiful, but using them just for the occasional fraction betrays the simplicity of purpose. I was delighted to discover an elegant solution (link below) laid out plainly. It's called the fraction slash ⁄, a Unicode character (U+2044) available through the HTML character entity frasl. It looks like a regular slash but has reduced (negative) spacing on both ends. Combine it with the sup and sub elements to form a pretty fraction.

Lovely bundle of bread,

My caliper measures your height of 2 15⁄16", so get rested up. Your final training starts at 5 7⁄8".

I will remember that the frasl entity can deal with the frazzle of fractions.
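As a sketch, the note to the bread above might be marked up like this, assuming the numerator goes in sup and the denominator in sub:

```html
<!-- &frasl; is the named character reference for U+2044 FRACTION SLASH -->
<p>
  My caliper measures your height of
  2 <sup>15</sup>&frasl;<sub>16</sub>", so get rested up.
  Your final training starts at 5 <sup>7</sup>&frasl;<sub>8</sub>".
</p>
```

No script or library is needed; the browser's default sup and sub styling does the rest, and CSS can tighten the spacing if wanted.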

Ruby elements

Ruby elements support ruby annotations. Ruby for newbies:

  • Ruby assigns phonetic and/or semantic meaning to base characters.
  • The ruby text is typically shown above or below the matching base, but can be left or right of the base for vertical text.
  • Each rb ruby base element comes before the rt ruby text element.
  • The rb ruby base element and rt ruby text element can be interleaved as pairs, or else all the ruby base elements can be shown in sequence before the ruby text elements within a ruby element.
  • A rtc ruby text container element can introduce a second set of ruby text.

Ruby examples

漢(Kan)字(ji)
Orange/or-unj/ (the color) is my favorite.
The milk company encoded an expiration date on the bottle of 10 Month  31 Day  2002 Year
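The first two examples could be marked up roughly like this. This is a sketch that assumes the base characters of the first example are 漢字 and that the slashes in the second are rp fallback text.

```html
<!-- Interleaved pairs: each base is followed by its ruby text.
     rp supplies fallback parentheses for agents without ruby support. -->
<p><ruby>漢<rp>(</rp><rt>Kan</rt><rp>)</rp>字<rp>(</rp><rt>ji</rt><rp>)</rp></ruby></p>

<!-- A pronunciation guide on an English word, with slashes as the fallback -->
<p>
  <ruby>Orange<rp>/</rp><rt>or-unj</rt><rp>/</rp></ruby>
  (the color) is my favorite.
</p>
```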
ruby
Its contents represent a ruby annotation.
Ruby annotations are short runs of text presented alongside base text, primarily used in East Asian typography as a guide for pronunciation or to include other annotations. In Japanese, this form of typography is also known as furigana. Ruby text can appear on either side, and sometimes both sides, of the base text, and it is possible to control its position using CSS.
rb
Its contents represent a ruby base within a ruby annotation. Text directly included in a ruby element is implicitly considered an rb ruby base though this implied meaning is not present in the DOM.
rbc
Not part of the HTML5 standard. Its contents represent a ruby base container for a ruby annotation.
rp
Its contents represent fallback text, such as parentheses around the ruby text, to display in agents that do not understand ruby.
rt
Its contents represent a ruby text for a ruby annotation. Text directly included in an rtc element is implicitly considered an rt ruby text though this implied meaning is not present in the DOM.
rtc
Its contents represent a ruby text container for a ruby annotation.

Bidirectional elements

Bidirectional elements support bidirectional text formatting.

bdi
Isolates its contents from the directionality of the surrounding text.
bdo
Overrides the directionality of its contents, as specified by its dir attribute.
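A small sketch of both elements. The user names and post counts are invented for illustration.

```html
<!-- bdi: isolate a user-supplied name whose direction is unknown,
     so it cannot reorder the text that follows it -->
<ul>
  <li>User <bdi>jcranmer</bdi>: 12 posts.</li>
  <li>User <bdi>إيان</bdi>: 3 posts.</li>
</ul>

<!-- bdo: force a direction on its contents via the dir attribute -->
<p><bdo dir="rtl">This text is rendered right to left.</bdo></p>
```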

Tracking elements

Tracking elements expose editorial or machine readable information about their contents.

address
Its contents represent contact information for the nearest containing article or else the whole document. Do not mark anything other than the contact information of the author or editor with this element.
data
Annotates its contents with a machine readable equivalent, which is specified in the value attribute.
del
Its contents represent deleted content. Typically shown in strike-through.
ins
Its contents represent inserted content.
Despite potential confusion with a link, I like to underline insertions and strike through deletions via CSS. Strangely, I can style the color of the strike-through or underline separately from the text itself.
span element within del element
span element within ins element
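One way to get a line color that differs from the text color, presumably what the two span samples above demonstrate, is to color the del or ins element and then restore the text color on an inner span. The colors and the sentence here are my own choices for the sketch.

```html
<style>
  /* The line takes the color of the element that declares the decoration */
  del { color: red;   text-decoration: line-through; }
  ins { color: green; text-decoration: underline; }
  /* The inner span restores the text color; the line keeps the outer color */
  del span, ins span { color: black; }
</style>
<p>
  The loaf weighs <del><span>500</span></del> <ins><span>450</span></ins> grams.
</p>
```

The CSS text-decoration-color property can achieve the same effect without the extra span.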
time
Its contents represent a human readable timestamp or duration with a machine readable equivalent in the datetime attribute.

Emotion elements

Emotion elements set apart particular text from the rest of the text. I observe that the semantics of these elements are easily confused, and so they are more likely to be misused. Some expressive elements are indistinguishable from one another under default styling, so it's best to consider enhancing styles to clarify necessary distinctions among them.

em
Its contents carry stress emphasis. Typically shown in italic font style. I can nest em elements to raise the level of emphasis, so the styling varies to suit the need (per usual).
i
Its contents represent an alternate voice, language or mood from surrounding text. Typically shown in italic font style.
mark
Its contents are relevant to the current context such as highlighting a part of a quote for discussion, or for showing particular relevance to the user's current activity. Typically shown with highlight.
s
Its contents are inaccurate or no longer relevant. Typically shown with strike-through. The s element is not to be confused with the del element, which marks an edit to the document.
small
Its contents are side comments. Typically shown as a smaller font.
span
For other formatting, other language, or other isolation of its contents. Has no implicit meaning without attributes assigned.
strong
Its contents are important, serious, or urgent. Typically shown in bold font.
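Because several of these look alike by default, a side-by-side sketch may help. The sentences and prices are invented.

```html
<p>
  <!-- strong: importance; em: stress emphasis -->
  <strong>Warning:</strong> the oven is <em>very</em> hot.
  <!-- i: another language, marked with lang -->
  The French call this loaf a <i lang="fr">bâtard</i>.
  <!-- s: no longer accurate; small: a side comment -->
  <s>Sale price: $3.</s> Now $2. <small>While supplies last.</small>
</p>
<!-- mark: relevance to the user's current activity, here a search term -->
<p>Results for "rye": fresh <mark>rye</mark> bread, baked daily.</p>
```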

Expressive attributes

Besides these attributes, elements also may have Global Attributes and Event Attributes.

cite
May apply to q and blockquote elements. A URL to the source being quoted.
May apply to ins and del elements. A URL to more information about the edit.
class
Global attribute that may apply specially to code elements. May specify the code language, such as marking C++ code with class=code-cpp.
Global attribute that may apply specially to i elements. May specify another mood or voice being used, such as marking a dream sequence with class=dream-sequence.
data-abc-xyz
Global attributes that may apply to any HTML5 element. Any attribute beginning with data- is called a custom attribute, which can be attached to any element as non-visible data. The part of the name after data- is converted to camel case, so the data in data-abc-xyz can be accessed in JavaScript via element.dataset.abcXyz.
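A minimal sketch of the name conversion. The id and the value are placeholders.

```html
<p id="loaf" data-abc-xyz="42">A loaf of bread.</p>
<script>
  // data-abc-xyz is exposed as dataset.abcXyz; values are always strings
  const loaf = document.getElementById("loaf");
  console.log(loaf.dataset.abcXyz); // logs "42"
</script>
```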
datetime
May apply to ins, del and time elements. Machine readable date, time and/or timezone for the element. On a time element, if the attribute is missing, the element.textContent is used instead. For example, a fully qualified date, time and timezone may be specified:
  • <time>2015-06-04T13:40:28.444Z</time>
but each portion is also acceptable:
  • <time>2015-06-04</time>
  • <time>13:40:28.444</time>
  • <time>Z</time>
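When the visible text is not itself machine readable, the datetime attribute carries the machine readable form. A sketch with made-up sentences:

```html
<!-- Human readable contents, machine readable datetime attribute -->
<p>
  The bread went in on
  <time datetime="2015-06-04T13:40:28.444Z">June 4th, 2015 at 13:40 UTC</time>.
</p>

<!-- On ins and del, datetime records when the edit was made -->
<p><ins datetime="2015-06-04T13:40:28.444Z">Preheat the oven first.</ins></p>
```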
dir
Global attribute that may apply specially to bdi and bdo elements. Sets the direction of text.
lang
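Global attribute that may apply specially to bdi and bdo elements. With dir=auto the direction is determined from the contents themselves (the first strongly directional character), not from lang, but the language can still inform rendering details such as font selection.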
Global attribute that may apply specially to i elements to mark another language being used.
title
Global attribute that may apply specially to abbr elements to specify the long form of the abbreviation. Not inherited from the parent.
Global attribute that may apply specially to dfn elements to specify the exact term being defined. If missing, and the dfn contains exactly one child element and no text of its own, and that child is an abbr element with a title attribute, then the default is that attribute value. Otherwise, the default is the dfnElement.textContent. Not inherited from the parent.
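The three ways a dfn names its term can be sketched like this. The definitions are invented for illustration.

```html
<!-- title on dfn: the term being defined is "loaf", not the plural shown -->
<p><dfn title="loaf">Loaves</dfn> are shaped masses of baked bread.</p>

<!-- no title on dfn, a lone abbr child with a title:
     the term is the abbr title, "Hypertext Markup Language" -->
<p><dfn><abbr title="Hypertext Markup Language">HTML</abbr></dfn> is a
  format for a text file.</p>

<!-- neither: the term is the text content, "crumb" -->
<p>The <dfn>crumb</dfn> is the soft interior of a loaf.</p>
```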
value
May apply to data elements. Machine readable equivalent of the element.

YES!! I have survived this exposé on how to express in HTML5 documents, and I hope you have too.