Markup

This document outlines the way we write markup and why. See our house style document to understand the rationale behind enforcement of style.

- HTML vs XHTML
- Semantics
  - Follow the HTML 4 outline model for heading levels
- Templating language

HTML vs XHTML

HTML up to 3.2 was written with SGML syntax rules. HTML 4 could be written with either SGML syntax (HTML 4.01) or XML (XHTML 1.0). HTML 5 breaks with SGML and has two serializations, a new one called "html" (which looks like SGML but is not an application of SGML) and again XML (XHTML).

We author HTML 5

When a document is transmitted with an XML MIME type, such as application/xhtml+xml, this is intended to instruct the client to render using an XML parser, and popular contemporary clients honour this. See HTML vs XHTML on w3.org.

One of the benefits of XML is its strictness -- the flip-side of this benefit is that XML is not designed to fail safely. The internet, the web, and typical web clients are however designed to fail safely. This is a philosophical difference with practical implications.

In professional web development, we typically have to rely on 3rd-party code which is executed in our domain but over which we have no real control (advertisements, tracking code, resources served from 3rd-party CDNs, etc). Additionally, accidents happen and broken code goes live. So we must think defensively and cannot rely on the validity of our documents. When XHTML is served with an application/xhtml+xml MIME type, any well-formedness error is fatal to the document parser and the result is a completely unusable page.
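
To make the difference concrete, here is a small illustrative fragment of the kind of markup a third-party script might inject. It is a hypothetical example, not taken from any real vendor. An HTML parser recovers from each problem and still renders the text, whereas an XML parser would treat each one as a well-formedness error and abort.

```html
<!-- Hypothetical third-party fragment: every problem below is recoverable
     when parsed as text/html but fatal under application/xhtml+xml. -->
<div class="promo">
    <!-- Unescaped ampersand: XML requires &amp; here -->
    <p>Terms & conditions apply</p>
    <!-- Void element with no closing slash: XML requires the tag to be closed -->
    <p>Line one<br>Line two</p>
    <!-- Misnested tags: the HTML parser repairs the nesting, XML rejects it -->
    <p><b>bold <i>both</b> italic</i></p>
</div>
```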

So for business purposes it is impractical to serve XHTML with anything other than a text/html MIME type. This means the client cannot use an XML parser and falls back to a regular HTML parser, so any XML benefits are lost to the client. Additionally, it introduces cognitive dissonance (a document which is an application of XML is not treated as such), we serve a heavier page than the HTML 5 equivalent for no client-side benefit, and we lose some HTML features with XHTML, such as NOSCRIPT (which still appears in 3rd-party code).

It is not impossible to imagine a back-end markup-generation system that only outputs XHTML, but as any valid XHTML document is essentially expressible in HTML (the reverse is not true) this should be seen as a shortcoming in the markup generator, at least as far as the client is concerned.

As such it seems beneficial to choose HTML 5 over XHTML.
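
As a minimal sketch of what this choice looks like in practice, the document below uses the "html" serialization and is meant to be served as text/html. The title and body text are placeholders, not content from any real page.

```html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Example page</title>
</head>
<body>
    <h1>Example page</h1>
    <p>Served with the text/html MIME type.</p>
    <noscript>
        <p>This content is shown when scripting is unavailable.</p>
    </noscript>
</body>
</html>
```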

Validation

Do what is reasonable to ensure your documents validate.

Invalid markup results in authors relying on all clients to do what they intended, not what they instructed. Clients regularly disagree on how to interpret our instructions, let alone our intentions, so to promote consistency of execution you must do what you can to ensure your HTML is valid.

Additionally, invalid HTML violates WCAG 2.1 Success Criterion 4.1.1 Parsing (Level A), and areas where your product falls short must be noted on your product's VPAT, if it has one. It's probably easier just to make sure your HTML is valid in the first place.

Do not omit optional end tags

While the HTML 4 & 5 specs allow some end tags to be omitted for some non-void elements, it's unreasonable to expect authors to remember which elements this applies to. Additionally, this kind of inconsistency can play havoc with text editors and any indentation policy, and it impairs our ability to spot bugs that would otherwise offend our pattern-recognition powers.

We do this:

<ol>
    <li>foo</li>
    <li>bar</li>
</ol>

We don't do this:

<ol>
    <li>foo
    <li>bar
</ol>

Implicitly close void elements

Explicitly closing void elements (e.g. <br />) is not invalid in HTML 5. However, explicitly closing void elements is an artifact of XML serialization, not HTML 5's "html" serialization, so may introduce cognitive dissonance in authors given that we are authoring HTML 5. Additionally, explicitly closing void elements was invalid in older versions of HTML.

Conversely, omitting the closing slash may trip up authors more used to writing XML, or HTML with an XML serialization. However, as far as machines are concerned, the extra characters increase page weight with zero benefit.

We do this:

<meta charset="UTF-8">

We don't do this:

<meta charset="UTF-8" />

Closing elements in embedded documents

Sometimes we need to embed content from different markup languages in HTML, such as SVG or MathML.

When embedding these markup languages in HTML, there is no need to specify a namespace on the root element as it will be provided by the HTML parser. See also the SVG specs and Using MathML.

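Because the namespace is provided by the HTML parser, we omit the xmlns attribute. SVG and MathML elements are not void elements, however: the HTML parser leaves an unclosed circle open until its parent closes, so any sibling that follows would be nested inside it. We therefore give these elements an explicit end tag.

We do this:

<svg viewBox="0 0 100 100">
  <circle cx="50" cy="50" r="50" fill="green"></circle>
</svg>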

We don't do this:

<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100">
  <rect x="10" y="10" width="80" height="80" fill="green" />
</svg>

Conversely, if you were authoring for an XML document (including XHTML, EPUB) you would include the namespace and would therefore have to explicitly close void elements.

Form submit controls

In modern browsers, button type="submit" provides a superset of functionality when compared to input type="submit".

button elements are much easier to style than input elements. You can add inner HTML content (think <em>, <strong>, or even <img>), and use ::after and ::before pseudo-elements to achieve complex rendering while input only accepts a textual value attribute.
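
A minimal sketch of the kind of rendering this allows is shown below. The class name, control name, and value are hypothetical and chosen only for illustration.

```html
<!-- In the document head -->
<style>
    /* Decorative arrow appended with a pseudo-element */
    .next-step::after {
        content: " \2192";
    }
</style>

<!-- In the document body -->
<form>
    <button class="next-step" type="submit" name="step" value="payment">
        <strong>Continue</strong> <em>to payment</em>
    </button>
</form>
```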

Note: Submitting is the button element's default behaviour, but we want to "code for humans". Therefore do not omit type="submit".

Make sure to specify name and value attributes to maximize browser compatibility (see IE11-Only Submit Bug on Stack Overflow).

We do this:

<form>
	<button type="submit" name="myButton" value="foo">Continue</button>
</form>

We don't do this:

<form>
	<input type="submit" value="Continue">
</form>

We also don't do this:

<form>
	<button>Continue</button>
</form>

Caution: Internet Explorer has a shortcoming when using the button element: it submits the text within the button element as its value in the form data, rather than the value of its value attribute. This becomes problematic when a single form has multiple submit buttons, each with a different value and purpose.

Here are some suggestions to work around this:

- Redesign the interface so multiple submit buttons are not required.
  - (Preferred) Use a multi-page form, as recommended by GOV.UK in Structuring forms, Start with one thing per page.
  - Replace them with radio buttons and a single submit button (this requires an extra click from the user, though).
- Use a separate form for each instance, with a hidden input providing the data the submit button would normally carry. This can be a good solution when you have a simple “Delete this row” problem.
- Last resort: hide the value inside the name of the control. The business logic then needs to loop over the names in the form data. Avoid this as much as you can, as it adds complexity to the code and may degrade performance.
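
Two of these workarounds can be sketched as follows. The endpoints (/rows/delete, /rows/update) and the field names (rowId, choice, action) are hypothetical; adapt them to your own application.

```html
<!-- Option: one form per row, with a hidden input carrying the row id. -->
<ul>
    <li>
        Row 1
        <form method="post" action="/rows/delete">
            <input type="hidden" name="rowId" value="1">
            <button type="submit" name="action" value="delete">Delete this row</button>
        </form>
    </li>
    <li>
        Row 2
        <form method="post" action="/rows/delete">
            <input type="hidden" name="rowId" value="2">
            <button type="submit" name="action" value="delete">Delete this row</button>
        </form>
    </li>
</ul>

<!-- Option: radio buttons plus a single submit button. -->
<form method="post" action="/rows/update">
    <fieldset>
        <legend>What would you like to do?</legend>
        <label><input type="radio" name="choice" value="save" checked> Save</label>
        <label><input type="radio" name="choice" value="discard"> Discard</label>
    </fieldset>
    <button type="submit" name="action" value="apply">Continue</button>
</form>
```

In both sketches the server reads the decision from rowId or choice, so it no longer matters which string the browser sends for the button itself.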

Semantics

Follow the HTML 4 outline model for heading levels

When HTML 5 was first announced, it brought with it a plethora of new semantic elements and an entirely new document outline model which leveraged them to provide meaning. This was intended to supplant the HTML 4 model in which the heading level (h1 - h6) was used to imply the structure of the HTML document. See Using HTML sections and outlines for more information.

Unfortunately, this new document outline model was never well supported by browsers or assistive technologies. We considered sticking with the HTML 5 document outline model used in conjunction with setting ARIA attributes (role="heading" aria-level="3") but this rather defeated any benefit from using the more localised heading levels, and was also poorly supported in some key assistive technologies.

The HTML 5 document outline algorithm specification has been removed in HTML 5.1. You should continue to use the older HTML 4 style heading levels in conjunction with the new HTML 5 semantic elements. Do not use multiple h1 headings on a single page.

See the HTML specification for the current recommendation for use in all HTML 5 documents.

We do this:

<h1>Page title</h1>

<article>
	<h2>Article title</h2>
	<h3>Article subtitle</h3>
</article>

<footer>
	<h2>Footer title</h2>
</footer>

We don't do this:

<h1>Page title</h1>

<article>
	<h1>Article title</h1>
	<h2>Article subtitle</h2>
</article>

<footer>
	<h1>Footer title</h1>
</footer>

Templating language

Templates should be written using Handlebars.

Handlebars works in the browser and in Node, and has been ported to the JVM-based languages we use server-side.

This enables us to share the same templates across the variety of environments we commonly use. Handlebars is also a well-documented, stable technology with "good enough" compatibility between implementations (although Handlebars helpers are an exception).
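
As an illustration, the template below is a minimal sketch that sticks to built-in block expressions (each and if) and uses no custom helpers, on the assumption that this subset is the most likely to behave the same across implementations. The property names (results, url, title, summary) are hypothetical.

```handlebars
<ul class="results">
    {{#each results}}
        <li>
            <a href="{{url}}">{{title}}</a>
            {{#if summary}}
                <p>{{summary}}</p>
            {{/if}}
        </li>
    {{/each}}
</ul>
```

Any custom helper such a template needed would have to be implemented once per environment, which is where differences between implementations tend to surface.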
