HTML/XHTML FAQ

The Flask documentation and example applications use HTML5. You may notice that in many situations, when end tags are optional they are not used, so that the HTML is cleaner and faster to load. Because there is much confusion about HTML and XHTML among developers, this document tries to answer some of the major questions.

History of XHTML

For a while, it appeared that HTML was about to be replaced by XHTML. However, barely any websites on the Internet are actual XHTML (which is HTML processed using XML rules). There are a couple of major reasons why this is the case. One of them is Internet Explorer’s lack of proper XHTML support. The XHTML spec states that XHTML must be served with the MIME type application/xhtml+xml, but Internet Explorer refuses to read files with that MIME type. While it is relatively easy to configure Web servers to serve XHTML properly, few people do. This is likely because properly using XHTML can be quite painful.
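
To make "serving XHTML properly" concrete, here is a minimal sketch of a WSGI application that sends the MIME type named above. It uses only the Python standard library; the names `XHTML_PAGE` and `xhtml_app` are made up for this illustration and are not part of Flask.

```python
# A hypothetical, minimal WSGI application that serves one XHTML page
# with the MIME type the XHTML spec asks for.
XHTML_PAGE = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Hello</title></head>"
    "<body><p>Hello<br/>XHTML</p></body>"
    "</html>"
).encode("utf-8")


def xhtml_app(environ, start_response):
    """Return the page with an application/xhtml+xml Content-Type."""
    headers = [
        ("Content-Type", "application/xhtml+xml; charset=utf-8"),
        ("Content-Length", str(len(XHTML_PAGE))),
    ]
    start_response("200 OK", headers)
    return [XHTML_PAGE]
```

For a quick local experiment, the application could be run with `wsgiref.simple_server.make_server("localhost", 8000, xhtml_app).serve_forever()`.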

One of the most important causes of pain is XML’s draconian (strict and ruthless) error handling. When an XML parsing error is encountered, the browser is supposed to show the user an ugly error message, instead of attempting to recover from the error and display what it can. Most of the (X)HTML generation on the web is based on non-XML template engines (such as Jinja, the one used in Flask) which do not protect you from accidentally creating invalid XHTML. There are XML based template engines, such as Kid and the popular Genshi, but they often come with a larger runtime overhead and are not as straightforward to use because they have to obey XML rules.
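
The difference between draconian and lenient handling can be sketched with Python's standard library. This is only an analogy: `xml.etree.ElementTree` stands in for an XML parser and `html.parser` for a tolerant HTML tokenizer (it is not a full browser-grade HTML5 parser). The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# <br> is never closed, which is fine in HTML but a fatal error in XML.
MARKUP = "<p>Hello<br>world</p>"


def parses_as_xml(markup):
    """Return True if the markup is well-formed XML, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


class TagCollector(HTMLParser):
    """Record every start tag the tolerant HTML tokenizer reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tags_seen_as_html(markup):
    """Feed markup to the tolerant parser and return the tags it saw."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags
```

With these helpers, `parses_as_xml(MARKUP)` reports a failure, while `tags_seen_as_html(MARKUP)` still yields the `p` and `br` tags.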

The majority of users, however, assumed they were properly using XHTML. They wrote an XHTML doctype at the top of the document and self-closed all the necessary tags (<br> becomes <br/> or <br></br> in XHTML). However, even if the document properly validates as XHTML, what really determines XHTML/HTML processing in browsers is the MIME type, which as said before is often not set properly. So the valid XHTML was being treated as invalid HTML.
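
As a rough sketch of that rule, the function below picks a processing mode from the Content-Type header alone and never looks at the doctype. It is a deliberate simplification for illustration (real browsers recognize more XML media types), and `processing_mode` is a hypothetical name.

```python
def processing_mode(content_type):
    """Pick a parser family from a Content-Type header value.

    Simplified on purpose: only the two media types discussed in this
    document are recognized; everything else is reported as "other".
    """
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/xhtml+xml":
        return "xml"
    if media_type == "text/html":
        return "html"
    return "other"
```

Under this model, a document with a flawless XHTML doctype that arrives as `text/html` is still handled by the HTML parser.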

XHTML also changed the way JavaScript is used. To properly work with XHTML, programmers have to use the namespaced DOM interface with the XHTML namespace to query for HTML elements.
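
The same namespace requirement appears in any XML-aware API, not only in browser JavaScript. The sketch below is a Python analogue using `xml.etree.ElementTree` rather than the browser DOM; the document string and the helper name are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

XHTML_DOC = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<body><p>First</p><p>Second</p></body>"
    "</html>"
)


def count_paragraphs(document):
    """Count <p> elements with and without the XHTML namespace."""
    root = ET.fromstring(document)
    # A plain tag name does not match elements in the XHTML namespace.
    plain = len(list(root.iter("p")))
    # The namespace-qualified name does match them.
    namespaced = len(list(root.iter("{%s}p" % XHTML_NS)))
    return plain, namespaced
```

Calling `count_paragraphs(XHTML_DOC)` shows that only the namespace-qualified lookup finds the paragraphs, which is the extra step that XHTML-aware scripts have to take.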

History of HTML5

Development of the HTML5 specification was started in 2004 under the name “Web Applications 1.0” by the Web Hypertext Application Technology Working Group, or WHATWG (which was formed by the major browser vendors Apple, Mozilla, and Opera) with the goal of writing a new and improved HTML specification, based on existing browser behavior instead of unrealistic and backwards-incompatible specifications.

For example, in HTML4 <title/Hello/ theoretically parses exactly the same as <title>Hello</title>. However, since people were using XHTML-like tags along the lines of <link />, browser vendors implemented the XHTML syntax over the syntax defined by the specification.

In 2007, the specification was adopted as the basis of a new HTML specification under the umbrella of the W3C, known as HTML5. Currently, it appears that XHTML is losing traction, as the XHTML 2 working group has been disbanded and HTML5 is being implemented by all major browser vendors.

HTML versus XHTML

The following table gives you a quick overview of features available in HTML 4.01, XHTML 1.1 and HTML5. (XHTML 1.0 is not included, as it was superseded by XHTML 1.1 and the barely-used XHTML5.)

Feature                                     HTML 4.01   XHTML 1.1   HTML5
------------------------------------------  ----------  ----------  --------
<tag/value/ == <tag>value</tag>             Yes [1]     No          No
<br/> supported                             No          Yes         Yes [2]
<script/> supported                         No          Yes         No
should be served as text/html               Yes         No [3]      Yes
should be served as application/xhtml+xml   No          Yes         No
strict error handling                       No          Yes         No
inline SVG                                  No          Yes         Yes
inline MathML                               No          Yes         Yes
<video> tag                                 No          No          Yes
<audio> tag                                 No          No          Yes
New semantic tags like <article>            No          No          Yes

[1] This is an obscure feature inherited from SGML. It is usually not supported by browsers, for reasons detailed above.

[2] This is for compatibility with server code that generates XHTML for tags such as <br>. It should not be used in new code.

[3] XHTML 1.0 is the last XHTML standard that allows documents to be served as text/html for backwards compatibility reasons.

What does “strict” mean?

HTML5 has strictly defined parsing rules, but it also specifies exactly how a browser should react to parsing errors - unlike XHTML, which simply states parsing should abort. Some people are confused by apparently invalid syntax that still generates the expected results (for example, missing end tags or unquoted attribute values).

Some of these work because of the lenient error handling most browsers use when they encounter a markup error; others are actually specified. The following constructs are optional in HTML5 by standard, but have to be supported by browsers:

  • Wrapping the document in an <html> tag

  • Wrapping header elements in <head> or the body elements in <body>

  • Closing the <p>, <li>, <dt>, <dd>, <tr>, <td>, <th>, <tbody>, <thead>, or <tfoot> tags.

  • Quoting attributes, so long as they contain no whitespace or special characters (like <, >, ', or ").

  • Requiring boolean attributes to have a value.

This means the following page in HTML5 is perfectly valid:

    <!doctype html>
    <title>Hello HTML5</title>
    <div class=header>
      <h1>Hello HTML5</h1>
      <p class=tagline>HTML5 is awesome
    </div>
    <ul class=nav>
      <li><a href=/index>Index</a>
      <li><a href=/downloads>Downloads</a>
      <li><a href=/about>About</a>
    </ul>
    <div class=body>
      <h2>HTML5 is probably the future</h2>
      <p>
        There might be some other things around but in terms of
        browser vendor support, HTML5 is hard to beat.
      <dl>
        <dt>Key 1
        <dd>Value 1
        <dt>Key 2
        <dd>Value 2
      </dl>
    </div>
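
To see how a tolerant parser reads two of the optional constructs listed above (unquoted attribute values and boolean attributes without a value), here is a small sketch using Python's `html.parser`. That module is a lenient tokenizer rather than a complete HTML5 parser, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Record each start tag together with its attributes."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; a boolean attribute
        # written without a value is reported with the value None.
        self.seen.append((tag, dict(attrs)))


def collect_attributes(markup):
    """Return the (tag, attributes) pairs found in the markup."""
    collector = AttributeCollector()
    collector.feed(markup)
    collector.close()
    return collector.seen
```

For example, `collect_attributes("<div class=header><input type=checkbox checked>")` reports `class` as `"header"` even though it is unquoted, and reports `checked` with the value `None`.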

New technologies in HTML5

HTML5 adds many new features that make Web applications easier to write and to use.

  • The <audio> and <video> tags provide a way to embed audio and video without complicated add-ons like QuickTime or Flash.

  • Semantic elements like <article>, <header>, <nav>, and <time> that make content easier to understand.

  • The <canvas> tag, which supports a powerful drawing API, reducing the need for server-generated images to present data graphically.

  • New form control types like <input type="date"> that allow user agents to make entering and validating values easier.

  • Advanced JavaScript APIs like Web Storage, Web Workers, Web Sockets, geolocation, and offline applications.

Many other features have been added as well. A good guide to new features in HTML5 is Mark Pilgrim’s soon-to-be-published book, Dive Into HTML5. Not all of these features are supported in browsers yet, however, so use caution.

What should be used?

Currently, the answer is HTML5. There are very few reasons to use XHTML considering the latest developments in Web browsers. To summarize the reasons given above:

  • Internet Explorer (which, sadly, currently leads in market share) has poor support for XHTML.

  • Many JavaScript libraries also do not support XHTML, due to the more complicated namespacing API it requires.

  • HTML5 adds several new features, including semantic tags and the long-awaited <audio> and <video> tags.

  • It has the support of most browser vendors behind it.

  • It is much easier to write, and more compact.

For most applications, it is undoubtedly better to use HTML5 than XHTML.