Half Notes

eXtensible Markup Language

XML (Extensible Markup Language) is a W3C standard for text document markup.

I always skip the chapter on XML. This entry is my attempt to keep doing that while still grasping why things are done the way they are in the world of web document authoring.

A Definition

eXtensible Markup Language, or XML, is not a language in the way HTML is, but rather a set of rules for creating other markup languages. This makes it a metalanguage, a language for describing other languages, which lets you design your own markup languages for different types of documents and gives some insight into the “eXtensible” part of its name. XML can do this because it is a simplified subset of SGML, the international standard metalanguage for text document markup (ISO 8879).

Elements and Structure

The most significant thing about XML is that it offers semantic markup and document structure using elements such as <dog>Lassie</dog>. The tags <dog> and </dog> add meaning to Lassie for humans and machines alike. Elements can contain other elements, which contain yet more elements, and together these give a document its structure:


<?xml version="1.0"?>
<movie>
  <title>Lassie Come Home</title>
  <year>1943</year>
  <plot>Hard times come for the Carraclough family, and they are forced to sell Lassie to the rich Duke of Rudling.</plot>
  <cast>
    <human>Roddy McDowall</human>
    <dog>Lassie</dog>
    <!-- more cast members added here -->
  </cast>
  <!-- more movie details added here -->
</movie>

Of note is that this representation is both text and data, so it can be stored in a database or as plain text. This means that XML documents are not tied to a proprietary format or device that may become obsolete, and they can be easily shared between otherwise incompatible systems.
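To make the “both text and data” point concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree module. The function name summarize_movie and the trimmed-down copy of the movie document are my own choices for illustration; treat this as one possible way to read the document, not the canonical one.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the movie document above (plot and comments left out).
MOVIE_XML = """<?xml version="1.0"?>
<movie>
  <title>Lassie Come Home</title>
  <year>1943</year>
  <cast>
    <human>Roddy McDowall</human>
    <dog>Lassie</dog>
  </cast>
</movie>"""


def summarize_movie(xml_text):
    """Parse the plain-text XML and return its content as ordinary Python data."""
    root = ET.fromstring(xml_text)  # root is the <movie> element
    return {
        "title": root.findtext("title"),
        "year": int(root.findtext("year")),
        # Each child of <cast> carries its meaning in its tag name.
        "cast": [(member.tag, member.text) for member in root.find("cast")],
    }


if __name__ == "__main__":
    print(summarize_movie(MOVIE_XML))
```

The same text file could just as easily be read by a program in another language or on another system, which is the portability argument in a nutshell.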

Also of note is that XML documents may be used for all sorts of content, not just Lassie movies. Some XML languages use a Document Type Definition (DTD) that defines which elements may be used in the document.

See Also

  • W3C (2000). XML Schema. XML Schemas offer a method for defining XML elements and document structure.
  • W3C (2006). The Extensible Stylesheet Language Family (XSL). Markup languages describe the structure, not the presentation, of a document. Like HTML, XML documents can use Cascading Style Sheets for presentation (fast and preferable) or Extensible Stylesheet Language (slow but sometimes necessary).

Well-Formedness

This is an important distinction to make before rushing to validity: An XML document must be well-formed, and should be valid, but validity is not essential.

Well-formed documents comply with the XML rules for marking up a document, regardless of the specific language. For example, all elements must be correctly nested and may not overlap. Valid documents are both well-formed and compliant with the rules set for a particular XML language. So, in XHTML it is invalid to put a body element inside a link element, even if it is perfectly nested.
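The difference can be sketched in a few lines of Python. The well-formedness check simply asks the standard library parser whether it can read the text. The validity check is a toy: ALLOWED_CHILDREN is a hypothetical rule table I made up for the Lassie example, standing in for what a real DTD or schema would declare, and a real validator checks far more than this.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules for the movie vocabulary: parent tag -> allowed child tags.
# A real XML language would state these in a DTD or schema instead.
ALLOWED_CHILDREN = {
    "movie": {"title", "year", "plot", "cast"},
    "cast": {"human", "dog"},
}


def is_well_formed(xml_text):
    """True if the text obeys the generic XML syntax rules."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def is_valid(xml_text):
    """Toy validity check: well-formed AND every child is allowed by the rules."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    if root.tag != "movie":
        return False
    for parent in root.iter():  # the root and all of its descendants
        allowed = ALLOWED_CHILDREN.get(parent.tag, set())
        for child in parent:
            if child.tag not in allowed:
                return False
    return True


OVERLAPPING = "<movie><title>Lassie</movie></title>"            # tags overlap
MISPLACED = "<movie><dog><cast>Lassie</cast></dog></movie>"     # nested fine, wrong place
GOOD = "<movie><title>Lassie Come Home</title><cast><dog>Lassie</dog></cast></movie>"

if __name__ == "__main__":
    for name, text in [("overlapping", OVERLAPPING), ("misplaced", MISPLACED), ("good", GOOD)]:
        print(name, is_well_formed(text), is_valid(text))
```

The middle case is the interesting one: it is perfectly nested, so it is well-formed, but it breaks the rules of this particular vocabulary, so it is not valid.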

Of note to authors: browsers may still be able to render sloppy, error-ridden HTML, but they cannot do so with XML documents, because an XML parser must stop at the first well-formedness error.

See Also

  • Sall, K. (2000). XML Software Guide: XML Parsers. There are hundreds of explicit criteria for creating well-formed XML documents, many of them common sense. It is always a good idea to check the syntax of your document using one of the well-formedness checkers listed at the Web Developer’s Virtual Library.
  • Eisenberg, J.D. (2001). How to Read W3C Specs. Learning to read a DTD (they begin with <!DOCTYPE ...>) is not easy, but it is worthwhile if you spend any time authoring XML documents, because the DTD is the ultimate authority on what is and is not syntactically correct for a particular markup language. He also talks about namespaces, which allow you to use elements from different XML applications in the same document.
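As a rough illustration of the namespace idea, the sketch below mixes XHTML and SVG elements in one document and groups the element names by the namespace they belong to. The sample document and the helper name tags_by_namespace are my own assumptions for the example; only the two namespace URIs are the real W3C ones.

```python
import xml.etree.ElementTree as ET

# One document, two vocabularies: XHTML as the default namespace, SVG via a prefix.
MIXED_XML = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:svg="http://www.w3.org/2000/svg">
  <body>
    <p>A circle:</p>
    <svg:svg width="20" height="20">
      <svg:circle cx="10" cy="10" r="8"/>
    </svg:svg>
  </body>
</html>"""


def tags_by_namespace(xml_text):
    """Group element local names by the namespace URI they belong to."""
    groups = {}
    for element in ET.fromstring(xml_text).iter():
        # ElementTree spells a qualified tag as "{namespace-uri}localname".
        if element.tag.startswith("{"):
            uri, _, local = element.tag[1:].partition("}")
        else:
            uri, local = "", element.tag  # element in no namespace
        groups.setdefault(uri, []).append(local)
    return groups


if __name__ == "__main__":
    for uri, names in tags_by_namespace(MIXED_XML).items():
        print(uri, names)
```

The prefix (svg: here) is only shorthand inside the file; what actually distinguishes the two vocabularies is the URI each element is bound to.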

XML on the Web

This is a list of the XML languages that are relevant to the Web. For now, they are just placeholders; but I want to delve into some of these in more detail at some point.
