Other XML components
Elements and attributes are the principal XML components. However, you can insert more into an XML document, namely:
- comments. A comment starts with <!-- and ends with -->. Comments can be used anywhere, even outside the document element, but may not appear inside tags or inside other comments. Comments are intended for human readers. Do not write documents that depend on the contents of comments;
- processing instructions. Processing instructions contain information for applications that may read the document, not for human readers. They begin with <? and end with ?>. Following the <? is an XML name called the target, possibly the name of the application for which the instruction is intended. The rest of the instruction contains text in a format appropriate for the application. Processing instructions can be used anywhere, even outside the document element, but may not appear inside tags.
The most common processing instruction is xml-stylesheet, used to attach stylesheets to XML documents. Here is an example:
<?xml-stylesheet type="text/css" href="recipe.css"?>
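As a rough illustration of how an application might see comments and processing instructions, here is a minimal Python sketch using the standard xml.dom.minidom module. The sample document (a recipe element with a name child) and the helper name list_comments_and_pis are assumptions made for this example only.

```python
from xml.dom import minidom

# Hypothetical sample document: a processing instruction and a comment
# before the document element, plus one comment inside it.
DOC = """<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="recipe.css"?>
<!-- outside the document element -->
<recipe><!-- inside the document element --><name>Espresso</name></recipe>"""


def list_comments_and_pis(xml_text):
    """Return (kind, target, data) tuples in document order."""
    found = []

    def walk(node):
        if node.nodeType == node.COMMENT_NODE:
            found.append(("comment", None, node.data.strip()))
        elif node.nodeType == node.PROCESSING_INSTRUCTION_NODE:
            # target is the XML name after "<?"; data is the rest of the text
            found.append(("pi", node.target, node.data))
        for child in node.childNodes:
            walk(child)

    walk(minidom.parseString(xml_text))
    return found


for item in list_comments_and_pis(DOC):
    print(item)
```

The XML declaration on the first line is not reported as a processing instruction by this parser, so only the xml-stylesheet instruction and the two comments are listed.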
- character data sections. Character data (CDATA) sections are blocks of character data that are treated as raw text. That is, if markup is present in a CDATA section, it is treated not as markup but as character data. A CDATA section starts with <![CDATA[ and ends with ]]>. CDATA sections may be placed anywhere character data can appear inside an element, but not outside the document element. They exist for the convenience of human authors, not for programs. An example follows:
<paragraph>
SVG is an XML encoding of line art. For instance, the
following encodes an ellipse and a rectangle:
</paragraph>
<![CDATA[
<svg width="12cm" height="10cm">
<ellipse rx="110" ry="130"/>
<rect x="4cm" y="1cm" width="3cm" height="6cm"/>
</svg>
]]>
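To check that a parser really hands the content of a CDATA section to the application as plain text, the following Python sketch uses the standard xml.etree.ElementTree module. The wrapper element named example and the shortened SVG fragment are assumptions for illustration. The second string writes the same text with the predefined references &lt; and &gt; instead of a CDATA section.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper element <example>; the SVG fragment is shortened.
WITH_CDATA = '<example><![CDATA[<rect x="4cm" y="1cm"/>]]></example>'
WITH_ESCAPES = '<example>&lt;rect x="4cm" y="1cm"/&gt;</example>'

cdata_root = ET.fromstring(WITH_CDATA)
escaped_root = ET.fromstring(WITH_ESCAPES)

# The CDATA content arrives as ordinary text, not as a <rect> child element.
print(cdata_root.text)
print(len(cdata_root))  # number of child elements

# Both spellings give the application the same character data.
print(cdata_root.text == escaped_root.text)
```

This is why CDATA sections are a convenience for authors only: after parsing, a program cannot tell which of the two spellings was used.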
- entity references. An entity in XML is a name for a piece of text. Some entities are predefined; others can be defined by the user. There are five predefined XML entity references:
- &lt; for the less-than sign (<);
- &gt; for the greater-than sign (>);
- &amp; for the ampersand (&);
- &quot; for the straight double quotation mark (");
- &apos; for the straight single quotation mark (').
These characters are used in the XML syntax. Hence, predefined entities are useful when we want to write these characters with a different meaning than the one given by the XML syntax, as in the following example:
<tutorial>
<name>&lt;Caffè XML/&gt;</name>
</tutorial>
The content of the name element is the string <Caffè XML/> and not an empty element.
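When a document is generated by a program, the predefined entity references can be produced automatically. The sketch below, which assumes Python and its standard xml.sax.saxutils.escape function, builds the tutorial example and parses it back. The QUOTES mapping is a name chosen for this example: escape replaces only &, < and > by default, and the two quotation marks have to be requested explicitly.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

text = "<Caffè XML/>"

# escape() replaces &, < and > with &amp;, &lt; and &gt;
escaped = escape(text)
document = "<tutorial><name>" + escaped + "</name></tutorial>"
print(document)

# After parsing, the name element holds the original string and no child element.
name = ET.fromstring(document).find("name")
print(name.text)
print(len(name))

# Extra mapping (an assumption of this sketch) to cover the two quotation marks.
QUOTES = {'"': "&quot;", "'": "&apos;"}
print(escape("a < b & 'c'", QUOTES))
```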
- character references. If a character is not accessible from the editor you are using, you may write it as a character reference. A character reference gives the number of the Unicode character it stands for. The number is written between &# and ; in decimal (e.g., &#1114; for the character њ) or between &#x and ; in hexadecimal (e.g., &#x45A; for the same character). You may use character references in element content and attribute values, but not in element and attribute names; inside comments they are not recognized and remain literal text. Here is an example:
<paragraph>
The following is a Greek maxim saying: "The wise man knows himself".
</paragraph>
<maxim>
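&#x3C3;&#x3BF;&#x3C6;&#x3CC;&#x3C2;
&#x3AD;&#x3B1;&#x3C5;&#x3C4;&#x3CC;&#x3BD;
&#x3B3;&#x3B9;&#x3B3;&#x3BD;&#x3CE;&#x3C3;&#x3BA;&#x3B5;&#x3B9;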
</maxim>
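Character references can also be generated mechanically. The following Python sketch shows one possible way, using only the standard library. The helper names to_decimal_refs and to_hex_refs and the element name c are assumptions for this example. Neither helper escapes markup characters such as < or &, so they are meant for text that is already safe to embed.

```python
import xml.etree.ElementTree as ET

NJE = "\u045a"  # the character њ (U+045A)


def to_decimal_refs(text):
    """Replace every non-ASCII character with a decimal character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def to_hex_refs(text):
    """Replace every non-ASCII character with a hexadecimal character reference."""
    return "".join(ch if ord(ch) < 128 else f"&#x{ord(ch):X};" for ch in text)


decimal = to_decimal_refs(NJE)
hexadecimal = to_hex_refs(NJE)
print(decimal, hexadecimal)

# A parser resolves both forms back to the same character.
parsed = ET.fromstring(f"<c>{decimal}{hexadecimal}</c>").text
print(parsed == NJE * 2)
```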