References to, and specifications of, CDATA appear throughout the W3C Recommendations for XML and HTML, and in Standard Generalized Markup Language (SGML, ISO 8879), the markup language from which HTML is derived. SGML is a descriptive markup language for the structure of a document, and HTML (through version 4.01) is defined as an application of SGML.
The following definition is one of the best I've found for CDATA. It breaks the concept down into a simple explanation.
Only text inside a CDATA section will be ignored by the parser.
Parsed Data
XML parsers normally parse all the text in an XML document.
When an XML element is parsed, the text between the XML tags is also parsed:
<message>This text is also parsed</message>
The parser does this because XML elements can contain other elements, as in this example, where the <name> element contains two other elements (first and last):
<name><first>Bill</first><last>Gates</last></name>
and the parser will break it up into sub-elements like this:
<name>
<first>Bill</first>
<last>Gates</last>
</name>
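To see this breakdown for yourself, here is a minimal sketch, assuming Python and its standard xml.etree.ElementTree module (my choice for illustration, not something the excerpt uses). It parses the same <name> element and lists the sub-elements the parser found.

```python
import xml.etree.ElementTree as ET

# Parse the one-line document from the excerpt above.
root = ET.fromstring("<name><first>Bill</first><last>Gates</last></name>")

# Iterating over an element yields its direct child elements.
children = [(child.tag, child.text) for child in root]

print(root.tag)   # the outer element
print(children)   # the sub-elements and their text
```

The parser never hands back the raw string between the <name> tags; it hands back a tree, which is why markup characters inside element text matter so much.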
Escape Characters
Illegal XML characters have to be replaced by entity references.
If you place a character like "<" inside an XML element, it will generate an error because the parser interprets it as the start of a new element. You cannot write something like this:
<message>if salary < 1000 then</message>
To avoid this, you have to replace the "<" character with an entity reference, like this:
<message>if salary &lt; 1000 then</message>
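The difference between the two versions can be checked with a short sketch, again assuming Python's standard xml.etree.ElementTree module. The helper name parses is hypothetical, made up for this note.

```python
import xml.etree.ElementTree as ET

def parses(xml_text):
    """Return True if xml_text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# A raw "<" in element text versus the same text with an entity reference.
bad = "<message>if salary < 1000 then</message>"
good = "<message>if salary &lt; 1000 then</message>"

print(parses(bad))
print(parses(good))

# After parsing, the entity reference is turned back into the real character.
print(ET.fromstring(good).text)
```

The entity reference only exists in the serialized document; the program reading the parsed element sees the plain "<" again.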
There are 5 predefined entity references in XML:
&lt; | < | less than |
&gt; | > | greater than |
&amp; | & | ampersand |
&apos; | ' | apostrophe |
&quot; | " | quotation mark |
Note: Only the characters "<" and "&" are strictly illegal in XML. Apostrophes, quotation marks and greater than signs are legal, but it is a good habit to replace them.
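Replacing these characters by hand gets tedious, so here is a minimal sketch of doing it programmatically, assuming Python's standard xml.sax.saxutils module. Its escape function handles "&", "<" and ">" by default; the two dictionaries for apostrophes and quotation marks (QUOTES and REVERSE) are names I made up for this example.

```python
from xml.sax.saxutils import escape, unescape

# escape() covers &, < and > on its own; quotes need an extra mapping.
QUOTES = {"'": "&apos;", '"': "&quot;"}
REVERSE = {"&apos;": "'", "&quot;": '"'}

raw = 'if salary < 1000 && name == "Bill" then'

escaped = escape(raw, QUOTES)          # safe to place inside an XML element
restored = unescape(escaped, REVERSE)  # back to the original text

print(escaped)
print(restored == raw)
```

The ampersand is replaced first, so the "&" that begins each entity reference is not escaped a second time.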
CDATA
Everything inside a CDATA section is ignored by the parser.
If your text contains a lot of "<" or "&" characters, as program code often does, the XML element can be defined as a CDATA section.
A CDATA section starts with "<![CDATA[" and ends with "]]>":
<script>
<![CDATA[
function matchwo(a, b)
{
  if (a < b && a < 0)
  {
    return 1;
  }
  else
  {
    return 0;
  }
}
]]>
</script>
In the example above, everything inside the CDATA section is ignored by the parser.
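"Ignored" here means the text is not interpreted as markup; it is still kept as the element's character data. A minimal sketch, assuming Python's standard xml.etree.ElementTree module and a shortened one-line script body of my own, shows what a program actually receives.

```python
import xml.etree.ElementTree as ET

# A CDATA section holding text with raw "<" and "&" characters.
doc = "<script><![CDATA[if (a < b && a < 0) return 1;]]></script>"

script = ET.fromstring(doc)

# The CDATA wrapper is gone; only the literal text remains.
print(script.text)

# Writing the element back out uses entity references instead of CDATA.
serialized = ET.tostring(script, encoding="unicode")
print(serialized)
```

CDATA and entity references are two spellings of the same content, which is why this parser does not remember which one the original document used.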
Notes on CDATA sections:
A CDATA section cannot contain the string "]]>"; therefore, nested CDATA sections are not allowed.
Also make sure there are no spaces or line breaks inside the "]]>" string.
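If the text you need to wrap does contain "]]>", a common workaround is to end the CDATA section in the middle of that string and immediately open a new one. The sketch below assumes Python; the function name to_cdata is hypothetical and not part of any library.

```python
import xml.etree.ElementTree as ET

def to_cdata(text):
    """Wrap text in CDATA, splitting any "]]>" across two sections."""
    # "]]" closes out the first section, ">" starts the next one.
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"

text = "a[b[0]]>c"
wrapped = to_cdata(text)
print(wrapped)

# The parser joins the adjacent sections back into the original text.
root = ET.fromstring("<x>" + wrapped + "</x>")
print(root.text == text)
```

The two sections sit side by side rather than one inside the other, so the rule against nesting still holds.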
An excerpt from:
XML CDATA, W3Schools
http://www.w3schools.com/xml/xml_cdata.asp
I also recommend the definition provided in Wikipedia. Although we must remember that Wikipedia is not reviewed by professional editors either, it does have a thorough definition; see the citation below for the URL to the resource.
CDATA. (2006, March 11). In Wikipedia, The Free Encyclopedia. Retrieved 10:33, July 29, 2006, from http://en.wikipedia.org/w/index.php?title=CDATA&oldid=43227290.
In any event, I'm happy that you find this journal of notes to be a useful tool in your own learning. I welcome your feedback and your input. Thanks!