XML Overview
The handling of the following XML document concepts may have a significant
impact on the design and performance of an XML-based application:
Well-formedness
An XML document needs to be well-formed to be parsed.
A well-formed XML document conforms to XML syntax rules and constraints,
such as:
• The document must contain exactly one root element, and all other elements
  are children of this root element.
• All markup tags must be balanced; that is, each element must have a start and
  an end tag.
• Elements may be nested but they must not overlap.
• All attribute values must be in quotes.
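The rules above can be checked mechanically, because a conforming parser rejects any document that breaks them. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample documents are illustrative assumptions, not taken from the text.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts xml_text, False on a syntax error.

    Hypothetical helper: it only checks well-formedness, not validity.
    """
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# One illustrative document per rule listed above.
samples = {
    "single root, balanced, nested": "<order><item id='1'/></order>",
    "two root elements": "<order/><order/>",
    "missing end tag": "<order><item></order>",
    "overlapping elements": "<order><item></order></item>",
    "unquoted attribute value": "<order id=1/>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the first sample is accepted; each of the others violates exactly one of the listed constraints.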
Validity
According to the XML specification, an XML document is considered valid if it
has an associated DTD declaration and it complies with the constraints
expressed in the DTD. To be valid, an XML document must meet the following
criteria:
• Be well-formed
• Refer to an accessible DTD-based schema using a Document Type Declaration
• Conform to the referenced DTD
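As an illustration of the second criterion, the sketch below reads the Document Type Declaration of a document with Python's standard xml.dom.minidom module. The element name note, the system identifier note.dtd, and the helper doctype_info are hypothetical. This parser is non-validating: it reports which DTD the document refers to but does not check conformance to it, which would require a validating parser.

```python
from xml.dom import minidom


def doctype_info(xml_text: str):
    """Return the DTD reference of a document, or None if it has none.

    Hypothetical helper; it inspects the declaration only and does not
    validate the document against the DTD.
    """
    doc = minidom.parseString(xml_text)
    doctype = doc.doctype
    if doctype is None:
        return None
    return {
        "name": doctype.name,              # declared root element name
        "system_id": doctype.systemId,     # location of the external DTD
        "root": doc.documentElement.tagName,
    }


# Assumed sample: a document that refers to an external DTD.
with_dtd = '<!DOCTYPE note SYSTEM "note.dtd"><note><to>Ann</to></note>'
without_dtd = "<note><to>Ann</to></note>"

print(doctype_info(with_dtd))
print(doctype_info(without_dtd))
```

A document without a Document Type Declaration, such as the second sample, can be well-formed but cannot be valid in the DTD sense described here.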
With the emergence of new schema languages, the notion of validity is extended
beyond the initial specification to other, non-DTD-based schema languages,
such as XSD. For these non-DTD schemas, the XML document may not refer
explicitly to the schema; instead, it may contain only a hint to the schema to
which it conforms. The application is responsible for enabling the validation of
the document. Regardless of any hints, an application may still force
validation of the document against a particular schema. (See "Validating XML
Documents" on page 139.)
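For XSD, such a hint is typically carried by the xsi:schemaLocation or xsi:noNamespaceSchemaLocation attribute from the XML Schema instance namespace. The sketch below, using Python's standard xml.etree.ElementTree module, extracts these hints from the root element; the helper schema_hints, the namespace urn:example:po, and the file name po.xsd are assumptions made for illustration. Reading the hint does not validate anything: the application decides whether to honor it or to validate against a schema of its own choosing.

```python
import xml.etree.ElementTree as ET

# Namespace of the XML Schema instance attributes.
XSI = "http://www.w3.org/2001/XMLSchema-instance"


def schema_hints(xml_text: str) -> dict:
    """Map each target namespace to the schema location hinted by the root.

    Hypothetical helper. The key None holds the hint for a schema that has
    no target namespace.
    """
    root = ET.fromstring(xml_text)
    hints = {}

    # xsi:schemaLocation holds whitespace-separated (namespace, location) pairs.
    tokens = root.get(f"{{{XSI}}}schemaLocation", "").split()
    for namespace, location in zip(tokens[0::2], tokens[1::2]):
        hints[namespace] = location

    no_namespace = root.get(f"{{{XSI}}}noNamespaceSchemaLocation")
    if no_namespace is not None:
        hints[None] = no_namespace
    return hints


# Assumed sample document carrying a schema hint.
document = (
    '<po xmlns="urn:example:po" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:schemaLocation="urn:example:po po.xsd"/>'
)
print(schema_hints(document))
```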
Logical and physical forms
An XML document has one logical form that may be laid out in potentially
numerous physical forms. The physical form (or forms) represents the
document's storage layout. The physical form consists of storage units called
entities, which contain either parsed or unparsed data.
Parsed entities are invoked by name using entity references. When parsed, the
reference is replaced by the contents of the entity, and this replacement text be