Chapter 4 XML Processing
DOM may have to load the entire document into memory so that the document
can be edited or data retrieved, whereas SAX allows the document to be processed
as it is parsed. However, despite DOM's initial slowness, it is better to use the DOM
model when the source document must be edited or processed multiple times.
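The trade-off can be sketched with the standard JAXP parsers. This is a minimal illustration rather than a benchmark; the class name SaxVersusDom, the sample order document, and its element names are assumptions made for this example only.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxVersusDom {

    // Hypothetical sample document, used only for this illustration.
    static final String ORDER_XML =
        "<order><item sku=\"A1\"/><item sku=\"B2\"/><item sku=\"C3\"/></order>";

    /** SAX: counts item elements while the parser reports events; no tree is kept. */
    static int countItemsWithSax(String xml) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if ("item".equals(qName)) {
                    count[0]++;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return count[0];
    }

    /** DOM: loads the whole document once; the tree can then be read or edited repeatedly. */
    static Document loadWithDom(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        // Single streaming pass over the input.
        System.out.println(countItemsWithSax(ORDER_XML));

        // One parse, then several operations on the same in-memory tree.
        Document doc = loadWithDom(ORDER_XML);
        System.out.println(doc.getElementsByTagName("item").getLength());
        doc.getDocumentElement().appendChild(doc.createElement("item"));
        System.out.println(doc.getElementsByTagName("item").getLength());
    }
}
```

With SAX, a second question about the document would require a second parse; with DOM, the cost of building the tree is paid once and every later read or edit reuses it.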
You should also try to use JAXB whenever the document content has a direct
representation, as domain-specific objects, in Java. If you don't use JAXB, then
you must manually map document content to domain-specific objects, and this
process often requires an intermediate DOM representation of the document
(when SAX is too cumbersome to apply; see page 166). This intermediate DOM
representation is transient, yet it still consumes memory resources and must be
traversed when mapping to the domain-specific objects. With JAXB, you can
automatically generate the same mapping code, thus saving development time, and,
depending on the JAXB implementation, it may not create an intermediate DOM
representation of the source document. In any case, JAXB uses fewer memory
resources, as a JAXB content tree is by nature smaller than an equivalent DOM tree.
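To make the cost of the manual route concrete, the following sketch maps a document to a domain object by hand through an intermediate DOM tree. It deliberately does not use JAXB; it shows the kind of code a JAXB binding compiler would otherwise generate. The PurchaseOrder class, the order and item element names, and the id and sku attributes are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class ManualDomMapping {

    /** Hypothetical domain-specific object for this illustration. */
    static final class PurchaseOrder {
        final String id;
        final List<String> skus = new ArrayList<>();

        PurchaseOrder(String id) {
            this.id = id;
        }
    }

    static PurchaseOrder fromXml(String xml) throws Exception {
        // Step 1: build the intermediate DOM tree (costs memory for the whole document).
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        // Step 2: traverse the tree and copy the data into the domain object.
        Element root = doc.getDocumentElement();
        PurchaseOrder order = new PurchaseOrder(root.getAttribute("id"));
        NodeList items = root.getElementsByTagName("item");
        for (int i = 0; i < items.getLength(); i++) {
            order.skus.add(((Element) items.item(i)).getAttribute("sku"));
        }

        // Step 3: the DOM tree is no longer referenced after this method returns.
        return order;
    }

    public static void main(String[] args) throws Exception {
        PurchaseOrder order =
            fromXml("<order id=\"42\"><item sku=\"A1\"/><item sku=\"B2\"/></order>");
        System.out.println(order.id + " " + order.skus);
    }
}
```

Every new element or attribute in the document format means another hand-written line in the traversal step, which is the maintenance burden that generated binding code is meant to remove.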
When using higher-level technologies such as XSLT, keep in mind that they
may rely on lower-level technologies like SAX and DOM, which may affect
performance, possibly adversely.
When building complex XML transformation pipelines, use the JAXP class
SAXTransformerFactory to process the results of one style sheet transformation
with another style sheet. You can optimize performance by working with SAX
events until the last stage in the pipeline, which avoids creating in-memory
data structures such as DOM trees between stages.
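A two-stage pipeline of this kind might look like the sketch below. It uses only standard JAXP calls (SAXTransformerFactory.newTransformerHandler, SAXResult), but the class name SaxPipeline and the two tiny style sheets, which rename the root element from a to b and then from b to c, are assumptions invented for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class SaxPipeline {

    private static final String XSL_OPEN =
        "<xsl:stylesheet version=\"1.0\" "
        + "xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">";

    // Hypothetical style sheets: stage 1 rewrites <a> as <b>, stage 2 rewrites <b> as <c>.
    static final String STAGE1_XSL = XSL_OPEN
        + "<xsl:template match=\"/a\"><b><xsl:value-of select=\".\"/></b></xsl:template>"
        + "</xsl:stylesheet>";
    static final String STAGE2_XSL = XSL_OPEN
        + "<xsl:template match=\"/b\"><c><xsl:value-of select=\".\"/></c></xsl:template>"
        + "</xsl:stylesheet>";

    static String transformTwice(String xml, String xsl1, String xsl2) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();
        if (!factory.getFeature(SAXTransformerFactory.FEATURE)) {
            throw new IllegalStateException("This JAXP implementation lacks SAX support");
        }
        SAXTransformerFactory saxFactory = (SAXTransformerFactory) factory;

        // Last stage: receives SAX events and serializes the final result.
        TransformerHandler second =
            saxFactory.newTransformerHandler(new StreamSource(new StringReader(xsl2)));
        StringWriter out = new StringWriter();
        second.setResult(new StreamResult(out));

        // First stage: its output is delivered as SAX events straight into the second
        // stage, so the application never builds a DOM tree between the two.
        Transformer first = saxFactory.newTransformer(new StreamSource(new StringReader(xsl1)));
        SAXResult intoSecond = new SAXResult(second);
        intoSecond.setLexicalHandler(second);
        first.transform(new StreamSource(new StringReader(xml)), intoSecond);

        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // The printed document has <c> as its root element.
        System.out.println(transformTwice("<a>hello</a>", STAGE1_XSL, STAGE2_XSL));
    }
}
```

Further stages can be added by giving each intermediate TransformerHandler a SAXResult that wraps the next handler, and keeping a stream or DOM result only for the final one.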
As an alternative, you may consider using APIs other than the four discussed
previously. JDOM and dom4j are particularly appropriate for applications that
implement a document-centric processing model and that must manipulate a
DOM representation of the documents.
JDOM, for example, achieves the same results as DOM but, because it is
more generic, it can address any document model. Not only is it optimized for
Java, but developers find JDOM easy to use because it relies on the Java
Collection API. JDOM documents can be built directly from, and converted to,
SAX events and DOM trees, allowing JDOM to be seamlessly integrated in XML
processing pipelines and in particular as the source or result of XSLT
transformations.
Another alternative API is dom4j, which is similar to JDOM. In addition to
supporting tree-style processing, the dom4j API has built-in support for XPath.
For example, the org.dom4j.Node interface defines methods to select nodes
according to an XPath expression. dom4j also implements an event based pro