Simple API for XML

The Simple API for XML (SAX) is a serial access parser API for XML. SAX provides a mechanism for reading data from an XML document. It is a popular alternative to the Document Object Model (DOM).

XML Processing with SAX

A parser that implements SAX (i.e., a SAX parser) functions as a stream parser with an event-driven API. The user defines a number of callback methods that the parser calls when events occur during parsing. The SAX events include:

  • XML Text nodes
  • XML Element nodes
  • XML Processing Instructions
  • XML Comments

An event is fired when each of these XML features is encountered, and again when its end is encountered. XML attributes are provided as part of the data passed to element events.
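The callback model described above can be sketched with Python's standard `xml.sax` module. This is only an illustration, not a normative SAX binding: the class name `RecordingHandler` and the sample input are assumptions made for the example.

```python
import xml.sax


class RecordingHandler(xml.sax.ContentHandler):
    """Hypothetical handler that records each callback the parser makes."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # Attributes arrive together with the element-start event.
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        # Text may be delivered in several pieces; whitespace-only pieces
        # are skipped in this sketch.
        if content.strip():
            self.events.append(("text", content.strip()))

    def processingInstruction(self, target, data):
        self.events.append(("pi", target, data))


handler = RecordingHandler()
xml.sax.parseString(b'<?render fast?><note lang="en">Hello</note>', handler)
for event in handler.events:
    print(event)
```

The application never asks the parser for data; it only reacts to the calls the parser makes as it reads the input from start to end.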


SAX parsing is unidirectional; previously parsed data cannot be re-read without starting the parsing operation again.


Example

Given the following XML document:

 <?xml version="1.0" encoding="UTF-8"?>
 <RootElement param="value">
     <FirstElement>
         Some Text
     </FirstElement>
     <SecondElement param2="something">
         Pre-Text <Inline>Inlined text</Inline> Post-text.
     </SecondElement>
 </RootElement>

This XML document, when passed through a SAX parser, will generate the following sequence of events:

  • No event for the first line: it is the XML declaration, not a processing instruction, so it is not reported as a processing-instruction event.
  • XML Element start, named RootElement, with an attribute param equal to "value".
  • XML Element start, named FirstElement.
  • XML Text node, with data equal to "Some Text" (note: the surrounding whitespace is typically reported as well; trimming it is left to the application).
  • XML Element end, named FirstElement.
  • XML Element start, named SecondElement, with an attribute param2 equal to "something".
  • XML Text node, with data equal to "Pre-Text".
  • XML Element start, named Inline.
  • XML Text node, with data equal to "Inlined text".
  • XML Element end, named Inline.
  • XML Text node, with data equal to "Post-text.".
  • XML Element end, named SecondElement.
  • XML Element end, named RootElement.
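As a rough check, the example document can be run through Python's standard `xml.sax` parser and the callbacks recorded. The handler name `TraceHandler` and the trace format are assumptions made for this sketch. Text is buffered and stripped so that whitespace-only runs between elements do not appear in the trace.

```python
import xml.sax

DOCUMENT = b"""<?xml version="1.0" encoding="UTF-8"?>
<RootElement param="value">
    <FirstElement>
        Some Text
    </FirstElement>
    <SecondElement param2="something">
        Pre-Text <Inline>Inlined text</Inline> Post-text.
    </SecondElement>
</RootElement>
"""


class TraceHandler(xml.sax.ContentHandler):
    """Hypothetical handler that writes one trace line per reported event."""

    def __init__(self):
        super().__init__()
        self.trace = []
        self._text = []

    def _flush_text(self):
        # Join the pieces of text seen since the last element boundary.
        text = "".join(self._text).strip()
        self._text = []
        if text:
            self.trace.append(f'text "{text}"')

    def startElement(self, name, attrs):
        self._flush_text()
        described = "".join(f' {k}="{v}"' for k, v in attrs.items())
        self.trace.append(f"start {name}{described}")

    def endElement(self, name):
        self._flush_text()
        self.trace.append(f"end {name}")

    def characters(self, content):
        self._text.append(content)

    def processingInstruction(self, target, data):
        self.trace.append(f"pi {target}")


tracer = TraceHandler()
xml.sax.parseString(DOCUMENT, tracer)
print("\n".join(tracer.trace))
```

With this parser the trace contains only element and text events; the XML declaration on the first line does not produce a `pi` entry.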

Definition

Unlike DOM, SAX has no formal specification. The Java implementation of SAX is considered to be normative, and implementations in other languages attempt to follow the rules laid down in that implementation, adjusting for the differences in language where necessary.


Benefits

SAX parsers have certain benefits over DOM-style parsers. The quantity of memory that a SAX parser must use in order to function is typically much smaller than that of a DOM parser. DOM parsers must have the entire tree in memory before any processing can begin, so the amount of memory used by a DOM parser depends entirely on the size of the input data. The memory footprint of a SAX parser, by contrast, is based only on the maximum depth of the XML tree and the maximum data stored in XML attributes on a single XML element. Both of these are always smaller than the size of the parsed tree itself.
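The point about depth can be illustrated with a small sketch using Python's standard `xml.sax` module. The names `DepthHandler` and `measure` are hypothetical. The handler keeps only the names of the currently open elements, so the state it holds grows with nesting depth and not with the number of elements in the document.

```python
import xml.sax


class DepthHandler(xml.sax.ContentHandler):
    """Hypothetical handler whose state is bounded by the nesting depth."""

    def __init__(self):
        super().__init__()
        self.open_elements = []  # only the elements that are currently open
        self.max_depth = 0
        self.element_count = 0

    def startElement(self, name, attrs):
        self.open_elements.append(name)
        self.element_count += 1
        self.max_depth = max(self.max_depth, len(self.open_elements))

    def endElement(self, name):
        self.open_elements.pop()


def measure(xml_bytes):
    """Return (number of elements, maximum nesting depth)."""
    handler = DepthHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.element_count, handler.max_depth


# A wide but shallow document: many elements, yet the stack never
# holds more than two names at a time.
wide = b"<list>" + b"<item/>" * 10000 + b"</list>"
print(measure(wide))
```

This measures only the application's own bookkeeping; the memory used inside a particular parser implementation is not shown here.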


Because of the event-driven nature of SAX, processing documents with a SAX parser can often be faster than with a DOM-style parser. Memory allocation takes time, so the larger memory footprint of the DOM is also a performance issue.


Due to the nature of DOM, streamed reading from disk is impossible. Processing XML documents that could never fit into memory is only possible through the use of a SAX parser (or another kind of stream XML parser).
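A minimal sketch of streamed processing, assuming Python's standard `xml.sax` module and its incremental `feed()`/`close()` interface. The element name `entry`, the attribute `amount`, and the helper functions are hypothetical. The document is produced chunk by chunk by a generator and is never held in memory as a whole.

```python
import xml.sax


class SumHandler(xml.sax.ContentHandler):
    """Adds up the hypothetical 'amount' attribute of every <entry> element."""

    def __init__(self):
        super().__init__()
        self.total = 0

    def startElement(self, name, attrs):
        if name == "entry":
            self.total += int(attrs.get("amount", "0"))


def generate_chunks(entries):
    """Yield a document piece by piece, standing in for reads from disk."""
    yield b"<ledger>"
    for i in range(entries):
        yield b'<entry amount="%d"/>' % (i % 10)
    yield b"</ledger>"


def stream_total(chunks):
    parser = xml.sax.make_parser()
    handler = SumHandler()
    parser.setContentHandler(handler)
    for chunk in chunks:
        parser.feed(chunk)  # events fire as soon as enough input has arrived
    parser.close()          # signals the end of the input
    return handler.total


print(stream_total(generate_chunks(100000)))
```

In practice the chunks would come from a file or a network socket; the generator here only keeps the example self-contained.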


Drawbacks

The event-driven model of SAX is useful for XML parsing, but it does have certain drawbacks.


Certain kinds of XML validation require access to the document in full. For example, a DTD IDREF attribute requires that there be an element in the document that uses the given string as a DTD ID attribute. To validate this in a SAX parser, one would need to keep track of every previously encountered ID attribute and every previously encountered IDREF attribute, to see if any matches are made. Furthermore, if an IDREF does not match an ID, the user only discovers this after the document has been parsed; if this linkage was important to building functioning output, then time has been wasted in processing the entire document only to throw it away.
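The bookkeeping described above might look like the following sketch, written with Python's standard `xml.sax` module. It does not read a DTD: as a simplifying assumption, every attribute named `id` is treated as an ID and every attribute named `ref` as an IDREF, and the names `IdRefChecker` and `find_dangling_refs` are hypothetical.

```python
import xml.sax


class IdRefChecker(xml.sax.ContentHandler):
    """Hypothetical checker: remembers every ID and IDREF seen so far."""

    def __init__(self):
        super().__init__()
        self.seen_ids = set()
        self.pending_refs = set()
        self.dangling = None  # known only once the whole document is read

    def startElement(self, name, attrs):
        # Assumption: 'id' plays the role of an ID, 'ref' of an IDREF.
        if "id" in attrs:
            self.seen_ids.add(attrs["id"])
        if "ref" in attrs:
            self.pending_refs.add(attrs["ref"])

    def endDocument(self):
        # A reference may point forward, so the check has to wait until here.
        self.dangling = sorted(self.pending_refs - self.seen_ids)


def find_dangling_refs(xml_bytes):
    checker = IdRefChecker()
    xml.sax.parseString(xml_bytes, checker)
    return checker.dangling


sample = b'<doc><a id="x"/><b ref="x"/><b ref="y"/><b ref="z"/><a id="z"/></doc>'
print(find_dangling_refs(sample))
```

Both sets grow with the document, and the result is available only in `endDocument`, which is the cost the paragraph above describes.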


Additionally, some kinds of XML processing simply require having access to the entire document. XSLT and XPath, for example, need to be able to access any node at any time in the parsed XML tree. While a SAX parser could be used to construct such a tree, the DOM already does so by design.
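To illustrate how SAX events could be used to construct a tree, the sketch below forwards them to `xml.etree.ElementTree.TreeBuilder` from Python's standard library. The resulting tree is an ElementTree rather than a DOM, and the names `TreeBuildingHandler` and `parse_to_tree` are hypothetical.

```python
import xml.sax
import xml.etree.ElementTree as ET


class TreeBuildingHandler(xml.sax.ContentHandler):
    """Hypothetical handler that turns SAX events into an in-memory tree."""

    def __init__(self):
        super().__init__()
        self._builder = ET.TreeBuilder()
        self.root = None

    def startElement(self, name, attrs):
        self._builder.start(name, dict(attrs.items()))

    def endElement(self, name):
        self._builder.end(name)

    def characters(self, content):
        self._builder.data(content)

    def endDocument(self):
        # close() returns the root element once every element is closed.
        self.root = self._builder.close()


def parse_to_tree(xml_bytes):
    handler = TreeBuildingHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.root


root = parse_to_tree(b'<a><b k="1">x</b><b k="2">y</b></a>')
# With the whole tree in memory, any node can be reached at any time.
print(root.find("b[@k='2']").text)
```

Once the tree exists, the memory advantage of SAX is gone, which is why a tree-based API is usually the more direct choice for this kind of processing.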


See also

Other XML processing technologies

  • VTD-XML
  • Document Object Model (DOM)
  • Extensible Stylesheet Language Transformations (XSLT)
  • Streaming Transformations for XML (STX)

XML Parser and APIs supporting SAX

  • Xerces
  • Microsoft XML Core Services (MSXML)
  • Java API for XML Processing (JAXP)
  • libXML

References

  • David Brownell: SAX2, O'Reilly, ISBN 0-596-00237-8
  • W. Scott Means, Michael A. Bodie: The Book of SAX, No Starch Press, ISBN 1-886411-77-8

The Wikipedia article included on this page is licensed under the GFDL.
Images may be subject to relevant owners' copyright.
All other elements are (c) copyright NationMaster.com 2003-5. All Rights Reserved.
Usage implies agreement with terms.