XML Parsing w/ SAX – JAVA Tutorial


In this tutorial I will give an introduction to XML parsing. First, I will create an XML SAX parsing application, step by step, and run it to print out the contents of any supplied XML file. Second, I will vary the contents of the XML file while running the application and printing out the parsed XML. Third, I will discuss and explain SAX2 in the context of that code, going through it in detail.
I'll start by creating a Java project called SAXTutorial. I will create a Java class and call it ZaneAcademyHandler. It extends DefaultHandler, and I will add the import for DefaultHandler. Then I will override these methods from DefaultHandler: startDocument, endDocument, and startElement. I will also create a Driver class with a main method in it. I will come back and finish the code here, but for now let me create an XML file; this is the file that I will be parsing with this application.
So let's do New, and I will put it under the project here and call it zaneacademy.xml. Let me look at the source here. I will be populating this XML file with these tags. Essentially, I have a channel, which is the root tag, and the name of the channel is zaneacademy. Inside it I have two topics: a Java Language topic and, let me create another topic here, a Design Patterns topic. The Java Language topic contains three tutorials: the equals and hashCode methods tutorial, the introduction to RMI tutorial, and the introduction to sockets tutorial. The Design Patterns topic contains two tutorials (let me remove the third one): a factory method pattern tutorial and an abstract factory pattern tutorial.
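For reference, the file described above might look roughly like the XML in the following sketch. The transcript only states that channel is the root tag and mentions name and tutorial elements, so the `topic` tag name, the nesting, and the exact tutorial titles are assumptions. The XML is held in a Java string and written to zaneacademy.xml by a hypothetical helper class, `SampleXmlWriter`, which is not part of the tutorial's project.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: writes a sample zaneacademy.xml.
// The <topic> tag name, the nesting, and the titles are assumptions.
class SampleXmlWriter {

    static final String XML = String.join("\n",
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>",
        "<channel>",
        "    <name>zaneacademy</name>",
        "    <topic>",
        "        <name>Java Language</name>",
        "        <tutorial>equals and hashCode methods</tutorial>",
        "        <tutorial>Introduction to RMI</tutorial>",
        "        <tutorial>Introduction to Sockets</tutorial>",
        "    </topic>",
        "    <topic>",
        "        <name>Design Patterns</name>",
        "        <tutorial>Factory Method Pattern</tutorial>",
        "        <tutorial>Abstract Factory Pattern</tutorial>",
        "    </topic>",
        "</channel>",
        "");

    // Writes the sample document to the given path and returns that path.
    static Path write(Path target) throws IOException {
        return Files.write(target, XML.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Path written = write(Paths.get("zaneacademy.xml"));
        System.out.println("Wrote " + written.toAbsolutePath());
    }
}
```

Running `main` from the project directory creates the file next to the project, which matches where the transcript says the file is placed.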
So essentially, what this tutorial is going to do is go through this XML file using SAX and print it out to the console, as validation that it was read. Now let me go back to the handler class, ZaneAcademyHandler. In startDocument I am just going to print out "begin parsing document"; this method gets invoked when the parser starts parsing the document. endDocument gets invoked when the parser finishes parsing the document, so I am just going to print out "end parsing document" there. Whenever the parser starts parsing an element inside the document, startElement gets invoked, and when it finishes parsing that element, endElement gets invoked. The characters method gets invoked for the text inside an element, so if you want to print out that text, you can print it from the characters method.
So essentially, what I am doing in startElement is printing out the tag: qName gives me the tag name, so I print the opening angle bracket, then the tag name, and then close the tag. I do the same thing when endElement is invoked, except that it prints a closing tag. To print out the text that is inside the tag, I go through the character array in the characters method and print the characters from index start up to, but not including, start + length.
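A minimal sketch of the handler as described so far might look like this. The exact strings printed are assumed from the narration, and `HandlerDemo` is a hypothetical class added only so the sketch runs on its own; it parses an inline string with `SAXParserFactory` rather than the Driver the tutorial builds.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Sketch of the handler described in the transcript; printed strings are assumed.
class ZaneAcademyHandler extends DefaultHandler {

    @Override
    public void startDocument() {
        System.out.println("begin parsing document");
    }

    @Override
    public void endDocument() {
        System.out.println("end parsing document");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        // qName is the tag name as written in the file
        System.out.print("<" + qName + ">");
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        System.out.print("</" + qName + ">");
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Only the slice [start, start + length) belongs to this callback
        for (int i = start; i < start + length; i++) {
            System.out.print(ch[i]);
        }
    }
}

// Hypothetical demo class, not part of the tutorial's project.
class HandlerDemo {

    static void parse(String xml) throws Exception {
        SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new ZaneAcademyHandler());
    }

    public static void main(String[] args) throws Exception {
        parse("<topic><name>Java Language</name></topic>");
    }
}
```

Because whitespace between tags is also reported through characters, a handler like this echoes the line breaks and indentation of the parsed file along with the element text.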
Next, let me go back to the Driver and add this code to it. Now that I have this code, if we run it we should be able to print out what's inside zaneacademy.xml. Let's go ahead and run this main method; I'll go into detail about what's inside it shortly. Here we go. It printed "begin parsing document" when it started parsing the document and "end parsing document" when it finished. In between I have the channel, and inside it the name of the channel and then the topics: the first topic is Java Language, with equals, intro to RMI, and intro to sockets, and the other topic is Design Patterns, with factory method pattern and abstract factory.
OK, and just to show that it does not depend on what is inside this XML file, let me modify it. Let me go back here, maybe keep just one tutorial here, change Design Patterns to Object Oriented Design Patterns, and take out one of these tutorials as well. Let's see whether it parses correctly when we print it out. Let me go ahead and run it. Here we go: I have Java Language, which now has only two tutorials, and Object Oriented Design Patterns, with just one tutorial inside it. Let me make another change, to show what happens if there is only one topic, and run it again. Here we go: we have just one topic, and I have what is inside that topic.
OK, let me go ahead and discuss the Driver class and the
ZaneAcademyHandler class. Let me start with the Driver class. Inside this class I have a main method, and inside that main method I use XMLReaderFactory to call a static method, createXMLReader, which returns an XMLReader object. The XMLReaderFactory class has two createXMLReader methods. The one I am using here takes no parameters and creates an XMLReader from the system default. The other takes a String and creates an XMLReader from a class name. OK, very good. So I used XMLReaderFactory to call the static method createXMLReader, which gave me back an XMLReader.
This XMLReader has several methods, including the two that I am using here: setContentHandler and parse. setContentHandler allows an application to register a content event handler. The content event handler I am registering here is ZaneAcademyHandler, and it is a content event handler because it extends DefaultHandler, which implements ContentHandler. Having registered this content event handler, I can now use the reader to parse the zaneacademy.xml file. The parse method parses an XML document. Here I am not specifying any directory for zaneacademy.xml because of where I placed it in the project, so I am just specifying the file name.
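The Driver described here could be sketched as follows. This is an approximation of the tutorial's code, not a copy of it: `TagPrintingHandler` is a stand-in for ZaneAcademyHandler so that the block compiles on its own, and the `parseXml` helper method is a hypothetical name. Note that `XMLReaderFactory` has been deprecated since Java 9 in favor of `SAXParserFactory`, although it is still available.

```java
import java.io.IOException;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;

class Driver {

    // Stand-in for the tutorial's ZaneAcademyHandler (assumption for this sketch)
    static class TagPrintingHandler extends DefaultHandler {
        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            System.out.println("start: " + qName);
        }
    }

    // Hypothetical helper: creates a reader, registers the handler, parses the document
    @SuppressWarnings("deprecation") // XMLReaderFactory is deprecated since Java 9
    static void parseXml(String systemId) throws SAXException, IOException {
        // No-argument variant: creates an XMLReader from the system default
        XMLReader reader = XMLReaderFactory.createXMLReader();
        // Register the content event handler that will receive the callbacks
        reader.setContentHandler(new TagPrintingHandler());
        // Parse the document identified by the given system identifier
        reader.parse(systemId);
    }

    public static void main(String[] args) throws SAXException, IOException {
        // Just the file name, as in the transcript; it is resolved against the working directory
        parseXml("zaneacademy.xml");
    }
}
```

The string passed to `parse` is a system identifier, so a full `file:` URI works as well as a bare file name.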
So I called the parse method. What happens now is that these methods start being called on my ZaneAcademyHandler class: startDocument first, then several rounds of startElement, characters, and endElement, and finally endDocument. To model it: I have a startDocument and an endDocument, and in between (let me change the color) I have a startElement and an endElement, and between startElement and endElement I have the characters method. There will be several of these startElement and endElement pairs between startDocument and endDocument.
startDocument receives notification of the beginning of the document. You can override it to take specific action at the beginning of the document, such as creating an output file. endDocument is the same, except that it receives notification of the end of the document, so maybe you want to close an output file there. startElement receives notification of the start of an element, so you can override it, as I did here, to take specific action at the start of each element, for example writing output to a file. endElement is the same: you can take specific action at the end of an element, such as also writing to an output file. characters receives notification of character data inside an element, and you would override it to take specific action such as printing a chunk of character data to a file, as I am doing here. For example, in zaneacademy.xml I can use the characters method to print out the text inside the name elements, the tutorial elements, and so on: Java Language, equals and hashCode methods, etc.
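To make the callback order concrete, here is a small sketch that records each event in a list instead of printing it. `EventTrace` and its `trace` method are hypothetical names, and the inline document is a cut-down assumption of zaneacademy.xml. Since a SAX parser is allowed to deliver the text of one element in several characters calls, the sketch collects the chunks in a buffer and records them once the element boundary is reached.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that records the order of SAX callbacks.
class EventTrace extends DefaultHandler {

    final List<String> events = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startDocument() {
        events.add("startDocument");
    }

    @Override
    public void endDocument() {
        events.add("endDocument");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        flushText();
        events.add("startElement " + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flushText();
        events.add("endElement " + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called more than once per element, so accumulate the chunks
        text.append(ch, start, length);
    }

    // Records the buffered text as a single event, skipping whitespace-only text
    private void flushText() {
        String collected = text.toString().trim();
        if (!collected.isEmpty()) {
            events.add("characters " + collected);
        }
        text.setLength(0);
    }

    static List<String> trace(String xml) throws Exception {
        EventTrace handler = new EventTrace();
        SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : trace("<channel><name>zaneacademy</name></channel>")) {
            System.out.println(event);
        }
    }
}
```

For the inline document in `main`, the recorded order is startDocument, startElement channel, startElement name, characters zaneacademy, endElement name, endElement channel, endDocument, which is the nesting described above.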

27 thoughts on “XML Parsing w/ SAX – JAVA Tutorial”

  1. Much easier to understand than the SAX tutorial "Parsing an XML File Using SAX" – The Java Tutorials. Thanks!

  2. Thanks for the tutorial. On my Eclipse IDE the printout to the console is showing the XML file all on one line, ignoring the carriage returns between tags and the tabs. It does not look as pretty as yours does. Do you know what I'm doing wrong?

    Thanks,

    Dan

  3. "Start document" creates a document; "Start element" creates title, links, or description; "End element" declares that the element is complete per item; and "End document" completes the set of list. What does "Characters" stand for? I observed it at 9:33.
