shibboleth-dev - DOMParser
Subject: Shibboleth Developers
- From: "Howard Gilbert" <>
- To: <>
- Subject: DOMParser
- Date: Fri, 19 Nov 2004 11:34:42 -0500
For the last two days I have been getting more and more entangled in a
cleanup issue. The original idea was simple. We depend on DOM3. DOM3 is
standard in Java 1.5. So, suppose one compiles this code and runs it with
Java 1.5 but without separate Xerces DOM3 libraries.

The problem is that neither Shibboleth nor OpenSAML uses the JAXP API to
get a DOM parser. This is perfectly reasonable given the long history of
Xerces development and the slow evolution of JAXP. However, since the need
for DOM3 forces you to include a version of Xerces that does support JAXP,
there is no longer a supported configuration where JAXP would not work if
used.

The old way of doing business, used in current code,
imports a Xerces class directly:

    import org.apache.xerces.parsers.DOMParser;

Then it creates an object of the class:

    private DOMParser parser = new DOMParser();

Then it sets features using an Apache syntax:

    parser.setFeature("http://xml.org/sax/features/validation", true);
    parser.setFeature("http://apache.org/xml/features/validation/schema", true);

The new JAXP approach uses DocumentBuilder factories:

    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    dbf.setNamespaceAware(true);
    dbf.setValidating(true);

A few features have their own methods. Others take URI named attributes as
before, but note that the method name changes and the attributes may be
slightly different and follow a Sun rather than Apache name:

    static final String JAXP_SCHEMA_LANGUAGE =
        "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
    static final String W3C_XML_SCHEMA =
        "http://www.w3.org/2001/XMLSchema";
    static final String JAXP_SCHEMA_SOURCE =
        "http://java.sun.com/xml/jaxp/properties/schemaSource";

    try {
        // Say we are using XSD, not DTD
        dbf.setAttribute(JAXP_SCHEMA_LANGUAGE, W3C_XML_SCHEMA);
    } catch (IllegalArgumentException x) {
        log.error("Unable to obtain usable XML parser from environment");
        return null;
    }

    // Set the Schema file
    dbf.setAttribute(JAXP_SCHEMA_SOURCE, new File(schemaSource));

The schemaSource can be a String (filename/URI) or an array
of Strings.

This may not be the time to raise it, but there is a philosophical dispute
between me and prior coders. Some believe that the XML file should nominate
its own schema: the config file contains an XSD reference which is resolved
by the EntityResolver. I, however, believe that the code expects the XML
file to conform to a prespecified schema (Metadata, AAP, whatever). I think
the code should force the XML file to conform to the schema that the code
expects, and the file should not be allowed to reference its own schema
file, which may not be what the code expects. When you adopt this model, you
either specify the base schema file with JAXP_SCHEMA_SOURCE and let the
secondary files be found by the EntityResolver, or you specify all the
schema files with an array and then don't let the EntityResolver find
anything. However, let's table that discussion for a while because it is a
tangent from what I want to talk about.

Once you set the features you want, you ask the factory to
give you a parser. Note that "DocumentBuilder" is simply an updated name
for "DOMParser". Errors at this point mean that the JAXP environment is set
up incorrectly:

    DocumentBuilder parser;
    try {
        parser = dbf.newDocumentBuilder();
        parser.setErrorHandler(new SimpleErrorHandler());
        parser.setEntityResolver(new Resolver());
    } catch (ParserConfigurationException e) {
        log.error("Unable to obtain usable XML parser from environment");
        return null;
    }

Finally, you parse the file:

    try {
        doc = parser.parse(ins);
    } catch (SAXException e1) {
        log.error("Error in XML configuration file: " + e1);
        return null;
    } catch (IOException e1) {
        log.error("Error accessing XML configuration file: " + e1);
        return null;
    }

Those intimately familiar with the prior interface will note that the
Document object is returned from parser.parse() here, whereas before you
had to call parser.getDocument().

So what is the fallout? Well, this turns out to be a bigger
problem than I thought. I started with the Shibboleth /src, but now I find
that there are direct uses of DOMParser in OpenSAML and in the test cases. I
still think there is light at the end of the tunnel, but before continuing
forward I had better ask the list and decide whether to proceed or to stop
and roll back the transaction.

If I do roll back, then the result is that we will continue to require a
specific Xerces DOM3 library to be added to the Tomcat common/endorsed
directory even when a suitable DOM3-compliant JAXP parser is present in the
environment. It's not a big deal now, but this is an area of Eternal Install
Anguish.

If I proceed, then I end up with an update that hits some code (in the
Origin and Target, some test cases, and OpenSAML). It is a sloppy commit,
but once done we have removed an explicit org.apache.xerces import/new
and switched to a standard Java factory interface.

I believe at this point that there are no features or options that behave
differently or cannot be accessed through the new interface. This is subject
to testing and verification. Unfortunately, when stuff moves into the Java
standard you have to "Render unto Sun that which is Sun's", and Sun doesn't
always see things 100% the same as Apache.

So here is the question. If I proceed forward, finish the edit, and run some
successful tests, will the consensus allow me to check in the changes so we
can run some more tests? If there is an aesthetic objection to flipping over
to JAXP, then it makes no sense to continue what has become a non-trivial
effort.

Note that moving to the JAXP interface doesn't necessarily stick us with any
particular Sun implementation. If Xerces 3 provides extra function, you can
always add some later Xerces library to /endorsed and make it a
prerequisite. It's just that we will be accessing that Xerces library using
the JAXP interface and not directly through the org.apache.xerces classes.
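
For anyone who wants to try the JAXP path in isolation, the steps described
above can be pulled together into one small sketch. It assumes a JAXP 1.2
capable parser is available in the environment (the one bundled with the JDK
should do). The class name JaxpSketch, the toy schema, the two sample
documents, and ThrowingErrorHandler are made up for illustration and are not
Shibboleth or OpenSAML code; System.err stands in for the logger. It follows
the "code forces the schema" model: the schema comes from JAXP_SCHEMA_SOURCE,
not from the document.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Illustrative sketch only; names below are hypothetical.
class JaxpSketch {
    static final String JAXP_SCHEMA_LANGUAGE =
        "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
    static final String W3C_XML_SCHEMA = "http://www.w3.org/2001/XMLSchema";
    static final String JAXP_SCHEMA_SOURCE =
        "http://java.sun.com/xml/jaxp/properties/schemaSource";

    // Toy schema (an assumption): <config> holding exactly one <name> string.
    static final String DEMO_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='config'><xs:complexType><xs:sequence>"
      + "<xs:element name='name' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element></xs:schema>";
    static final String VALID_XML = "<config><name>demo</name></config>";
    static final String INVALID_XML = "<config><bogus/></config>";

    // Turns validation errors into exceptions so that parse() fails on them.
    static class ThrowingErrorHandler implements ErrorHandler {
        public void warning(SAXParseException e) { /* warnings are ignored */ }
        public void error(SAXParseException e) throws SAXException { throw e; }
        public void fatalError(SAXParseException e) throws SAXException { throw e; }
    }

    // Parses xml against the given schema file; returns null on any failure.
    static Document parse(byte[] xml, File schemaFile) {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        dbf.setValidating(true);
        try {
            // Say we are using XSD, not DTD, and name the schema the code expects
            dbf.setAttribute(JAXP_SCHEMA_LANGUAGE, W3C_XML_SCHEMA);
            dbf.setAttribute(JAXP_SCHEMA_SOURCE, schemaFile);
        } catch (IllegalArgumentException e) {
            System.err.println("Unable to obtain usable XML parser from environment");
            return null;
        }
        try {
            DocumentBuilder parser = dbf.newDocumentBuilder();
            parser.setErrorHandler(new ThrowingErrorHandler());
            // The Document comes straight back from parse(); no getDocument() call
            return parser.parse(new ByteArrayInputStream(xml));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            System.err.println("Error in XML configuration file: " + e);
            return null;
        }
    }

    // Convenience wrapper: writes the toy schema to a temp file, then parses.
    static Document parseString(String xml) {
        try {
            File xsd = File.createTempFile("demo", ".xsd");
            xsd.deleteOnExit();
            Files.write(xsd.toPath(), DEMO_XSD.getBytes(StandardCharsets.UTF_8));
            return parse(xml.getBytes(StandardCharsets.UTF_8), xsd);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("valid parsed:   " + (parseString(VALID_XML) != null));
        System.out.println("invalid parsed: " + (parseString(INVALID_XML) != null));
    }
}
```

Nothing in this sketch imports org.apache.xerces, so whichever JAXP
implementation the environment supplies (the JDK's own, or a Xerces dropped
into /endorsed) is the one that gets used.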