Justin Robinson | 2 Jan 2005 15:34

Problem caching grammars

Hi there....

I have managed to preparse my XML Schema and have put it in a grammar pool,
according to the active caching approach described at
http://xml.apache.org/xerces2-j/faq-grammars.html#faq-1

I'm expecting the time taken to set up my SAX parser to increase, which it
does, so that's fine.
I'm also expecting the time taken on the first parse call to decrease.

This is where my problem is. The first parse still takes an average of about
7 times longer than subsequent parses.

What else must I do to bring down the time taken for the first parse?

I tried to look at the source code, but I'm having trouble locating where
the time might be taken up (still learning how to debug). The path of
execution goes through these classes:

1. AbstractSAXParser
2. XMLParser
3. XML11Configuration (methods parse() and configurePipeline())

Any ideas?
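
A rough way to check how much of the first-parse cost comes from schema handling, rather than from one-time JVM work such as class loading and JIT warm-up, is to time several parses on the same parser. The sketch below is only a stand-in for the setup in this message: it assumes the JAXP javax.xml.validation API (Java 5 or later) instead of the Xerces grammar pool, and the class name ParseTiming, the inline schema and the sample document are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: times repeated validating parses with one precompiled schema.
final class ParseTiming {
    // Illustrative schema and document, not taken from the thread.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='order'><xs:complexType><xs:sequence>"
        + "<xs:element name='qty' type='xs:int'/>"
        + "</xs:sequence></xs:complexType></xs:element></xs:schema>";
    static final String XML = "<order><qty>3</qty></order>";

    /** Returns the elapsed nanoseconds of each of {@code runs} parses. */
    static long[] timeParses(int runs) {
        try {
            // Compile the schema once, before any parsing (the JAXP analogue of preparsing).
            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));

            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setSchema(schema);
            SAXParser parser = factory.newSAXParser();

            long[] nanos = new long[runs];
            for (int i = 0; i < runs; i++) {
                long start = System.nanoTime();
                parser.parse(new InputSource(new StringReader(XML)), new DefaultHandler());
                nanos[i] = System.nanoTime() - start;
            }
            return nanos;
        } catch (Exception e) {
            throw new IllegalStateException("timing run failed", e);
        }
    }

    public static void main(String[] args) {
        long[] nanos = timeParses(5);
        for (int i = 0; i < nanos.length; i++) {
            System.out.println("parse " + (i + 1) + ": " + nanos[i] / 1000 + " us");
        }
    }
}
```

If the first parse is still much slower than the rest even though the schema was compiled up front, the difference is probably not schema loading, and a profiler would be the next step.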

Here's how I set up the grammar pool:

   private XMLGrammarPool getGrammarPool() throws IOException {
      // create grammar preparser


Peter B. West | 3 Jan 2005 16:04

Problems running example DOM3 parser creation

Following the example, on the Programming with DOM page, for creating a 
DOM3 LS parser, I get a
ClassCastException: org.apache.xerces.dom.DOMImplementationSourceImpl
     at org.w3c.dom.bootstrap.DOMImplementationRegistry.newInstance(
				DOMImplementationRegistry.java:144)

i.e. at
	DOMImplementationSource source =
	    (DOMImplementationSource) sourceClass.newInstance();

The code, taken from the example, is
       DOMImplementationRegistry registry = null;
       System.setProperty(DOMImplementationRegistry.PROPERTY,
           "org.apache.xerces.dom.DOMImplementationSourceImpl");
       try {
           registry = DOMImplementationRegistry.newInstance();
       } catch (Exception e) {
           throw new RuntimeException(e);
       }

Any idea what I'm doing wrong?

Peter
csaba.szucs | 3 Jan 2005 16:32

Re: Problems running example DOM3 parser creation


hello,

I came across the same problem recently, I guess.

I think you have to use endorsed JAR files: create a folder called "endorsed" under <Your JDK home>\jre\lib and put the DOM jars there:
 - dom3-xml-apis.jar
 - and dom3-xercesImpl.jar as well

See below!




Kind regards,

Csaba Szucs

---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe <at> xml.apache.org
For additional commands, e-mail: xerces-j-user-help <at> xml.apache.org



Endorsed Standards Override Mechanism

Introduction

An endorsed standard is a Java™ API defined through a standards process other than the Java Community Process℠ (JCP℠). Because endorsed standards are defined outside the JCP, it is anticipated that such standards may be revised between releases of the Java 2 Platform. In order to take advantage of new revisions to endorsed standards, developers and software vendors may use the Endorsed Standards Override Mechanism to provide newer versions of an endorsed standard than those included in the Java 2 Platform as released by Sun Microsystems.

Endorsed Standards Classes Deployment

Classes implementing newer versions of endorsed standards should be placed in JAR files. The system property java.endorsed.dirs specifies one or more directories that the Java runtime environment will search for such JAR files. If more than one directory path is specified by java.endorsed.dirs, they must be separated by File.pathSeparatorChar. If no value is set for java.endorsed.dirs, then Sun Microsystems' implementation of the Java 2 Platform looks for JAR files in a default standard location:
    <java-home>\lib\endorsed   [Microsoft Windows]
    <java-home>/lib/endorsed   [Solaris or Linux]
Here <java-home> refers to the directory where the runtime software is installed (which is the top-level directory of the Java 2 Runtime Environment or the jre directory in the Java 2 SDK).

The Java runtime environment will use classes in such JAR files to override the corresponding classes provided in the Java 2 Platform as shipped by Sun.
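
To check whether an override actually took effect, it can help to print where the DOM bootstrap classes were loaded from. A ClassCastException like the one reported in this thread is typically what you see when the interface and the implementation come from different places. The following is a minimal sketch; the class name ClassOrigin and the helper locationOf are made up for illustration. On Java 9 and later the endorsed-standards mechanism no longer exists, so the property simply prints as null there.

```java
import java.security.CodeSource;

// Hypothetical diagnostic: reports which JAR or directory a class was loaded from.
final class ClassOrigin {
    /** Returns the code source location, or a placeholder for the runtime's own classes. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(runtime's own classes, no code source)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        System.out.println("java.endorsed.dirs = "
            + System.getProperty("java.endorsed.dirs"));
        System.out.println("DOMImplementationSource   <- "
            + locationOf(org.w3c.dom.DOMImplementationSource.class));
        System.out.println("DOMImplementationRegistry <- "
            + locationOf(org.w3c.dom.bootstrap.DOMImplementationRegistry.class));
    }
}
```

If the two DOM classes report different locations, or a location other than the endorsed directory, the override is probably not being picked up as intended.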

Endorsed Standards APIs

The endorsed standards for J2SE™ 1.4 constitute all classes and interfaces that are defined in the packages listed below. Classes and interfaces defined in sub-packages of listed packages are not endorsed standards unless those sub-packages are themselves listed. The Endorsed Standards Override Mechanism may be used to override the J2SE platform packages in the list below, and these packages may be overridden only by versions of the endorsed standard that are newer than that provided by the Java 2 Platform as released by Sun. No other packages from the J2SE platform API specification may be overridden.
    javax.rmi.CORBA
    org.omg.CORBA
    org.omg.CORBA.DynAnyPackage
    org.omg.CORBA.ORBPackage
    org.omg.CORBA.portable
    org.omg.CORBA.TypeCodePackage
    org.omg.CORBA_2_3
    org.omg.CORBA_2_3.portable
    org.omg.CosNaming
    org.omg.CosNaming.NamingContextExtPackage
    org.omg.CosNaming.NamingContextPackage
    org.omg.Dynamic
    org.omg.DynamicAny
    org.omg.DynamicAny.DynAnyFactoryPackage
    org.omg.DynamicAny.DynAnyPackage
    org.omg.IOP
    org.omg.IOP.CodecFactoryPackage
    org.omg.IOP.CodecPackage
    org.omg.Messaging
    org.omg.PortableInterceptor
    org.omg.PortableInterceptor.ORBInitInfoPackage
    org.omg.PortableServer
    org.omg.PortableServer.CurrentPackage
    org.omg.PortableServer.POAManagerPackage
    org.omg.PortableServer.POAPackage
    org.omg.PortableServer.portable
    org.omg.PortableServer.ServantLocatorPackage
    org.omg.SendingContext
    org.omg.stub.java.rmi
    org.w3c.dom
    org.xml.sax
    org.xml.sax.ext
    org.xml.sax.helpers
In addition to the packages listed above, which are part of the J2SE specification, users of Sun's J2SE Reference Implementation may be allowed to use the Endorsed Standards Override Mechanism to override implementation-specific classes such as the org.w3c.dom sub-packages delivered in Sun's Reference Implementation. See the corresponding license for details.

Copyright © 2002 Sun Microsystems, Inc. All Rights Reserved.

Verachten Bruno | 3 Jan 2005 17:52

Using Xerces as a validating parser with SAX events

Hi,

I have a class that generates XML into a Writer from a Java object
tree. I will modify it so that it generates SAX events.
I'd like to use the Xerces validation engine to check the generated
XML, but I just can't see how to plug it into my code.
I know I could do an identity transformation to get it running, but I
would prefer not to have any dependency on Xalan.
Did I miss something, or can't it be done?
Thanks.

Bruno Verachten
Justin Robinson | 3 Jan 2005 18:23

Re: Using Xerces as a validating parser with SAX events

Hi,

When you say the class generates XML into a Writer from a tree, what exactly
do you mean?
Do you mean that the class can write the generated XML to an output stream?

Justin
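
For what it is worth, one way to validate a stream of SAX events without going through a transformer is the JAXP 1.3 ValidatorHandler, which is itself a ContentHandler that you can drive directly. This is only a sketch under assumptions: it needs a JAXP 1.3 runtime (Java 5 or later), which is not something the thread confirms for Xerces 2.6.2, and the class name SaxEventValidation, the inline schema and the element names are invented for the example.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.ValidatorHandler;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example: pushes hand-made SAX events through a schema validator.
final class SaxEventValidation {
    // Illustrative schema: <order> containing one integer <qty>.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='order'><xs:complexType><xs:sequence>"
        + "<xs:element name='qty' type='xs:int'/>"
        + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    /** Emits <order><qty>TEXT</qty></order> as SAX events; returns true if valid. */
    static boolean emitAndValidate(String qtyText) {
        try {
            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            ValidatorHandler validator = schema.newValidatorHandler();

            // Record validation errors instead of throwing on them.
            final boolean[] valid = { true };
            validator.setErrorHandler(new DefaultHandler() {
                @Override public void error(SAXParseException e) { valid[0] = false; }
                @Override public void fatalError(SAXParseException e) { valid[0] = false; }
            });

            // This is where the generating class would fire its own events.
            AttributesImpl noAttributes = new AttributesImpl();
            char[] text = qtyText.toCharArray();
            validator.startDocument();
            validator.startElement("", "order", "order", noAttributes);
            validator.startElement("", "qty", "qty", noAttributes);
            validator.characters(text, 0, text.length);
            validator.endElement("", "qty", "qty");
            validator.endElement("", "order", "order");
            validator.endDocument();
            return valid[0];
        } catch (Exception e) {
            throw new IllegalStateException("validation run failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(emitAndValidate("3"));
        System.out.println(emitAndValidate("abc"));
    }
}
```

ValidatorHandler also has setContentHandler, so the validated events could be forwarded to a serializer or another handler afterwards.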


Peter B. West | 3 Jan 2005 22:38

Re: Problems running example DOM3 parser creation

Thanks Csaba.

I have used the /endorsed directory, but I have the standard 2.6.2 
distribution.  The page I quoted says that I can use the Load and Save 
functionality without having to recompile.  Maybe not.

Peter

Xiaoming Liu | 4 Jan 2005 00:14

read byte offset information during xml parsing

hi,

I am looking for a Java XML parser which supports reading byte offset
information during xml parsing, e.g. in '<foo><bar></bar></foo>', the
parser can report '<bar>' starts from byte 5; and '</bar>' starts from
byte 10.

I went through standard APIs like DOM, SAX, and XMLPull and cannot find
related APIs. In SAX, the nearest interface is org.xml.sax.Locator. I also
checked Xerces XNI and found the nearest class is
org.apache.xerces.xni.XMLLocator. In either class, only line number and
column number are reported.

However, similar functions exist in parsers for other languages, such as
"XML_GetCurrentByteIndex" in the Expat parser (C, Perl).

So my question is whether there is a Java XML parser that reports byte offset
information during parsing and, if not, whether there is any plan to
implement this feature.

many thanks,
Xiaoming
Suresh Babu Koya | 4 Jan 2005 05:28

RE: read byte offset information during xml parsing

If you are using UTF-16 encoding, then <bar> will probably
not start at byte 5. Do you want the character
position or the byte position?

Also, if there is a Unicode BOM at the start
of the stream, the byte position shifts again.
I think that is one reason why Xerces-J does not provide
that kind of API.

/Suresh
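
To make the encoding point concrete, here is a small sketch that turns a character index into a byte offset by encoding the text that precedes it. The class name ByteOffsets and the helper byteOffset are made up for illustration; the sample document is the one from the original question.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: shows how the same character index maps to different byte offsets.
final class ByteOffsets {
    /** Number of bytes occupied by the first charIndex characters of doc in the given charset. */
    static int byteOffset(String doc, int charIndex, Charset charset) {
        return doc.substring(0, charIndex).getBytes(charset).length;
    }

    public static void main(String[] args) {
        String doc = "<foo><bar></bar></foo>";
        int open = doc.indexOf("<bar>");    // character index 5
        int close = doc.indexOf("</bar>");  // character index 10

        // UTF-8: one byte per ASCII character, so offsets match the character indexes.
        System.out.println(byteOffset(doc, open, StandardCharsets.UTF_8));    // 5
        System.out.println(byteOffset(doc, close, StandardCharsets.UTF_8));   // 10

        // Java's "UTF-16" encoder writes a 2-byte BOM, then 2 bytes per character.
        System.out.println(byteOffset(doc, open, StandardCharsets.UTF_16));   // 12
        System.out.println(byteOffset(doc, close, StandardCharsets.UTF_16));  // 22

        // UTF-16BE: no BOM, 2 bytes per character.
        System.out.println(byteOffset(doc, open, StandardCharsets.UTF_16BE)); // 10
    }
}
```

So "byte 5" for '<bar>' only holds for single-byte-per-character encodings without a BOM.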
Michael Glavassevich | 4 Jan 2005 05:26

Re: read byte offset information during xml parsing

Hello Xiaoming,

In general byte offsets aren't available to the parser. The document 
scanners read from a java.io.Reader so the byte to character decoding is 
being done at a lower level. The parser only sees the decoded characters. 
If what you actually want is the character offset, we made some changes to 
XNI last year (they're in CVS) to expose this information in 
org.apache.xerces.xni.XMLLocator. If you're using DOM Level 3, we've also 
made the character offset available to DOMLocator [1]. You'll get this 
functionality from the latest jars [2].

[1] 
http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#Interfaces-DOMLocator
[2] http://brutus.apache.org/gump/public-jars/xml-xerces2/jars/
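
For comparison, this is roughly what the standard SAX route gives you: line and column numbers from org.xml.sax.Locator, with no byte or character offset. The sketch uses only JAXP and SAX calls; the class name LocatorSketch and the helper positionOf are invented for the example. Note that SAX describes the reported position as approximately the end of the markup that triggered the event, so the exact column can vary between parsers.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example: captures the Locator position when a given element starts.
final class LocatorSketch {
    /** Returns {line, column} reported at startElement for elementName, or {-1, -1}. */
    static int[] positionOf(String xml, final String elementName) {
        final int[] lineAndColumn = { -1, -1 };
        DefaultHandler handler = new DefaultHandler() {
            private Locator locator;

            @Override public void setDocumentLocator(Locator documentLocator) {
                locator = documentLocator; // supplied by the parser before startDocument
            }

            @Override public void startElement(String uri, String localName,
                                               String qName, Attributes attributes) {
                if (locator != null && qName.equals(elementName)) {
                    lineAndColumn[0] = locator.getLineNumber();
                    lineAndColumn[1] = locator.getColumnNumber();
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return lineAndColumn;
    }

    public static void main(String[] args) {
        int[] position = positionOf("<foo><bar></bar></foo>", "bar");
        System.out.println("line " + position[0] + ", column " + position[1]);
    }
}
```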


Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas <at> ca.ibm.com
E-mail: mrglavas <at> apache.org
Michael Glavassevich | 4 Jan 2005 05:42

Re: Problems running example DOM3 parser creation

Hello Peter,

The standard distribution for Xerces 2.6.2 contains some hacked DOM Level 
3 interfaces which allow both DOM Level 2 and 3 to co-exist. You should be 
using the DOM Level 3 distribution if you want access to DOM Level 3. The 
package containing the jars you need is called 
beta2-dom3-Xerces-J-bin.2.6.2.zip and is available on the download site 
and mirrors.
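
Once matching DOM Level 3 interfaces and implementation are on the classpath, the bootstrap sequence from the example can be sketched as below. This version is an assumption-laden variant, not the documented Xerces example: it does not set the DOMImplementationRegistry.PROPERTY system property, so it relies on whatever DOM Level 3 implementation the runtime provides by default (for instance the one bundled with Java 5 and later). The class name Dom3LsSketch and the helper rootName are made up for illustration.

```java
import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;

// Hypothetical example: creates a DOM Level 3 LS parser and parses a string.
final class Dom3LsSketch {
    /** Parses xml with an LSParser and returns the name of the document element. */
    static String rootName(String xml) {
        try {
            // Uses the default implementation source; no system property is set here.
            DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
            DOMImplementationLS ls =
                (DOMImplementationLS) registry.getDOMImplementation("LS");

            // Synchronous parser, no schema type.
            LSParser parser = ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);

            LSInput input = ls.createLSInput();
            input.setStringData(xml);

            Document document = parser.parse(input);
            return document.getDocumentElement().getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException("DOM Level 3 LS bootstrap failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootName("<foo><bar/></foo>"));
    }
}
```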

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas <at> ca.ibm.com
E-mail: mrglavas <at> apache.org
