carla.spruit | 8 Mar 16:15 2011

Different behavior entity resolver Xerces 2.11.0 and Xerces 2.10.0

Hi,

When parsing an XML document "personal-schema.xml" with xsi:noNamespaceSchemaLocation="personal.xsd" with the code below, I get different systemId parameter values in the LSResourceResolver in version 2.11.0 and 2.10.0:

    DOMImplementationLS implementationLS = new DOMImplementationImpl();
    LSParser builder = implementationLS.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);

    LSResourceResolver resolver = new LSResourceResolver() {

      public LSInput resolveResource(String type, String namespace, String publicId,
          String systemId, String baseURI) {
        // in Xerces 2.11.0, systemId is expanded: 'file:/C:/my/path/to/doc/personal.xsd'
        // in Xerces 2.10.0, the value is exactly the same as declared in the document: 'personal.xsd'
        return null;
      }
    };

    builder.getDomConfig().setParameter("validate", Boolean.TRUE);
    builder.getDomConfig().setParameter("resource-resolver", resolver);
    Document doc = builder.parseURI(getURL("personal-schema.xml").toString());

Is this a bug?

Thanks!
Carla

Michael Glavassevich | 9 Mar 06:07 2011

Re: Different behavior entity resolver Xerces 2.11.0 and Xerces 2.10.0

Hi Carla,

The change in behaviour is most likely due to the fix for this JIRA issue [1]. From a quick glance at the DOM Level 3 Load & Save spec [2] I didn't see anything which forbids expansion of the system ID before reporting it.

Thanks.

[1] https://issues.apache.org/jira/browse/XERCESJ-809
[2] http://www.w3.org/TR/DOM-Level-3-LS/load-save.html#LS-LSResourceResolver

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas <at> ca.ibm.com
E-mail: mrglavas <at> apache.org


carla.spruit | 9 Mar 10:48 2011

RE: Different behavior entity resolver Xerces 2.11.0 and Xerces 2.10.0

Hi Michael,

Thank you for the answer. I can see why this was changed.

Just to be sure: I also see that when using org.apache.xerces.xni.parser.XMLEntityResolver and its method resolveEntity(XMLResourceIdentifier resourceIdentifier), the value of resourceIdentifier.getLiteralSystemId() is now equal to resourceIdentifier.getExpandedSystemId(). The original location value found in the XML document or schema is no longer available in XMLResourceIdentifier.

Thanks,
Carla

 


Michael Glavassevich | 9 Mar 13:48 2011

RE: Different behavior entity resolver Xerces 2.11.0 and Xerces 2.10.0

Hi Carla,

Based on what was changed for XERCESJ-809 that's what I would expect to see for xsi:noNamespaceSchemaLocation and xsi:schemaLocation hints in the document. They get loaded at some point later if they're required and Xerces (2.10.0) would sometimes use the wrong base URI due to the load request being in some other context than processing of the main document. Expansion of the URI before it's cached in an internal map solved that problem, but would have also introduced this behavioural change to the entity resolver.
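
For code that has to run against both 2.10.0 and 2.11.0, one option is to normalize the system ID inside the resolver so that it no longer matters whether the parser has already expanded it. The following is only a minimal sketch: the class name NormalizingResolver and the helper toAbsolute are hypothetical, and it assumes that baseURI is the absolute URI of the referencing document and that both values are syntactically valid URIs (URI.create throws IllegalArgumentException otherwise).

```java
import java.net.URI;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;

// Hypothetical resolver that sees the same absolute system ID whether or not
// the parser expanded it before calling resolveResource.
class NormalizingResolver implements LSResourceResolver {

    // Returns systemId unchanged when it is already absolute; otherwise
    // resolves it against baseURI (assumed to be an absolute URI).
    static String toAbsolute(String systemId, String baseURI) {
        if (systemId == null) {
            return null;
        }
        URI id = URI.create(systemId);
        if (id.isAbsolute() || baseURI == null) {
            return id.toString();
        }
        return URI.create(baseURI).resolve(id).toString();
    }

    @Override
    public LSInput resolveResource(String type, String namespaceURI, String publicId,
                                   String systemId, String baseURI) {
        String absolute = toAbsolute(systemId, baseURI);
        System.out.println("requested resource: " + absolute);
        return null; // null lets the parser open the resource itself
    }

    public static void main(String[] args) {
        // Illustrative base URI, modelled on the path in the original report.
        String base = "file:/C:/my/path/to/doc/personal-schema.xml";
        System.out.println(toAbsolute("personal.xsd", base));
        System.out.println(toAbsolute("file:/C:/my/path/to/doc/personal.xsd", base));
    }
}
```

Both calls in main yield the same absolute URI, so logic keyed on the system ID behaves the same for the literal and the expanded form. This does not recover the literal value once the parser has expanded it.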

Thanks.

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas <at> ca.ibm.com
E-mail: mrglavas <at> apache.org


carla.spruit | 9 Mar 13:56 2011

RE: Different behavior entity resolver Xerces 2.11.0 and Xerces 2.10.0

Hi Michael,

Thanks!

Carla

 


Benson Margulies | 17 Mar 00:33 2011

Stumped with the seemingly simplest possible use of anyAttribute

Xerces 2.9.1, and everything else I've tried, rejects an attribute
that I'm trying to permit with xs:anyAttribute. It seems just about as
simple an application as possible, so I imagine that I'm missing
something pretty silly.

A very simple schema:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:xml="http://www.w3.org/XML/1998/namespace"
           xmlns:bt="http://www.basistech.com/2010/btml/"
           targetNamespace="http://www.basistech.com/2010/btml/">
    <xs:element name="html-attributes">
        <xs:complexType>
            <xs:attribute name="blather" type="xs:string"/>
            <xs:anyAttribute/>
        </xs:complexType>
    </xs:element>
</xs:schema>

A very simple document:

<?xml version="1.0"?>
<html-attributes xmlns="http://www.basistech.com/2010/btml/"
                 xmlns:q="http:/q/"
                 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 xsi:schemaLocation="http://www.basistech.com/2010/btml/ huh.xsd"
                 blather="Blither"
     />

Benson Margulies | 17 Mar 00:34 2011

Re: Stumped with the seemingly simplest possible use of anyAttribute

Oops, I sent the wrong version of the schema below. Remove the
<xs:attribute name='blather'/> of course.

Michael Glavassevich | 17 Mar 02:20 2011

Re: Stumped with the seemingly simplest possible use of anyAttribute

Benson,

It's a default value that's biting you: processContents defaults to "strict" [1] when you don't specify it. Your document would only be valid if your schema contained a global attribute declaration for "blather", but you haven't declared one.

Try <xs:anyAttribute processContents="lax"/> or <xs:anyAttribute processContents="skip"/> if you're not expecting or requiring the attributes in your instance document to be declared.
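
The effect of the three processContents values can be checked with a short program. This is only a sketch under stated assumptions: it uses the JDK's javax.xml.validation API rather than calling Xerces directly, the class AnyAttributeDemo and its methods are hypothetical names, and the schema is a cut-down version of the corrected one from this thread (no explicit declaration for "blather").

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical demo class; not part of Xerces or of the original posts.
class AnyAttributeDemo {
    static final String NS = "http://www.basistech.com/2010/btml/";

    // Instance document with one undeclared, unqualified attribute.
    static final String DOC =
        "<html-attributes xmlns='" + NS + "' blather='Blither'/>";

    // Builds the schema with the given processContents value on xs:anyAttribute.
    static String schemaFor(String processContents) {
        return "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
             + " targetNamespace='" + NS + "'>"
             + "<xs:element name='html-attributes'><xs:complexType>"
             + "<xs:anyAttribute processContents='" + processContents + "'/>"
             + "</xs:complexType></xs:element></xs:schema>";
    }

    // Returns true when DOC validates against the schema built by schemaFor.
    static boolean isValid(String processContents) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(
                new StreamSource(new StringReader(schemaFor(processContents))));
            Validator validator = schema.newValidator();
            try {
                // With no ErrorHandler set, validation errors are thrown.
                validator.validate(new StreamSource(new StringReader(DOC)));
                return true;
            } catch (SAXException invalid) {
                return false;
            }
        } catch (SAXException | IOException e) {
            throw new IllegalStateException("schema could not be compiled or read", e);
        }
    }

    public static void main(String[] args) {
        for (String mode : new String[] {"strict", "lax", "skip"}) {
            System.out.println(mode + " -> valid: " + isValid(mode));
        }
    }
}
```

In line with the explanation above, "strict" rejects the undeclared attribute while "lax" and "skip" accept it.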

Thanks.

[1] http://www.w3.org/TR/xmlschema-1/#element-anyAttribute

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas <at> ca.ibm.com
E-mail: mrglavas <at> apache.org


Benson Margulies | 17 Mar 21:38 2011

Re: Stumped with the seemingly simplest possible use of anyAttribute

Thanks.

RJ | 21 Mar 21:03 2011

comparing xerces xml api and apache xmlschema 20 api

I am new to the Xerces XML Schema API. Could you please point me to a tutorial
that explains how to use it?
http://xerces.apache.org/xerces2-j/javadocs/xs/index.html

I am going to be parsing XSD documents.

I was comparing the Xerces XML Schema API with the Apache XmlSchema 2.0 project:
http://ws.apache.org/commons/xmlschema20/

The Xerces XML Schema API supports XML Schema 1.1, therefore I was leaning
towards using it.

Apache XmlSchema only supports XML Schema 1.0.

I would really appreciate comments/views on which would be the better choice.
