Jack Bush | 2 Jan 2009 14:53

JDOM XSLT TransformerConfigurationException

Hi All,

I am getting the following exception when trying to do a simple transformation (I am new to XSLT with JDOM) using either XSLTransformer or TrAX in JDOM:

 

javax.xml.transform.TransformerConfigurationException: java.io.EOFException: no more input
        at com.icl.saxon.PreparedStyleSheet.prepare(PreparedStyleSheet.java:121)
        at com.icl.saxon.TransformerFactoryImpl.newTemplates(TransformerFactoryImpl.java:120)
        at com.icl.saxon.TransformerFactoryImpl.newTransformer(TransformerFactoryImpl.java:72)
        at standaloneClientRemoteInvestmentBean.JDOMTrAXPojoInvestmentBean.retrieveAreaZipcode(JDOMTrAXPojoInvestmentBean.java:68)
        at standaloneClientRemoteInvestmentBean.JDOMTrAXPojoInvestmentBean.main(JDOMTrAXPojoInvestmentBean.java:37)
Caused by: java.io.EOFException: no more input
        at com.icl.saxon.aelfred.XmlParser.popInput(XmlParser.java:4083)
        at com.icl.saxon.aelfred.XmlParser.pushURL(XmlParser.java:3620)
        at com.icl.saxon.aelfred.XmlParser.doParse(XmlParser.java:159)
        at com.icl.saxon.aelfred.SAXDriver.parse(SAXDriver.java:320)
        at com.icl.saxon.om.Builder.build(Builder.java:265)
        at com.icl.saxon.PreparedStyleSheet.prepare(PreparedStyleSheet.java:111)
        ... 4 more

 

Below is the stateStyleSheet:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body>
        <h2>Transformed State Detail</h2>
        <table border="1">
          <tr bgcolor="lightblue">
            <th align="left">Area Link</th>
            <th align="left">Area Name</th>
          </tr>
          <xsl:for-each select="/ns:html/ns:body/ns:div[@id='content']/ns:table[@class='sresults']/ns:tr/ns:td/ns:a">
            <tr>
              <td><xsl:value-of select="@href"/></td>
              <td><xsl:value-of select="@title"/></td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>

 

The Java program that calls this stateStyleSheet is as follows:

    SAXBuilder statesaxBuilder = new SAXBuilder("org.ccil.cowan.tagsoup.Parser", false);
    org.jdom.Document stateDocument = statesaxBuilder.build("state.xml");
    TransformerFactory factory = TransformerFactory.newInstance();
    Transformer transformer = factory.newTransformer(new StreamSource("stateStyleSheet.xsl"));
    JDOMSource source = new JDOMSource(stateDocument);
    JDOMResult result = new JDOMResult();
    transformer.transform(source, result);
    Document tranformedDocument = result.getDocument();
    ......

 

Could this exception have been caused by an incorrectly formatted stateStyleSheet? The search path "/ns:html/ns:body/ns:div[@id='content']/ns:table[@class='sresults']/ns:tr/ns:td/ns:a" has worked successfully in XPath, for example with the following lines:

    XPath statePath = XPath.newInstance("/ns:html/ns:body/ns:div[@id='content']/ns:table[@class='sresults']/ns:tr/ns:td/ns:a");
    statePath.addNamespace("ns", "http://www.w3.org/1999/xhtml");

 

Including the namespace prefix ("ns") in the path or leaving it out makes no difference.

I am running JDK 1.6, NetBeans 6.1, JDOM 1.1, Saxon 6.5 and TagSoup 1.2 on Windows XP.

Your assistance would be much appreciated.

Many thanks,

Jack


Michael Kay | 2 Jan 2009 15:02

RE: JDOM XSLT TransformerConfigurationException

I would suspect that
 
new StreamSource("stateStyleSheet.xsl")
 
isn't finding the stylesheet.
 
Try supplying an absolute URI, or using the constructor
 
new StreamSource(new File("stateStyleSheet.xsl"))
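
A minimal sketch of that suggestion, assuming the stylesheet is meant to be found relative to the working directory. The class name `StylesheetLocator` and its `locate` method are made up for illustration; the point is only that resolving the path through a `File` and checking it up front gives a clear "not found" error instead of a parser exception.

```java
import java.io.File;
import java.io.FileNotFoundException;
import javax.xml.transform.stream.StreamSource;

public class StylesheetLocator {

    /**
     * Resolves the path against the current working directory and
     * fails early if there is no such file. (Hypothetical helper.)
     */
    public static StreamSource locate(String path) throws FileNotFoundException {
        File file = new File(path).getAbsoluteFile();
        if (!file.isFile()) {
            throw new FileNotFoundException("Stylesheet not found: " + file);
        }
        // StreamSource(File) derives the system ID from the file's URI
        return new StreamSource(file);
    }

    public static void main(String[] args) {
        String path = args.length > 0 ? args[0] : "stateStyleSheet.xsl";
        try {
            System.out.println("Using stylesheet " + locate(path).getSystemId());
        } catch (FileNotFoundException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The returned source can be passed to `factory.newTransformer(...)` in place of `new StreamSource("stateStyleSheet.xsl")`.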
 
Michael Kay
http://www.saxonica.com/

Robert Koberg | 2 Jan 2009 15:11

Re: JDOM XSLT TransformerConfigurationException

Hi,

The ns namespace prefix isn't defined anywhere.
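
A hedged sketch of what declaring the prefix could look like, using only the JDK's built-in XSLT processor rather than Saxon 6.5 and JDOM. The sample input, the `text` output method and the class name `NamespacePrefixDemo` are assumptions made to keep the example small; the relevant part is the `xmlns:ns` declaration on `xsl:stylesheet`, which binds the prefix used in the `select` path to the XHTML namespace.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class NamespacePrefixDemo {

    // Stylesheet that binds "ns" to the XHTML namespace (illustrative only)
    public static final String XSLT =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:ns='http://www.w3.org/1999/xhtml'"
        + " exclude-result-prefixes='ns'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>"
        + "<xsl:for-each select=\"/ns:html/ns:body/ns:div[@id='content']"
        + "/ns:table[@class='sresults']/ns:tr/ns:td/ns:a\">"
        + "<xsl:value-of select='@href'/>|<xsl:value-of select='@title'/>;"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Made-up input with the same shape as the path in the stylesheet
    public static final String XHTML =
          "<html xmlns='http://www.w3.org/1999/xhtml'><body>"
        + "<div id='content'><table class='sresults'><tr>"
        + "<td><a href='/area/1' title='North'>North</a></td>"
        + "<td><a href='/area/2' title='South'>South</a></td>"
        + "</tr></table></div>"
        + "</body></html>";

    /** Runs the stylesheet over the input and returns the result as a string. */
    public static String transform(String xslt, String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(XSLT, XHTML)); // prints /area/1|North;/area/2|South;
    }
}
```

The prefix in the stylesheet does not have to match any prefix in the source document; only the namespace URI has to be the same.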

best


_______________________________________________________________________

XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development. To minimize
spam in the archives, you must subscribe before posting.

[Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
subscribe: xml-dev-subscribe@lists.xml.org
List archive: http://lists.xml.org/archives/xml-dev/
List Guidelines: http://www.oasis-open.org/maillists/guidelines.php

Michael Kay | 2 Jan 2009 16:05

RE: JDOM XSLT TransformerConfigurationException

> Hi,
> 
> The ns namespace prefix isn't defined anywhere.
> 

Good point, but that wouldn't cause AElfred to choke on the input.

Michael Kay
http://www.saxonica.com/


Jack Bush | 4 Jan 2009 11:57

Re: JDOM XSLT TransformerConfigurationException

Hi Michael and Robert,
 
Thanks for responding to this question.

You are right that my Java application was having difficulty reading stateStyleSheet.xsl. I have overcome it by moving the code to a new project, and I suspect that the jar files on the CLASSPATH were in the wrong order. Nevertheless, I have now encountered another issue:
 
java.io.UTFDataFormatException: Invalid byte 1 of 1-byte UTF-8 sequence.
        at org.apache.xerces.impl.io.UTF8Reader.invalidByte(Unknown Source)
        at org.apache.xerces.impl.io.UTF8Reader.read(Unknown Source)
        at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
        at org.apache.xerces.impl.XMLEntityScanner.skipChar(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
        at org.jdom.input.SAXBuilder.build(SAXBuilder.java:489)
        at org.jdom.input.SAXBuilder.build(SAXBuilder.java:928)
        at JDOMTrAXPojoInvestmentBean.main(JDOMTrAXPojoInvestmentBean.java:45)
The header of state.xml is as follows:
 
  <?xml version="1.0" encoding="UTF-8" ?>
  <!DOCTYPE html (View Source for full doctype...)>
- <html xmlns="http://www.w3.org/1999/xhtml" xmlns:html="http://www.w3.org/1999/xhtml">

Any ideas on what is causing this issue and how to overcome it? Likewise, how do I define the correct namespace prefix? Is it possible that this document has two namespaces, a default one and one with the prefix 'html'? If so, which one should I use?
 
Many thanks again,
 
Jack
Michael Kay | 4 Jan 2009 16:13

RE: JDOM XSLT TransformerConfigurationException

> Nevertheless, I now encountered another issue this time:
>
> java.io.UTFDataFormatException: Invalid byte 1 of 1-byte UTF-8 sequence.
 
There's only one explanation of that: the parser is expecting the document to be encoded in UTF-8 but it isn't. To understand why it isn't, you need to examine how the document was created and any transcodings that might have taken place before it reached the parser.
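
One common way to end up with such a mismatch in Java is to write the document through a `Writer` that uses the platform default charset while the XML declaration says UTF-8. Whether that is what happened here is an assumption; the sketch below only reproduces the symptom with the JDK's own parser. The class name `EncodingMismatchDemo` and the sample text are made up, and `StandardCharsets` needs a newer JDK than 1.6.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.helpers.DefaultHandler;

public class EncodingMismatchDemo {

    // The declaration claims UTF-8; the text contains one non-ASCII character
    public static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?><p>caf\u00e9</p>";

    /** Returns true if the bytes can be parsed as well-formed XML. */
    public static boolean parses(byte[] bytes) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(bytes));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Bytes really are UTF-8, matching the declaration
        byte[] utf8 = XML.getBytes(StandardCharsets.UTF_8);
        // Bytes are ISO-8859-1, but the declaration still says UTF-8
        byte[] latin1 = XML.getBytes(StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 bytes parse:      " + parses(utf8));   // true
        System.out.println("ISO-8859-1 bytes parse: " + parses(latin1)); // false
    }
}
```

Writing through `new OutputStreamWriter(new FileOutputStream(...), "UTF-8")`, or giving the serializer an `OutputStream` so that it chooses the encoding itself, keeps the bytes and the declaration in agreement.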
 
 
> The header of state.xml is as follows:
>
>   <?xml version="1.0" encoding="UTF-8" ?>
>   <!DOCTYPE html (View Source for full doctype...)>
> - <html xmlns="http://www.w3.org/1999/xhtml" xmlns:html="http://www.w3.org/1999/xhtml">
>
> Any ideas on what is the cause of this issue and how to overcome it? Likewise, how to define the correct proper namespace prefix? Is it possible that this document has two namespaces. A default one and one with prefix 'html'? If so, which one should I use?
 
It's certainly inelegant to bind the same namespace to two prefixes like this, though it's not incorrect. Again, to prevent it happening we need to understand how you created the document.
 
Michael Kay 
Jack Bush | 5 Jan 2009 03:37

Re: JDOM XSLT TransformerConfigurationException

Hi Michael,

 

The following statements generated the state.xml file:

 

URL stateUrl = new URL("http://www.abc.com");
URLConnection stateconnection = stateUrl.openConnection();
stateisInHtml = stateconnection.getInputStream();
statedisInHtml = new DataInputStream(new BufferedInputStream(stateisInHtml));
System.out.flush();
statefosOutHtml = new FileOutputStream("state.html");
while ((oneChar=statedisInHtml.read()) != -1)
    statefosOutHtml.write(oneChar);
.....

statefrInHtml = new FileReader("state.html");
statebrInHtml = new BufferedReader(statefrInHtml);
SAXBuilder statesaxBuilder = new SAXBuilder("org.ccil.cowan.tagsoup.Parser", false);
org.jdom.Document statejdomDocument = statesaxBuilder.build(statebrInHtml);
XMLOutputter stateoutputter = new XMLOutputter();
statefwOutXml = new FileWriter("state.xml");
statebwOutXml = new BufferedWriter(statefwOutXml);
stateoutputter.output(statejdomDocument, statebwOutXml);

 

XPath had no problem looking up state.xml.

 

Thanks,

Jack

 

Michael Kay | 5 Jan 2009 12:19

RE: JDOM XSLT TransformerConfigurationException

Well, for some reason it looks as if you are trying to parse using TagSoup but the stack trace shows you are actually parsing using Xerces.
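
One hedged way to check which parser is being picked up is to print the class of the `XMLReader` that the JAXP lookup resolves to on the current classpath. This is only a diagnostic sketch with a made-up class name (`ParserCheck`); it does not involve JDOM or TagSoup, and it assumes that a builder created without an explicit driver class ends up using this default parser.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

public class ParserCheck {

    /** Returns the class name of the XMLReader that JAXP resolves to by default. */
    public static String defaultReaderClass() throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        return reader.getClass().getName();
    }

    public static void main(String[] args) throws Exception {
        // The name printed depends on the JDK and on the jars on the classpath
        System.out.println(defaultReaderClass());
    }
}
```

If the name printed is a Xerces class, any code path that builds a document without naming `org.ccil.cowan.tagsoup.Parser` explicitly would be parsing with Xerces.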
 
Michael Kay

Jirka Kosek | 5 Jan 2009 23:30

Announce: XML Prague 2009 Last Call for Papers

XML Prague 2009 Last Call for Papers
------------------------------------

You have the last chance to submit your presentation to XML Prague 2009
(http://www.xmlprague.cz) as CFP ends this Friday. We are welcoming
submissions on the following topics:

   * XML Authoring
   * Data Modeling/Definition and Schema Languages
   * XML Vocabularies
   * Generating and Transforming XML
   * Markup Failures

All proposals will be submitted for review by a peer review panel made
up of the XML Prague Organizing Committee. Submissions will be chosen
based on interest, applicability, technical merit, and technical
correctness.

Accepted papers will be included in the published conference
proceedings.

Submissions should contain original material and fall within the
topics listed above. Submissions which can be construed as
product or service descriptions (adverts) will probably be deemed
inappropriate. Other approaches such as using case studies are welcome
but must be clearly related to conference topics.

Selected presenters must submit a full paper (on time), give their
presentation and answer questions in English, and follow the
XML Prague 2009 conference guidelines.

Important Dates:

   * January 9 - End of CFP (extended abstract or full paper)
   * January 23 - Notification of acceptance/rejection of paper to authors
   * February 8 - Final paper

How to Submit:

All submissions must be made using the conference management system
available at https://cmt.research.microsoft.com/XML2009/

If you have never used this system before for other conferences, you
will have to sign up first and create a new account for yourself.

After logging on to the conference management system you will be able to
submit your paper and edit the submission before the deadline. You can
also review the status of your paper as well as the comments from the
review process.

You may submit an abstract or a full paper. We recommend providing as
much information as possible to give the reviewers the best chance to
judge your submission.

Submission Guidelines:

All submissions must be in English and papers should not exceed 15
pages in length. Papers for review can be submitted in any common
format. However, please note that if your paper is accepted, we will
ask for the final version to be submitted in DocBook XML
format.

If you have any questions regarding the submission process please contact
Jirka Kosek at jirka.kosek@xmlprague.cz.

-- 
XML Prague
March 21-22, 2009
http://www.xmlprague.cz

Sponsored by Syntea (http://www.syntea.cz),
         and oXygen (http://www.oxygenxml.com)
         and fgeorges.org (http://www.fgeorges.org)

Cecil New | 6 Jan 2009 14:09

Re: Here's why it's not always a good idea to embed validation information (e.g., schemaLocation) in instance documents

Interesting read, thanks.  I did find one implementation (JNVDL)... are there others?  How new is this?  Likelihood of wide acceptance?

