Warren Smith | 6 Nov 22:07

<![CDATA[ ... ]]> and HTML Elements

I apologize if this gets posted twice...

Is there a fix to get the Petal parser to respect <![CDATA[ ... ]]> and
HTML Elements?

Petal seems hell-bent on removing any CDATA tags and parsing data inside
these supposedly "Protected Areas" and converting <,>,& and friends into
&amp;lt;,&amp;gt;,&amp;amp. I can't for the life of me figure this one
out.

Any help appreciated,

Warren

Chris Croome | 8 Nov 13:50

Re: <![CDATA[ ... ]]> and HTML Elements

Hi

On Sat 06-Nov-2004 at 03:07:08PM -0600, Warren Smith wrote:
> I apologize if this gets posted twice...

Your other address should be OK now :-)
> 
> Is there a fix to get the Petal parser to respect <![CDATA[ ... ]]> and
> HTML Elements?
> 
> Petal seems hell-bent on removing any CDATA tags and parsing data inside
> these supposedly "Protected Areas" and converting <,>,& and friends into
> &amp;lt;,&amp;gt;,&amp;amp. I can't for the life of me figure this one
> out.

This has come up before with respect to JavaScript. What should
Petal do: not touch things in CDATA sections?

The only problem with this is if you want to use some variables in
a bit of embedded javascript.

Chris

-- 
Chris Croome                               <chris@...>
web design                             http://www.webarchitects.co.uk/ 
web content management                               http://mkdoc.com/   

Fergal Daly | 8 Nov 14:05

Re: <![CDATA[ ... ]]> and HTML Elements

On Sat, Nov 06, 2004 at 03:05:19PM -0600, Warren Smith wrote:
> Is there a fix to get the Petal parser to respect <![CDATA[ ... ]]> and
> HTML Elements?
> 
> Petal seems hell-bent on removing any CDATA tags and parsing data inside
> these supposedly "Protected Areas" and converting <,>,& and friends into
> &amp;lt;,&amp;gt;,&amp;amp. I can't for the life of me figure this one
> out.

Can you give an example of what goes in and what comes out? I think a CDATA
section is only "protected" in the sense that it's not necessary to escape
&s and <s, so

<tag><![CDATA[hello & good evening]]></tag>

is equivalent to

<tag>hello &amp; good evening</tag>

At the end of the day, a CDATA section just makes it easy to insert a chunk
of text without having to worry about escaping it. CDATA sections are for the
convenience of someone generating XML; they have no special meaning, and the
parser has no obligation to tell the application that this text here was
originally CDATA and that text wasn't.

http://www.w3c.org/TR/2004/REC-xml-20040204/#sec-cdata-sect
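
A minimal sketch of that equivalence, written in Python with the standard
library's xml.etree.ElementTree rather than Petal or XML::Parser (that choice
and the variable names are assumptions made purely for illustration). Both
spellings parse to the same text, and once the tree is written back out the
serializer has no record of the CDATA markers, so it emits the escaped form:

```python
import xml.etree.ElementTree as ET

with_cdata = "<tag><![CDATA[hello & good evening]]></tag>"
escaped = "<tag>hello &amp; good evening</tag>"

# The parser hands the application plain text in both cases.
text_a = ET.fromstring(with_cdata).text
text_b = ET.fromstring(escaped).text
print(text_a == text_b)

# Serializing the parsed CDATA document: the markers are gone and
# the ampersand comes back escaped.
round_trip = ET.tostring(ET.fromstring(with_cdata), encoding="unicode")
print(round_trip)
```

This is the same kind of change reported at the top of the thread: CDATA goes
in, escaped text comes out.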

Fergal

William McKee | 8 Nov 17:16

Re: <![CDATA[ ... ]]> and HTML Elements

On Mon, Nov 08, 2004 at 12:50:59PM +0000, Chris Croome wrote:
> This has come up before with respect to javascript, what should
> petal do, not touch things in CDATA sections?

I think this would be a good solution.

> The only problem with this is if you want to use some variables in
> a bit of embedded javascript.

If this is the case, put that code outside of the CDATA section.
Currently, I have to put most of my JavaScript into external files to
prevent interpolation by Petal. I only leave what is needed for variable
completion.

William

-- 
Knowmad Services Inc.
http://www.knowmad.com

Fergal Daly | 8 Nov 18:00

Re: <![CDATA[ ... ]]> and HTML Elements

On Mon, Nov 08, 2004 at 11:16:08AM -0500, William McKee wrote:
> On Mon, Nov 08, 2004 at 12:50:59PM +0000, Chris Croome wrote:
> > This has come up before with respect to javascript, what should
> > petal do, not touch things in CDATA sections?
> 
> I think this would be a good solution.

Petal's current behaviour is correct.

Text that comes from CDATA should not be treated any differently than normal
text. This XML doc

<tag><![CDATA[a & b]]></tag>

is exactly the same as this one

<tag>a &amp; b</tag>

They have different representations but they have the same meaning. If you
put the first one into this tool

http://soapclient.com/XMLCanon.html

it will output the second one.

Any application that treats one differently from the other is broken.
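
The same comparison can be tried locally. This is a sketch, assuming
Python 3.8 or later, whose xml.etree.ElementTree.canonicalize() implements
Canonical XML 2.0; it is not the online tool linked above, and the variable
names are hypothetical:

```python
import xml.etree.ElementTree as ET

doc_cdata = "<tag><![CDATA[a & b]]></tag>"
doc_escaped = "<tag>a &amp; b</tag>"

# Canonicalization replaces a CDATA section with its escaped text content.
canon_cdata = ET.canonicalize(doc_cdata)
canon_escaped = ET.canonicalize(doc_escaped)

print(canon_cdata)
print(canon_cdata == canon_escaped)
```

Both documents reduce to the same canonical form, which is the sense in which
they are the same document.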

Here's a script that parses those two documents using XML::Parser and outputs
the parsed tree. As you can see, the parsed tree for each one is identical.

[...]

William McKee | 8 Nov 19:41

Re: <![CDATA[ ... ]]> and HTML Elements

On Mon, Nov 08, 2004 at 05:00:34PM +0000, Fergal Daly wrote:
> On Mon, Nov 08, 2004 at 11:16:08AM -0500, William McKee wrote:
> > On Mon, Nov 08, 2004 at 12:50:59PM +0000, Chris Croome wrote:
> > > This has come up before with respect to javascript, what should
> > > petal do, not touch things in CDATA sections?
> > 
> > I think this would be a good solution.
> 
> Petal's current behaviour is correct.

Fergal,

I have no doubt you are right regarding the behavior of Petal. However,
the correct behavior is not always the optimal behavior. Here's an
example of a JavaScript that will fail when passed through Petal:

  <script type="text/Javascript" language="Javascript">
    var d;
    d = '1' & '1';
    alert(d);
  </script>

The '&' (JavaScript's bitwise AND, not the logical && it shares with Perl)
gets converted to &amp;. Unfortunately, the browser does not treat this
entity as an operator.

Any advice on how to make this work without using an external Javascript
file would be appreciated.

William

Fergal Daly | 8 Nov 22:00

Re: <![CDATA[ ... ]]> and HTML Elements

On Mon, Nov 08, 2004 at 01:41:40PM -0500, William McKee wrote:
> Fergal,
> 
> I have no doubt you are right regarding the behavior of Petal. However,
> the correct behavior is not always the optimal behavior. Here's an
> example of a JavaScript that will fail when passed through Petal:
> 
>   <script type="text/Javascript" language="Javascript">
>     var d;
>     d = '1' & '1';
>     alert(d);
>   </script>
> 
> The '&' (JavaScript's bitwise AND, not the logical && it shares with Perl)
> gets converted to &amp;. Unfortunately, the browser does not treat this
> entity as an operator.
> 
> Any advice on how to make this work without using an external Javascript
> file would be appreciated.

That's definitely a problem but I don't see the connection with CDATA. I
just tried

<script type="text/Javascript" language="Javascript">
<![CDATA[
  var d;
  d = '1' & '1';
  alert(d);
]]>
</script>
[...]

William McKee | 8 Nov 22:13

Re: <![CDATA[ ... ]]> and HTML Elements

On Mon, Nov 08, 2004 at 09:00:12PM +0000, Fergal Daly wrote:
> That's definitely a problem but I don't see the connection with CDATA. I
> just tried

If you comment out the CDATA markers with //, the script will work (HTML
Tidy even does this for you when it adds CDATA sections to inline JavaScript).

> <script type="text/Javascript" language="Javascript">
  //<![CDATA[
>   var d;
>   d = '1' & '1';
>   alert(d);
  //]]>
> </script>
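
A sketch of what an XML parser extracts from that commented-out form, using
Python's xml.etree.ElementTree as a stand-in (an assumption for illustration;
it says nothing about what Petal itself emits, and the names are
hypothetical). The raw & is legal inside the CDATA section, and the markers
disappear on parsing, leaving two empty // comment lines around the code:

```python
import xml.etree.ElementTree as ET

page = """<script type="text/javascript">
//<![CDATA[
  var d;
  d = '1' & '1';
  alert(d);
//]]>
</script>"""

# The XML parser strips the CDATA markers but keeps the leading "//".
script_text = ET.fromstring(page).text
print(script_text)
```

A browser parsing the page as HTML rather than XML does not recognise CDATA
sections inside a script element, so it sees //<![CDATA[ and //]]> as ordinary
JavaScript line comments. Either way the script itself is left intact.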

> I generally do
> 
>  <snip>
> 
> Petal seems to leave that alone and it's the standard way of doing things
> even when Petal isn't involved

Ahhh, that's what Jean-Michel did to add better support for JavaScript.
I remember him telling the list that he had made a release with better
support. However, I'd since stopped using the // <!-- // --> comments
since they are mostly unnecessary.

Since that workaround is in place, it doesn't seem like it'd be hard to
add a similar workaround for CDATA which is gaining more usage, esp
since HTML Tidy wraps style tags and JavaScript sections in them. What
[...]

Chris Croome | 9 Nov 11:20

Re: <![CDATA[ ... ]]> and HTML Elements

Hi

On Mon 08-Nov-2004 at 04:13:08PM -0500, William McKee wrote:
>
> I'd since stopped using the // <!-- // --> comments since they are
> mostly unnecessary.

They are potentially worse than unnecessary if you are producing
XHTML:

  C.4. Embedded Style Sheets and Scripts

  Use external scripts if your script uses < or & or ]]> or --.
  Note that XML parsers are permitted to silently remove the contents
  of comments. Therefore, the historical practice of "hiding" scripts
  and style sheets within "comments" to make the documents backward
  compatible is likely to not work as expected in XML-based user
  agents.

  http://www.w3.org/TR/xhtml1/#C_4
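
A small sketch of the behaviour that appendix warns about, again using
Python's xml.etree.ElementTree as a hypothetical stand-in for an XML-based
user agent (the spec only says parsers are permitted to do this; this one
happens to by default):

```python
import xml.etree.ElementTree as ET

page = """<script type="text/javascript">
<!--
  alert('hello');
// -->
</script>"""

# By default the comment, and the script hidden inside it, is discarded.
script_text = ET.fromstring(page).text
print(repr(script_text))
```

Only the whitespace around the comment survives; the alert() call never
reaches the application.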

Chris

-- 
Chris Croome                               <chris@...>
web design                             http://www.webarchitects.co.uk/ 
web content management                               http://mkdoc.com/   

