Michael D. Crawford | 1 Jul 02:36 2002

Re: Free hosting services append invalid HTML ads


Friends,

Please see my new "Free Hosting Service HTML Validation Test Page" at:

http://linuxquality.sunsite.dk/articles/validationtest/

It is a valid XHTML 1.0 strict document that is meant to be copied to 
free website hosting services.  It explains to the hosting service why 
it is there, and has the "check referer" link in it so the hosting 
services can validate their local copies:

http://validator.w3.org/check/referer

For comparison, it also has a link that validates the original document 
at its original location at the Linux Quality Database.  There is 
also a link to a list of the copies of this page that have been placed 
at other hosting services.  So far there is just one copy 
of the test page at a free host:

http://free.hostdepartment.com/goingware/

Go to this last page and click the "check referer" link, and you will 
see that it doesn't validate because of the advertising markup added by 
the hosting service.

The page goes on to explain why it's important to generate valid markup, 
links to several validators and to an article I wrote on how to 
use them, and finally gives instructions that any reader can follow to 
place copies of the page at other hosting services - along with an 
[...]

Tex Texin | 1 Jul 06:06 2002

revalidate this page, set encoding


The button to revalidate a page never works for me. Instead of
revalidating the page, it goes back to the page for specifying a URL.
I am usually loading a page from my machine, so this isn't even the right
page to go to; I would rather it went to the screen for uploading files.
Although this is a bug, it normally doesn't bother me.

However, I have a page using a user-defined encoding. It is actually
utf-8 but because it contains supplementary characters, some browsers
require it to be user-defined.
The page is http://www.i18nguy.com/unicode-example-plane1.html.
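For readers unfamiliar with the term, "supplementary characters" are code points above U+FFFF. A minimal Python sketch of how one such character is encoded; the sample character U+10400 is an arbitrary choice for illustration and is not taken from the page above:

```python
# A supplementary (plane 1) character: code point above U+FFFF.
ch = "\U00010400"

utf8_bytes = ch.encode("utf-8")       # four bytes in UTF-8
utf16_bytes = ch.encode("utf-16-be")  # a surrogate pair in UTF-16

print(hex(ord(ch)))
print(utf8_bytes.hex())
print(utf16_bytes.hex())
```

Software that only handles the 16-bit range tends to stumble on the four-byte UTF-8 form or on the surrogate pair.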

Because it is a user-defined encoding, I get an error. I would like to
revalidate and specify the encoding.
However, the file upload page does not let you specify an encoding. The
revalidate page does, but instead of revalidating it goes back to the
initial screen for entering a URL.

It would be good if the file upload page allowed specification of an
encoding.
thanks

-- 
-------------------------------------------------------------
Tex Texin   cell: +1 781 789 1898   mailto:Tex <at> XenCraft.com
Xen Master                          http://www.i18nGuy.com

XenCraft		            http://www.XenCraft.com
Making e-Business Work Around the World
-------------------------------------------------------------

Bob Rosenberg | 1 Jul 06:37 2002

Re: Free hosting services append invalid HTML ads


At 22:06 -0230 on 06/30/2002, Michael D. Crawford wrote about Re: 
Free hosting services append invalid HTML ads:

>I appreciate any suggestions you have for improvement

I would suggest a section explaining what the Hosting Service needs 
to do to allow their code inserted into the pages to pass validation. 
In this case they should LOOK at the DOCTYPE on the page and use it 
to SELECT which copy of their code should be inserted. This means 
that this Hosting Service needs one copy of their code for HTML 4.x 
and one for XHTML (it should not be THAT hard to convert the 
boilerplate into XHTML).
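
A minimal sketch of that DOCTYPE-based selection, in Python. The two snippet constants, the banner path and the function names are hypothetical; the markup the hosting service actually inserts is not reproduced here, and a real service would need to handle more document types than this:

```python
import re

# Hypothetical ad boilerplate, one copy per markup family (assumption).
AD_HTML4 = '<div><img src="/banner.gif" alt="Advertisement"></div>'
AD_XHTML = '<div><img src="/banner.gif" alt="Advertisement" /></div>'

DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+[^>]*>", re.IGNORECASE)
BODY_END_RE = re.compile(r"</body\s*>", re.IGNORECASE)


def pick_ad_snippet(page: str) -> str:
    """Choose the boilerplate copy that matches the page's DOCTYPE."""
    match = DOCTYPE_RE.search(page)
    if match and "XHTML" in match.group(0).upper():
        return AD_XHTML
    # No DOCTYPE, or an HTML one: fall back to the HTML 4.x copy.
    return AD_HTML4


def insert_ad(page: str) -> str:
    """Insert the chosen snippet just before the last </body> tag."""
    snippet = pick_ad_snippet(page)
    matches = list(BODY_END_RE.finditer(page))
    if not matches:
        return page + snippet
    index = matches[-1].start()
    return page[:index] + snippet + page[index:]
```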

Looking at the code they insert, the "errors" are the missing 
type="text/javascript" attribute on the SCRIPT tag and the formatting 
of the tags. The latter is NOT an HTML error, and I'm not sure how old 
an HTML level you would need to claim before the presence of that 
attribute would itself cause an error (IOW: will one version do for 
HTML, or will there need to be one copy as they now have and one that 
has the TYPE attribute?).
-- 
--

Bob Rosenberg
RockMUG Webmaster
Webmaster <at> RockMUG.org
www.RockMUG.org

Bob Rosenberg | 1 Jul 06:47 2002

Re: Reference to non-SGML character Error Message


At 16:13 -0400 on 06/27/2002, Kennedy, Richard T wrote about 
Reference to non-SGML character Error Message:

>In HTML files with the following DTD
><!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
>"http://www.w3.org/TR/html4/strict.dtd">
>certain character entities are flagged as errors. Please see the example
>below:
>
>  ... lass="bold">*M/S/C Creighton/O&#146;Keefe Moved to go into E ...
>                                               ^
>      Error: reference to non-SGML character
>
>I'm not exactly sure why that is technically invalid but I am not aware of
>any browsers that have a problem with it. The problem also happens with
>character entities &#148; &#150;

What CHARSET did you declare (or default to) in your META tag? If it 
was ISO-8859-1 then the codes in the 128-159 codepoint range are 
INVALID and should be coded either as &name; entities (such as 
&ldquo;) or as their Unicode codepoint equivalents (&#8216; &#8217; 
&#8220; &#8221; for the "curly quotes").

If you DO claim to be WIN-1252 then (at least in theory) the 
Validator is in error, since the 128-159 codepoints are VALID: 
they represent valid glyphs in that character set.

>
>Also, I tried it with another DTD validator and it did not report a problem.
[...]

Liam Quinn | 1 Jul 06:54 2002

Re: Reference to non-SGML character Error Message


On Mon, 1 Jul 2002, Bob Rosenberg wrote:

> What CHARSET did you declare (or default to) in your META tag? If it
> was ISO-8859-1 then the codes in the 128-159 codepoint range are
> INVALID and should be coded either as &name; entities (such as
> &ldquo;) or as their Unicode codepoint equivalents (&#8216; &#8217;
> &#8220; &#8221; for the "curly quotes").
>
> If you DO claim to be WIN-1252 then (at least in theory) the
> Validator is in error, since the 128-159 codepoints are VALID:
> they represent valid glyphs in that character set.

No, that's not true. &#ddd; always refers to the Unicode code point,
regardless of the Content-Type's charset parameter.
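
To illustrate the point: a reference such as &#146; names code point 146 (a C1 control), whatever the document's charset, so the fix is to use the code point of the character that was intended. Below is a minimal Python sketch that rewrites such references, assuming the author meant the windows-1252 character at that byte value; the function name is hypothetical and this is not part of the validator:

```python
import re


def fix_c1_references(markup: str) -> str:
    """Rewrite &#128;..&#159; to the Unicode code points of the
    windows-1252 characters they were (presumably) meant to be."""

    def replace(match):
        number = int(match.group(1))
        if 128 <= number <= 159:
            try:
                char = bytes([number]).decode("cp1252")
            except UnicodeDecodeError:
                # Byte value is undefined in windows-1252; leave untouched.
                return match.group(0)
            return "&#%d;" % ord(char)
        return match.group(0)

    return re.sub(r"&#(\d+);", replace, markup)


print(fix_c1_references("O&#146;Keefe"))
```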

-- 
Liam Quinn

Jarno.Elovirta | 2 Jul 12:43 2002

[BUG] Character encoding not detected correctly with SGML SHORTTAG


Hi,

Character encoding is currently detected erroneously when the document uses SGML SHORTTAG constructs.
The following document is a valid SGML document and parses without errors (using SP 1.3.4):

  <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0 Strict//EN"><TITLE/test document/<META
http-equiv=Content-Type content="text/html;charset=ISO-8859-1"<P>

However, the W3C Validator fails to read the character encoding information from the META element and
issues a warning. The following document is the same document, but with the SHORTTAG construct not used in
the META element.

  <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0 Strict//EN"><TITLE/test document/<META
http-equiv=Content-Type content="text/html;charset=ISO-8859-1"><P>

This passes the validation without warnings. Both documents have the exact same parse tree:

    AVERSION CDATA -//IETF//DTD HTML 2.0 Strict//EN
  <HTML>
    <HEAD>
      <TITLE>
         test document
      </TITLE>
        AHTTP-EQUIV TOKEN CONTENT-TYPE
        ACONTENT CDATA text/html;charset=ISO-8859-1
      <META>
      </META>
    </HEAD>
    <BODY>
[...]

Nick Kew | 2 Jul 22:27 2002

Re: [NotaBUG] Character encoding not detected correctly with SGML SHORTTAG


This is yet another reason - over and above the usual SHORTTAG FAQs -
to make warning about shorttags the default validator behaviour.

Is there anything in the HTML spec (as opposed to a bug in the DTD)
that justifies allowing shorttags?  ISTM that browser behaviour alone
is a strong enough reason to ban them.

-- 
Nick Kew

Available for contract work - Programming, Unix, Networking, Markup, etc.

Terje Bless | 2 Jul 22:55 2002

Re: [BUG] Character encoding not detected correctly with SGML SHORTTAG


Jarno.Elovirta <at> nokia.com wrote:

>Character encoding is currently detected erroneously when the document
>uses SGML SHORTTAG constructs. The following document is a valid SGML
>document and parses without errors (using SP 1.3.4):
>
><!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0 Strict//EN"><TITLE/test
>document/<META 
>http-equiv=Content-Type content="text/html;charset=ISO-8859-1"<P>

Well, it's certainly a bug in the sense that our heuristics are failing
when presented with SHORTTAGS-using HTML. *BUT*, as Nick says, SHORTTAGS in
itself is a bug, IMO, in the SGML Declaration for HTML. As is the use of
inband encoding information (still only IMO).
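
For illustration only, here is one way a pattern-based heuristic can miss the charset in the reported document. The regular expressions below are hypothetical and are NOT the validator's actual code; they simply show that a pattern which insists on a closing ">" fails when the META start tag is left unclosed, as SHORTTAG permits:

```python
import re

# Hypothetical strict heuristic: requires the META tag to end with ">".
STRICT_RE = re.compile(
    r'<meta\b[^<>]*\bcharset\s*=\s*["\']?([\w.:-]+)[^<>]*>', re.IGNORECASE)

# Hypothetical tolerant variant: stops once the charset value is read.
TOLERANT_RE = re.compile(
    r'<meta\b[^<>]*\bcharset\s*=\s*["\']?([\w.:-]+)', re.IGNORECASE)


def sniff_charset(pattern, markup):
    """Return the charset label found by the pattern, or None."""
    match = pattern.search(markup)
    return match.group(1) if match else None


PREFIX = ('<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0 Strict//EN">'
          '<TITLE/test document/<META\n'
          'http-equiv=Content-Type content="text/html;charset=ISO-8859-1"')
unclosed_meta = PREFIX + "<P>"   # META start tag left unclosed
closed_meta = PREFIX + "><P>"    # META start tag closed normally

print(sniff_charset(STRICT_RE, unclosed_meta))
print(sniff_charset(STRICT_RE, closed_meta))
print(sniff_charset(TOLERANT_RE, unclosed_meta))
```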

Thanks for the report though. I'm going to look into how/if we can improve
on the charset detection and file it as a "known issue". A future version
of the validator will probably warn about the use of SHORTTAGS though, due
to its many problems.

Thanks for the feedback, and please do let us know if you find any more
such issues!

-- 
>For all I know they probably have a standard for
>which direction to put the thread on a bolt.

That would be ISO 261:1973.         -- John Cowan


Anton Gombkötö | 3 Jul 10:20 2002

iso_8859-1 still not recognized as valid.


Maybe I didn't express clearly enough what's wrong.

So I will let this
http://validator.w3.org/check?uri=http%3A%2F%2Fwww.avenum.com%2Fnet.data%2Fthereisnosuchpage.html%2Fstart&charset=%28detect+automatically%29&doctype=Inline
speak for itself.
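
For anyone who wants to build such a check link for their own page, here is a minimal Python sketch. The parameter names (uri, charset, doctype) and their values are copied from the URL above; the function name is hypothetical, and other parameters the validator may accept are not covered:

```python
from urllib.parse import urlencode

VALIDATOR = "http://validator.w3.org/check"


def validator_url(page_url, charset="(detect automatically)", doctype="Inline"):
    """Build a validator link using the query format seen in the URL above."""
    query = urlencode({"uri": page_url, "charset": charset, "doctype": doctype})
    return VALIDATOR + "?" + query


print(validator_url("http://www.avenum.com/net.data/thereisnosuchpage.html/start"))
```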

Please let the validator know that iso_8859-1 is valid.

Thanks in advance!

best regards /  Mit freundlichen Grüssen

Anton Gombkötö
Avenum Technologie GmbH

http://www.avenum.com 

Guillaume Cocatre-Zilgien | 3 Jul 12:30 2002

Re: iso_8859-1 still not recognized as valid.


Anton Gombkötö wrote:

> Please let the validator know that iso_8859-1 is valid.

Shouldn't it be "iso-8859-1" ?
--------------------^
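
As a side note, some software treats the two spellings as the same encoding: the IANA character set registry lists ISO_8859-1 as an alias, with ISO-8859-1 as the preferred MIME name. A minimal Python sketch that compares two labels through Python's own codec alias table; this reflects Python's aliases only, not how the validator resolves charset names, and the function name is hypothetical:

```python
import codecs


def same_encoding(label_a, label_b):
    """True if Python's codec registry resolves both labels to one codec."""
    try:
        return codecs.lookup(label_a).name == codecs.lookup(label_b).name
    except LookupError:
        # At least one label is unknown to Python.
        return False


print(same_encoding("iso_8859-1", "iso-8859-1"))
```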

-- 
Guillaume Cocatre-Zilgien
Web Developer
 <at> listen_to Underworld - Beaucoup Fish

