issues | 1 Jun 04:20 2004

[issues] Created: (FOR-155) forrest does not validate documents

Message:

  A new issue has been created in JIRA.

---------------------------------------------------------------------
View the issue:

  http://issues.cocoondev.org/jira//secure/ViewIssue.jspa?key=FOR-155

Here is an overview of the issue:
---------------------------------------------------------------------
        Key: FOR-155
    Summary: forrest does not validate documents
       Type: Bug

     Status: Unassigned
   Priority: Major

    Project: Forrest
  Component: XML grammars & validation
   Versions:
             HEAD

   Assignee: 
   Reporter: Dave Brondsema

    Created: Tue, 1 Jun 2004 4:20 AM
    Updated: Tue, 1 Jun 2004 4:20 AM

Description:

issues | 1 Jun 04:41 2004

[issues] Created: (FOR-156) broken links building forrest site

Message:

  A new issue has been created in JIRA.

---------------------------------------------------------------------
View the issue:

  http://issues.cocoondev.org/jira//secure/ViewIssue.jspa?key=FOR-156

Here is an overview of the issue:
---------------------------------------------------------------------
        Key: FOR-156
    Summary: broken links building forrest site
       Type: Bug

     Status: Assigned
   Priority: Major

    Project: Forrest
  Component: shbat distribution - Forrest CLI
   Versions:
             HEAD

   Assignee: Steven Noels
   Reporter: Dave Brondsema

    Created: Tue, 1 Jun 2004 4:41 AM
    Updated: Tue, 1 Jun 2004 4:41 AM

Description:

Thorsten Scherler | 1 Jun 09:56 2004

Re: [Skins] Donation of the "lenya-site" skin

Dave Brondsema wrote:

>Quoting "Scherler, Thorsten" <thorsten <at> apache.org>:
>  
>
>>[...] I can commit the "lenya-site" skin. It is not finished though 
>>but I am working on that.
>>[...]
>>
>Since it is not finished and we are going to release 0.6 before too long, I
>think it would be best to wait.  
>

Yeah, I thought so ;-) and can perfectly understand that!

>But it does look very nice and we definitely
>would like to include it eventually.  
>

:) Thanks. The problem is the menu ATM :(. ...but I will have a closer 
look at http://xml.apache.org/forrest/linking.html. I reckon I will find 
some "update to 0.6" issues ;-)

[...]

>A better option might be to package it and host it yourself and then we can add
>it to http://xml.apache.org/forrest/skins/skins.xml (the default skins
>descriptor) 
>


David Crossley | 1 Jun 13:08 2004

Re: [Skins] Donation of the "lenya-site" skin

Thorsten Scherler wrote:
> Dave Brondsema wrote:
> >A better option might be to package it and host it yourself and then we can add
> >it to http://xml.apache.org/forrest/skins/skins.xml (the default skins
> >descriptor) 
> 
> Yep, let's do that then. I saw 
> http://xml.apache.org/forrest/skin-package.html [1]. So to get it straight:
> I created the lenya-site skin by copying the krysalis-site skin and adding the 
> Lenya-specific stuff there. So now my skin is in 
> src/core/context/skins/lenya-site.
> 
> [1] states:
> 
> 1 - forrest package-skin
> The skin package will be made in the skin dir, next to the custom skin.
> 
> So I have to run "forrest package-skin" within src/core/context/skins/, 
> right?

Run it at the top-level project directory.

> That will create my lenya-site.zip in src/core/context/skins/, right?
> After that I can upload it to our webserver as *.zip, right?

That is correct.

> >so people can still download it and use it easily.
>
> That is the aim of publishing our skin. Right now I am the only one that 

Brian S. Hayes | 1 Jun 17:51 2004

RE: [Proposal] Forrest Signature in HTML source

> >
> > I just updated the lenya-site. Have a look at the source code of the
> > page http://cocoon.apache.org/lenya/. You will find in the head:
> > <meta content="Apache Forrest" name="Generator">
> 
> We should put the version in there too.
> 
> I think it's a good idea to put it at the skin level.  We can have a <meta>
> tag that has the skin name.  And more importantly, some skins might not want
> to have any <meta> tags, and we shouldn't try to force them to.

I think we should try to set up this meta information so that it will always
appear unless it is overridden in the skin configuration or by a change in
the skin.  Perhaps in the skinconf we could have elements like
disable-generator meta information (perhaps some information cannot be
disabled this way, like the Forrest version number).

I think most people will not care that this meta-information is embedded in
the web pages.  Thus, an alternative way to disable this information is to
have an insert-generator-meta-data xslt template that skin authors/hackers
could easily override (by defining their own) in their own site2html.xsl
skin.
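
The "always appears unless disabled" behaviour proposed above could be
sketched roughly as follows. This is only an illustration in Python, not
Forrest code: the function name, the disable_generator flag, and the
"Forrest-skin-name" meta name are assumptions invented for this sketch, and
a real skin would do this in its XSLT.

```python
from html import escape


def generator_meta(forrest_version, skin_name=None, disable_generator=False):
    """Return the <meta> lines a skin would place in <head>.

    All names here are hypothetical; they only model the proposal that the
    generator information is on by default and can be switched off.
    """
    tags = []
    if disable_generator:
        # The skin configuration opted out, so emit nothing.
        return tags
    # Product name plus version, as suggested in the thread.
    content = escape(f"Apache Forrest {forrest_version}", quote=True)
    tags.append(f'<meta name="Generator" content="{content}">')
    if skin_name:
        # Optional second tag carrying the skin name (assumed meta name).
        skin = escape(skin_name, quote=True)
        tags.append(f'<meta name="Forrest-skin-name" content="{skin}">')
    return tags


if __name__ == "__main__":
    for line in generator_meta("0.6", skin_name="lenya-site"):
        print(line)
```

With the default arguments the Generator tag is always produced; passing
disable_generator=True returns an empty list, which corresponds to a skin
that does not want any such tags.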

Also, we/I should research putting Forrest generation meta-data into the
PDF documents.

/Brian


Brian S. Hayes | 1 Jun 17:51 2004

RE: [DRAFT] request Apache Forrest TLP setup


Re: Wiki
My vote is for a wiki that requires a sign-in: valid email address and
password.  I've done it that way in the past.

/Brian

Jan.Materne | 1 Jun 18:17 2004

RE: [DRAFT] request Apache Forrest TLP setup

+1 (non-binding vote)

Would end (ok, decrease) the amount of spam-changes.


Jan


> -----Original Message-----
> From: Brian S. Hayes [mailto:brian.hayes2 <at> comcast.net]
> Sent: Tuesday, June 01, 2004 5:51 PM
> To: forrest-dev <at> xml.apache.org
> Subject: RE: [DRAFT] request Apache Forrest TLP setup
>
>
>
>
> Re: Wiki
> My vote is for a wiki that requires a sign-in: valid email address and
> password.  I've done it that way in the past.
>
> /Brian
>
>

Stefano Mazzocchi | 1 Jun 19:11 2004

Re: [DRAFT] request Apache Forrest TLP setup

Antonio Gallardo wrote:

> Stefano Mazzocchi said:
> 
>>Antonio Gallardo wrote:
>>
>>
>>>I am aware there can be more pages generated but at least 6 links for free
>>>on the net is good deal, right? ;-)
>>
>>Antonio, having a link on the net is useless in google-sense if there is
>>no <a href=""> tag around it.
> 
> Not sure. I think they index even outside <a href="">. As a sample see:
> 
> http://www.google.com.ni/search?hl=en&ie=UTF-8&q=%22www.emmss.com%22+cocoon+wiki&btnG=Search
> 
> The third answer is a page with no <a href="">:
> 
> http://www.svg.org/wiki/ow.asp?p=OtherImplementations&a=diff
> 
>>Don't get me wrong, I'm the first to think that the web is becoming too
>>big of a place for open wikis to work (just like to open email to work),
>>but I think you are just irrationally overreacting if you think they get
>>that much benefit out of it.
> 
> I cannot get you wrong. ;-)
> 
> It doesn't take me too much time. I'm just curious about that, and I wonder why
> they are doing this. I don't read Chinese. I don't know what they write.
> Why are they using robots to do that? If it doesn't produce more page results,
> then what is the deal behind that?
> 
> Why I think they are using robots:
> 
> I wrote a mail 15 days ago [1]. At that time they had ca. 19,300. Now they
> have 69,000 results on Google [2]. I don't believe people can do that by
> hand.  Doing simple maths (and supposing people use 6 hours daily for
> sleeping and eating), they would need to put a link every 20 secs. It must be a
> robot!
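
The "every 20 secs" estimate quoted above can be checked with a few lines
of Python. This is a sketch under the assumption (not stated in the
original mail) that each new Google result corresponds to one link placed,
and that 6 hours of rest leave 18 working hours per day.

```python
# Figures taken from the quoted mail.
results_before = 19_300
results_after = 69_000
days = 15
active_hours_per_day = 24 - 6  # 6 hours daily for sleeping and eating

# Assumption: one new search result equals one link that was placed.
new_links = results_after - results_before
active_seconds = days * active_hours_per_day * 3600
seconds_per_link = active_seconds / new_links

print(f"{new_links} new results, one every {seconds_per_link:.1f} s")
# prints "49700 new results, one every 19.6 s"
```

The result of roughly 19.6 seconds per link agrees with the figure in the
mail.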

Of course it's a robot. My point is: the only useful attack on PageRank 
is the "linking attack", and using robots to hack wikis is a great 
strategy to improve the ranking of your site.

Now, in order to do this, you have to have the page including an <a 
href="">...</a> tag pointing to it, otherwise Google doesn't use that 
information for its graph-analysis PageRank system.

So, this means that their web site URL gets tokenized and indexed (as 
you demonstrate above), but that will *NOT* help increase the PageRank 
value of the pages hosted on that URL.

Again, my point is: it doesn't matter if they have their URL tokenized a 
million times, what's important is that the PageRank value of that URL 
remains rubbish. And for this we are safe if and only if the wiki diff 
emails don't get represented with <a href=""> tags around URLs [but here 
I'm not sure if we do that!]
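
The distinction between a URL that is merely tokenized as text and one
that becomes an edge in the link graph can be illustrated with a small
Python sketch. It is a simplified model of how a crawler might collect
outgoing links, not a description of how Google actually works; the two
sample pages and the spam.example host are made up for illustration.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect href values of <a> tags, i.e. the page's outgoing edges."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def outgoing_links(html):
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# A diff page that shows the URL only as plain text (hypothetical markup).
diff_page = "<p>Added: http://spam.example/ to the page</p>"
# A wiki page where the URL is wrapped in an anchor (hypothetical markup).
wiki_page = '<p>See <a href="http://spam.example/">this site</a></p>'

print(outgoing_links(diff_page))  # []
print(outgoing_links(wiki_page))  # ['http://spam.example/']
```

In this model the plain-text URL can still be indexed as words on the
page, but only the anchored one contributes a link that a graph-based
ranking could count.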

-- 
Stefano.

Stefano Mazzocchi | 1 Jun 19:14 2004

Re: wiki vandalism (Was: [DRAFT] request Apache Forrest TLP setup)

David Crossley wrote:

> Stefano Mazzocchi wrote: 
> 
>>Antonio Gallardo wrote:
>>
>>
>>>I am aware there can be more pages generated but at least 6 links for free
>>>on the net is good deal, right? ;-)
>>
>>Antonio, having a link on the net is useless in google-sense if there is 
>>no <a href=""> tag around it.
> 
> 
> ??? ... not sure what you mean. They have been vandalising the
> Cocoon wiki pages by changing our links to point to their websites.
> Google loves our web pages because they have such good content.
> Apache wiki front pages have many incoming references from other
> pages - Google loves it more. Our Wiki pages have links to each
> page's history - Google gets excited. Google users discover our
> pages and accidentally follow the links. The vandals win.

I was talking about PageRank attacks.

-- 
Stefano.

Dave Brondsema | 1 Jun 19:30 2004

Re: [DRAFT] request Apache Forrest TLP setup

On Tue, 1 Jun 2004, Stefano Mazzocchi wrote:

> Antonio Gallardo wrote:
>
> > Stefano Mazzocchi said:
> >
> >>Antonio Gallardo wrote:
> >>
> >>
> >>>I am aware there can be more pages generated but at least 6 links for free
> >>>on the net is good deal, right? ;-)
> >>
> >>Antonio, having a link on the net is useless in google-sense if there is
> >>no <a href=""> tag around it.
> >
> > Not sure. I think they index even outside <a href="">. As a sample see:
> >
> > http://www.google.com.ni/search?hl=en&ie=UTF-8&q=%22www.emmss.com%22+cocoon+wiki&btnG=Search
> >
> > The third answer is a page with no <a href="">:
> >
> > http://www.svg.org/wiki/ow.asp?p=OtherImplementations&a=diff
> >
> >>Don't get me wrong, I'm the first to think that the web is becoming too
> >>big of a place for open wikis to work (just like to open email to work),
> >>but I think you are just irrationally overreacting if you think they get
> >>that much benefit out of it.
> >
> > I cannot get you wrong. ;-)
> >
> > It doesn't take me too much time. I'm just curious about that, and I wonder why
> > they are doing this. I don't read Chinese. I don't know what they write.
> > Why are they using robots to do that? If it doesn't produce more page results,
> > then what is the deal behind that?
> >
> > Why I think they are using robots:
> >
> > I wrote a mail 15 days ago [1]. At that time they had ca. 19,300. Now they
> > have 69,000 results on Google [2]. I don't believe people can do that by
> > hand.  Doing simple maths (and supposing people use 6 hours daily for
> > sleeping and eating), they would need to put a link every 20 secs. It must be a
> > robot!
>
> Of course it's a robot. My point is: the only useful attack on PageRank
> is the "linking attack", and using robots to hack wikis is a great
> strategy to improve the ranking of your site.
>
> Now, in order to do this, you have to have the page including an <a
> href="">...</a> tag pointing to it, otherwise Google doesn't use that
> information for its graph-analysis PageRank system.
>
> So, this means that their web site URL gets tokenized and indexed (as
> you demonstrate above), but that will *NOT* help increase the PageRank
> value of the pages hosted on that URL.
>
> Again, my point is: it doesn't matter if they have their URL tokenized a
> million times, what's important is that the PageRank value of that URL
> remains rubbish. And for this we are safe if and only if the wiki diff
> emails don't get represented with <a href=""> tags around URLs [but here
> I'm not sure if we do that!]
>

But even if the diff emails don't help them, the wiki pages themselves and
the history pages still have all the links, with <a> tags around the URLs.
So the vandal still benefits.  And of course our wiki gets defaced and
looks bad until we take the time to fix it.

-- 
Dave Brondsema : dave <at> brondsema.net
http://www.brondsema.net : personal
http://www.splike.com : programming
http://csx.calvin.edu : student org

