Lars Aronsson | 1 Dec 2011 04:08

Re: [cultural-partners] [Wikisource-l] ABBYY Finereader 11 on Toolserver: do we like it?

On 11/30/2011 09:55 PM, Eugene Zelenko wrote:
> ABBYY has own online OCR service http://finereader.abbyyonline.com

This is very interesting, OCR as a cloud service. I didn't know they
were doing this. They charge EUR 7 per 200 pages, or US$ 0.05
per page, which I guess can be (almost) reasonable for the
Wikimedia Foundation to pay. I sometimes feel bad because I have
OCRed so many tens of thousands of pages with a single EUR 129
license of Finereader. Here, EUR 129 would buy us about 3700 pages.

All languages of Wikisource together are proofreading slightly
less than 900 pages/day, for which OCR would cost EUR 32/day
or US$ 43/day. With good OCR, proofreading is more fun, and
these numbers may increase. But then again, we wouldn't need
the service for all pages, as some books already have OCR.
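
The arithmetic above can be checked with a minimal sketch. It uses only the figures quoted in this message (EUR 7 per 200 pages, a EUR 129 license, about 900 pages/day); the function names are illustrative, and the US$ figures are left out because the message does not state an exchange rate.

```python
# Figures taken from the message: EUR 7 buys 200 pages of online OCR.
PACKAGE_PRICE_EUR = 7.0
PACKAGE_PAGES = 200


def pages_for_budget(budget_eur):
    """Whole pages a budget buys at the per-page package rate."""
    return int(budget_eur * PACKAGE_PAGES / PACKAGE_PRICE_EUR)


def daily_cost_eur(pages_per_day):
    """Cost in EUR of running every proofread page through the service."""
    return pages_per_day * PACKAGE_PRICE_EUR / PACKAGE_PAGES


print(pages_for_budget(129))   # pages bought for the price of one license
print(daily_cost_eur(900))     # EUR per day at about 900 pages/day
```

This gives 3685 pages for EUR 129 and EUR 31.50 per day, which the message rounds to 3700 pages and EUR 32/day.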

The most interesting feature of a cloud-based OCR service would be
the ability to accumulate improvements in font training (?) and
dictionaries from a large number of users over time. With
Wikisource, they could of course get direct access to the page
after proofreading.

So, is the service any good? They even promise to do Fraktur
(blackletter). Does it work well?

-- 
   Lars Aronsson (lars@...)
   Aronsson Datateknik - http://aronsson.se
Lars Aronsson | 1 Dec 2011 04:50

Re: [Wikisource-l] [cultural-partners] ABBYY Finereader 11 on Toolserver: do we like it?

On 12/01/2011 04:08 AM, Lars Aronsson wrote:
> On 11/30/2011 09:55 PM, Eugene Zelenko wrote:
>> ABBYY has own online OCR service http://finereader.abbyyonline.com
>
> So, is the service any good? They even promise to do Fraktur
> (blackletter). Does it work well?

After having tried it, I'm less enthusiastic. The web user interface
only lets you upload images and download OCR text. There is no way
to adjust segments / zones or to train the OCR engine. Only
40 languages are supported, and there is no way to specify
special dictionaries for old spelling. Blackletter is only supported
for German and Latvian. The upload button is based on Flash,
and didn't quite work in Firefox on Linux, but it worked in Opera.

It worked OK for a modern (not blackletter) Norwegian text from
the 1930s. An advantage is that you can start as low as 50 pages
for EUR 3.50. Double the price (EUR 7) and you get 200 pages. For
advanced jobs, I still recommend buying the Professional edition,
but some users might find the online version useful.

-- 
   Lars Aronsson (lars@...)
   Aronsson Datateknik - http://aronsson.se
Howard Cheng | 6 Dec 2011 20:17

Rotated images

I've noticed recently a number of images that require rotation, usually photos that are in landscape orientation when they should be in portrait. What's weird is that in the articles, MediaWiki already "knows" the correct orientation and in some cases the file history even shows multiple uploads. It's hard to explain in words so here's an example:

http://commons.wikimedia.org/wiki/File:Castelo_Sao_Jorge_Lisboa_3.JPG

This image has been in the article since September 2006 so I can't imagine that nobody has noticed the wrong orientation for 5 years. Does anyone know what happened here? Is this related to the huge backlog for RotateBot?

Thanks.

_______________________________________________
Commons-l mailing list
Commons-l@...
https://lists.wikimedia.org/mailman/listinfo/commons-l
Maarten Dammers | 6 Dec 2011 20:30

Re: Rotated images

Hi Howard,

I hope https://commons.wikimedia.org/wiki/Commons:Rotation will answer your questions.

Maarten

On 6-12-2011 20:17, Howard Cheng wrote:
> I've noticed recently a number of images that require rotation, usually
> photos that are in landscape orientation when they should be in
> portrait. What's weird is that in the articles, MediaWiki already
> "knows" the correct orientation and in some cases the file history even
> shows multiple uploads. It's hard to explain in words so here's an example:
>
> http://commons.wikimedia.org/wiki/File:Castelo_Sao_Jorge_Lisboa_3.JPG
>
> This image has been in the article since September 2006 so I can't
> imagine that nobody has noticed the wrong orientation for 5 years. Does
> anyone know what happened here? Is this related to the huge backlog for
> RotateBot?
>
> Thanks.

Neil Kandalgaonkar | 6 Dec 2011 23:59

Re: Rotated images

Going back to the old way (not auto-rotating) is probably not 
preferable. But maybe we should grandfather in all images uploaded 
before MW 1.18.

And, we really need a rotation button so users can fix this.

On 12/6/11 11:17 AM, Howard Cheng wrote:
> I've noticed recently a number of images that require rotation, usually
> photos that are in landscape orientation when they should be in
> portrait. What's weird is that in the articles, MediaWiki already
> "knows" the correct orientation and in some cases the file history even
> shows multiple uploads. It's hard to explain in words so here's an example:
>
> http://commons.wikimedia.org/wiki/File:Castelo_Sao_Jorge_Lisboa_3.JPG
>
> This image has been in the article since September 2006 so I can't
> imagine that nobody has noticed the wrong orientation for 5 years. Does
> anyone know what happened here? Is this related to the huge backlog for
> RotateBot?
>
> Thanks.

-- 
Neil Kandalgaonkar  |) <neilk@...>
Jean-Frédéric | 7 Dec 2011 00:37

Re: Rotated images

2011/12/6 Neil Kandalgaonkar <neilk@...>
> And, we really need a rotation button so users can fix this.
Béria Lima | 7 Dec 2011 09:53

Re: Rotated images

This one is used to make a request so that RotateBot can fix the image. What he said (at least as I understood it) was that we need a button so that users can do it themselves, instead of asking the bot to do it.
_____
Béria Lima
(351) 925 171 484

Imagine a world in which every person is given free access to the sum of all human knowledge. Help us build that dream.


2011/12/6 Jean-Frédéric <jeanfrederic.wiki@...>
> 2011/12/6 Neil Kandalgaonkar <neilk@...>
>> And, we really need a rotation button so users can fix this.

David Gerard | 8 Dec 2011 17:06

Picture rotation: what on earth?

http://commons.wikimedia.org/wiki/File:David_Gerard,_Heathrow_Terminal_5,_20110801_P1020447.jpg

Who thought this was a good idea to automate?

And how does one get this fixed? I can't even revert to a good copy.

Has a list of these been made and human-checked?

- d.
Nathan | 8 Dec 2011 17:27

Re: Picture rotation: what on earth?

Just to understand - on the last thread about this, it sounded like
the October 5 update forces MediaWiki to determine rotation based on
EXIF data. If the EXIF data is wrong or missing, the rotation may be
incorrect. As a result, a bot (RotateBot) with a gigantic backlog is
slowly fixing incorrect rotations?

On Thu, Dec 8, 2011 at 11:06 AM, David Gerard <dgerard@...> wrote:
> http://commons.wikimedia.org/wiki/File:David_Gerard,_Heathrow_Terminal_5,_20110801_P1020447.jpg
>
> Who thought this was a good idea to automate?
>
> And how does one get this fixed? I can't even revert to a good copy.
>
> Has a list of these been made and human-checked?
>
>
> - d.
Tobias Oelgarte | 8 Dec 2011 18:02

Re: Picture rotation: what on earth?

Yes, you are right. It only affects images that have EXIF data with
wrong rotation values. Therefore all images uploaded with wrong EXIF
data have to be tagged with a template so that the bot can go through
the pages and correct the EXIF tag to have the right value.
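
A minimal sketch of how an EXIF Orientation value turns into a display rotation, and why a wrong tag makes an upright photo appear sideways. This is not MediaWiki's or RotateBot's code; the function names and the sample dimensions are hypothetical, and only the four non-mirrored Orientation values (1, 3, 6, 8) are covered.

```python
# Clockwise rotation (degrees) a viewer applies for each EXIF Orientation
# value. Only the four non-mirrored values are covered in this sketch.
CLOCKWISE_DEGREES = {1: 0, 3: 180, 6: 90, 8: 270}


def display_rotation(orientation):
    """Degrees to rotate the stored pixels clockwise before display.

    Returns 0 when the tag is missing, and None for values this sketch
    does not handle (mirrored orientations or invalid numbers).
    """
    if orientation is None:
        return 0
    return CLOCKWISE_DEGREES.get(orientation)


def displayed_size(width, height, orientation):
    """Width and height as shown once the rotation has been applied."""
    degrees = display_rotation(orientation)
    if degrees in (90, 270):
        return (height, width)
    return (width, height)


# Hypothetical portrait photo whose pixels are already upright (3000x4000)
# but whose tag still says 6 ("rotate 90 degrees clockwise").
print(displayed_size(3000, 4000, 6))   # shown as landscape
print(displayed_size(3000, 4000, 1))   # shown as stored
```

With the stale tag the photo is displayed as 4000x3000, that is, in landscape instead of portrait. Resetting the tag to 1 (or rotating the pixels to match the tag) restores the intended view.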

I count myself among the lucky ones who never uploaded images with
EXIF tags, since I always found them useless. The stored data is not
sufficient for real tasks, and it can easily be faked. Now the rotation
is used and it causes more problems than benefits. ;-)

nya~

On 08.12.2011 17:27, Nathan wrote:
> Just to understand - on the last thread about this, it sounded like
> the October 5 update forces MediaWiki to determine rotation based on
> EXIF data. If the EXIF data is wrong or missing, the rotation may be
> incorrect. As a result, a bot (RotateBot) with a gigantic backlog is
> slowly fixing incorrect rotations?
>
> On Thu, Dec 8, 2011 at 11:06 AM, David Gerard <dgerard@...> wrote:
>> http://commons.wikimedia.org/wiki/File:David_Gerard,_Heathrow_Terminal_5,_20110801_P1020447.jpg
>>
>> Who thought this was a good idea to automate?
>>
>> And how does one get this fixed? I can't even revert to a good copy.
>>
>> Has a list of these been made and human-checked?
>>
>>
>> - d.