Christian Chiarcos | 1 Sep 2011 09:29

Re: Farsi corpora

You might also take a look at the Farsi resources in Multext-East:
http://nl.ijs.si/ME/

Christian

2011/8/31 Yorick Wilks <Y.Wilks <at> dcs.shef.ac.uk>:
> Thanks to everyone for very useful pointers.
> YW
>
>
> On 31 Aug 2011, at 16:20, Jon Dehdari wrote:
>
>> Hello,
>> There are a couple different public-domain/Free news corpora here:
>> http://ling.ohio-state.edu/~jonsafari/corpora
>>
>> The Hamshahri newspaper corpus is available here:
>> http://ece.ut.ac.ir/dbrg/Hamshahri
>>
>> The POS-tagged Bijankhan newspaper corpus is available here:
>> http://ece.ut.ac.ir/dbrg/Bijankhan
>>
>> And more information here:
>> http://www.iranianlinguistics.org/wiki/index.php?title=Persian#Corpora
>>
>>
>> Cheers,
>> -Jon Dehdari
>>
>>

Heliana Mello | 1 Sep 2011 14:28

SPEECH AND CORPORA: GSCP 2012 - Last call for papers

LAST CALL FOR PAPERS   (no further deadline extension)
September 12th, 2011

GSCP 2012 - SPEECH AND CORPORA
International Conference in memory of Claire Blanche-Benveniste

BELO HORIZONTE (BRAZIL),
FEBRUARY 29 – MARCH 2, 2012
Faculdade de Letras, Universidade Federal de Minas Gerais

Conference site: http://www.letras.ufmg.br/gscp2012

GSCP site: http://www.GSCP.it

Contact: infoGSCP2012 <at> gmail.com

Abstracts submission deadline: September 12th, 2011

Themes
Spoken Corpora
- Spoken corpora: compilation and data collection methodologies
- Spoken corpora annotation
- Statistical approaches to spoken corpora
- Speech and parsing
- Multilingual speech corpora
- Spoken corpora and representativeness
- Spoken corpora and variation
- Spoken corpora validation techniques
- Spoken corpora and speech pathologies


Valérie Mapelli | 1 Sep 2011 15:36

Re: Looking for an annotated corpus

Dear Denis,

In the ELRA Catalogue of Language Resources, you will find in particular 
the ARCADE II Evaluation Package which contains named entities for 
French and Arabic.
Please check here for further information: 
http://catalog.elra.info/product_info.php?products_id=992

Best regards,

Valérie Mapelli

Le 08/08/2011 10:57, denis a écrit :
> Dear Corpora-members,
>
> I'm looking for a corpus providing named-entity annotations
> (essentially person, company and organization tags) to perform
> evaluation on a named entity extractor.
> Could you help me choose which one is best to use?
>
> many thanks in advance
>
> Denis Lebailly
>
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora <at> uib.no
> http://mailman.uib.no/listinfo/corpora
>

Thomas Schmidt | 1 Sep 2011 14:32

GSCL conference 2011: Call for participation

The programs for the main conference and two pre-conference workshops of the

GSCL conference 2011 "MULTILINGUAL RESOURCES AND MULTILINGUAL APPLICATIONS"

are now online at http://www.corpora.uni-hamburg.de/gscl2011/en/?Program

Registration for the conference will remain open until September 15
at http://www.corpora.uni-hamburg.de/gscl2011/en/?Registration_%26nbsp%3B

Invited Speakers are:

Hans Uszkoreit (Workshop "Contrastive Linguistics - Translation
Studies - Machine Translation: what can we learn from each other?")
Michael Sperberg-McQueen (Workshop "Language Technology for a
Multilingual Europe")
Ralf Steinberger (main conference)
Hans C. Boas (main conference)
Felix Sasaki (main conference)

With best regards,

the LOC

--

-- 
Thomas Schmidt
Hamburger Zentrum für Sprachkorpora
Max Brauer-Allee 60
22765 Hamburg
Tel.: (040) 42838-6425
Fax.: (040) 42838-6116

Manuela Speranza | 1 Sep 2011 16:20

EVALITA 2011 Second Call for Participation

<Apologies if you receive multiple copies>
<Please, distribute it among potentially interested colleagues>

********************************************************************

                          EVALITA 2011

      Evaluation of NLP and Speech Tools for Italian

http://www.evalita.it/2011

********************************************************************


EVALITA 2011 - Second Call for Participation
September 2011

We invite participation, both from academic institutions and
industrial organizations, in eleven tasks, all for Italian.

We are pleased to announce that both the training data and the
detailed guidelines for all tasks are available.

Text tasks
- Parsing
- Domain Adaptation
- Named Entity Recognition on Transcribed Broadcast News
- Cross-document Coreference Resolution of Named Person Entities
- Anaphora Resolution
- Super Sense Tagging

Masood Ghayoomi | 1 Sep 2011 16:27

Re: Farsi corpora (Yorick Wilks)

You may check this linguistic database as well:
http://pldb.ihcs.ac.ir/

Some of the texts are lemmatized, POS-tagged, and assigned phonetic labels.
You might find a sample here:
http://pldb.ihcs.ac.ir/newsearch2/Asar.aspx

Cheers,
Masood


On Wed, Aug 31, 2011 at 03:54:31PM -0400, Yorick Wilks wrote:
>
> Is anyone aware of easily obtained Farsi corpora---domain not important?
> I'd be grateful for pointers.
> Yorick Wilks

Khalid CHOUKRI | 1 Sep 2011 18:58

Re: Farsi corpora

Hi Yorick

Some Farsi resources are available from the ELRA catalogue (including an English-Persian parallel corpus).
Just search for "Farsi" at http://catalog.elra.info/search.php

best regards
Khalid


Yorick Wilks wrote, On 31/08/2011 22:23:
> Thanks to everyone for very useful pointers.
> YW
>
> On 31 Aug 2011, at 16:20, Jon Dehdari wrote:
>> Hello,
>> There are a couple different public-domain/Free news corpora here:
>> http://ling.ohio-state.edu/~jonsafari/corpora
>>
>> The Hamshahri newspaper corpus is available here:
>> http://ece.ut.ac.ir/dbrg/Hamshahri
>>
>> The POS-tagged Bijankhan newspaper corpus is available here:
>> http://ece.ut.ac.ir/dbrg/Bijankhan
>>
>> And more information here:
>> http://www.iranianlinguistics.org/wiki/index.php?title=Persian#Corpora
>>
>> Cheers,
>> -Jon Dehdari
>>
>> On Wed, Aug 31, 2011 at 03:54:31PM -0400, Yorick Wilks wrote:
>>> Is anyone aware of easily obtained Farsi corpora---domain not important?
>>> I'd be grateful for pointers.
>>> Yorick Wilks

Albretch Mueller | 1 Sep 2011 23:44

Re: Hacked email accounts

On 8/29/11, Leon Derczynski <leon <at> dcs.shef.ac.uk> wrote:
> Hanlon's razor certainly has its applications.
~
// __ http://en.wikipedia.org/wiki/Hanlon%27s_razor
~
 "Never attribute to malice that which can be adequately explained by
stupidity (, but don't rule out malice)".
~
 Yet, regarding governmental bureaucracies (like people with mental
problems (which nowadays can actually be physically detected)),
especially those under a blanket of secrecy, there is no clear and
conscious line dividing stupidity from malice (craze from reality).
~
 Something that I find amazing (and to some extent even amusing) is
how, even under so-called "free" societies, some bureaucracies
(politicians, royalty, governments, ...) manipulating the rest of us
(apparently) truly believe that they can somehow substitute/patch
-making sense- with bureaucracies and/or syntactic devices. A really
good/hilarious example of that kind of governmental bureaucratic
cr@p can be found in "Number: The Language of Science" by Tobias
Dantzig, page 23, in which Dantzig recounts the sheer stupidity of the
British government in trying to maintain the use of notched sticks
(resisting more modern and far simpler enumeration techniques).
~
 lbrtchx


John F. Sowa | 2 Sep 2011 16:56

Re: Hacked email accounts

On 9/1/2011 5:44 PM, Albretch Mueller wrote:
> Something that I find amazing (and to some extent even amusing) is
> how under even so-called "free" societies some bureaucracies
> (politicians, royalty, governments, ...) ...

I'll hereby coin a new "razor":

    "Never attribute more incompetence to bureaucracies
    than to the average voter."

And its converse:

    "Never attribute more wisdom to the average voter
    than to the average bureaucrat."

These are corollaries of the principles enunciated by P. T. Barnum

    "Nobody ever lost money by underestimating
    the intelligence of the American people."

and by General William Tecumseh Sherman

    "Vox populi, vox humbug."

John


Albretch Mueller | 3 Sep 2011 00:39

Re: Hacked email accounts

// __ http://en.wikinews.org/wiki/Wikileaks_crashes_under_cyber_attack
~
 Very technical indeed!!! and moral response from Wikileaks:
~
 "Dear governments, if you don't want your filth exposed, then stop
acting like pigs. Simple".
~
 truth and peace and love
 lbrtchx
