Michael Piotrowski | 1 Feb 2009 19:43

Call for Papers: Workshop on Systems and Frameworks for Computational Morphology (sfcm 2009)

Apologies if you receive multiple copies of this message.

Please distribute it to colleagues.

********************************************************************
			   Call for Papers

Workshop on Systems and Frameworks for Computational Morphology
			     (sfcm 2009)

			 <http://sfcm2009.org>

		   Workshop date: September 4, 2009

	     Location: University of Zurich, Switzerland

		  Submission deadline: March 1, 2009

*********************************************************************

From the point of view of computational linguistics, morphological
resources are the basis for all higher-level applications. This is
especially true for languages with a rich morphology like German. A
morphology component should thus be capable of analyzing single
wordforms as well as whole corpora. For many practical applications,
not only morphological analysis, but also generation is required,
i.e., the production of surfaces corresponding to specific categories.

Apart from uses in computational linguistics, there are practical
applications that can benefit from morphological analysis and/or

Damir Ćavar | 2 Feb 2009 09:58

Call for Papers: 4th Annual Meeting of the Slavic Linguistics Society

Please distribute it to colleagues.

********************************************************************
			   Call for Papers

       Fourth Annual Meeting of the Slavic Linguistics Society
			     (SLS 2009)

                   <http://ling.unizd.hr/~sls2009/>

		   Conference date: September 3-6, 2009

	     Location: University of Zadar, Croatia

		  Submission deadline: June 1, 2009

*********************************************************************

We invite you to submit an abstract to the fourth meeting of the
Slavic Linguistics Society, to be held on the campus of the University
of Zadar, in beautiful Zadar, Croatia, 3rd-6th September 2009.

The purpose of SLS is to create a community of students and scholars
interested in Slavic linguistics, that is, the systematic and scholarly
study of the Slavic languages. The Society aspires to be as open and
inclusive as possible; no school, framework, approach, or theory is
presupposed, nor is there any restriction in terms of geography,
academic affiliation or status.

Papers dealing with any aspect of Slavic linguistics and within any

Bas Aarts | 2 Feb 2009 11:20

Extended deadline: Third International Conference on the Linguistics of Contemporary English (ICLCE3)

Extended deadline for abstract submission: 12 February 2009

The Third International Conference on the Linguistics of Contemporary
English (ICLCE3)

14/15-17 July 2009

Institute of English Studies
Senate House
University of London

The attention devoted to the linguistics of the English language has 
resulted in a broad body of work in diverse research traditions. The aim of 
the ICLCE conference is to encourage the cross-fertilisation of ideas 
between different frameworks and research traditions, all of which may 
address any aspect of the linguistics of contemporary English. The first 
and second ICLCE conferences were held in Edinburgh (2005) and Toulouse 
(2007) along the same lines. We aim for the London conference to build on 
the success of those events.

The main conference will be preceded on 14 July 2009 by a one-day symposium 
to celebrate the 50th anniversary of the Survey of English Usage, founded 
by Randolph Quirk at UCL. The theme of this symposium will be 'Current 
Change in the English Verb Phrase'.

Plenary speakers at the main conference:

James Blevins (Cambridge)
Bernd Kortmann (Freiburg)
James M. Scobbie (Queen Margaret University, Edinburgh)

Aitor Soroa Etxabe | 2 Feb 2009 14:57

Software release: UKB, Graph Based Word Sense Disambiguation and Similarity


Dear list members,

we are pleased to announce the public release of version 0.1.0 of UKB,
a collection of programs for performing graph-based Word Sense
Disambiguation and lexical similarity/relatedness using a pre-existing
knowledge base.

UKB has been developed by the IXA group <http://ixa.si.ehu.es> at the
University of the Basque Country. UKB applies the so-called
Personalized PageRank on a Lexical Knowledge Base (LKB) to rank the
vertices of the LKB and thus perform disambiguation. The details of
the method are described in [1]. The algorithm can also be used to
calculate lexical similarity/relatedness of words/sentences. See [2]
for an application of UKB on word similarity.
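
As a rough illustration of the general idea (this is not the UKB code,
and the graph, node names and parameter values below are made-up
assumptions), Personalized PageRank can be sketched as a power
iteration in which the random restart always jumps back to the context
words instead of to a uniformly chosen vertex:

```python
def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Rank vertices of `graph` with restarts concentrated on `seeds`.

    graph: dict mapping each vertex to a list of neighbouring vertices.
    Assumes every vertex appears as a key and has at least one neighbour.
    """
    nodes = list(graph)
    # Restart (teleport) distribution: uniform over the seed vertices only.
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        # Restart mass goes to the seeds ...
        new_rank = {n: (1.0 - damping) * teleport[n] for n in nodes}
        # ... and the rest flows along the edges of the knowledge base.
        for n in nodes:
            share = damping * rank[n] / len(graph[n])
            for m in graph[n]:
                new_rank[m] += share
        rank = new_rank
    return rank


# Hypothetical toy knowledge base: two senses of "bank" plus related concepts.
toy_graph = {
    "river": ["bank#shore", "water", "deposit"],
    "water": ["river", "bank#shore"],
    "bank#shore": ["river", "water"],
    "deposit": ["river", "bank#finance"],
    "bank#finance": ["deposit", "money"],
    "money": ["bank#finance"],
}

# Disambiguate "bank" in a context that contains the word "river".
ranks = personalized_pagerank(toy_graph, seeds={"river"})
senses = ["bank#shore", "bank#finance"]
best_sense = max(senses, key=ranks.get)
```

In this sketch the sense vertex with the highest rank is taken as the
disambiguation result; how UKB itself builds the graph and selects
senses is described in [1].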

The software can be downloaded from:

http://ixa2.si.ehu.es/ukb

Best,

Eneko Agirre
Aitor Soroa

Ixa research group
http://ixa.si.ehu.es

*******


Daniel Zeman | 2 Feb 2009 15:12

Re: Universal POS Tagset

Hi Adam,

I've been working on similar stuff and presented a poster at last year's
LREC:
http://ufal.mff.cuni.cz:8080/bib/?section=publication&id=-6437616343801484763&mode=view
(and the framework is here:
https://wiki.ufal.ms.mff.cuni.cz/user:zeman:interset )

However, my universal tagset is a virtual one - it's a definition of 
possible features and their values, not exactly a set of tags (encoded 
as strings). Also, it's work in progress, and changes will be needed to 
achieve universality. Anyway, let me know if I can be of any help.
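
To make the "features and values, not tag strings" idea concrete, here
is a minimal sketch. It is not the Interset code: the feature
inventory, the two tagsets and the function names are all hypothetical,
chosen only for illustration. Each language-specific tag is decoded
into a feature dictionary, and encoding picks the target tag whose
features agree best with it:

```python
# Hypothetical shared feature inventory (the "virtual" tagset).
FEATURES = {
    "pos": {"noun", "verb", "adj"},
    "number": {"sing", "plur", "dual"},
    "case": {"nom", "acc", "gen"},
}

# Two made-up tagsets, each described in terms of the shared features.
TAGSET_A = {
    "NN": {"pos": "noun", "number": "sing"},
    "NNS": {"pos": "noun", "number": "plur"},
    "VB": {"pos": "verb"},
}
TAGSET_B = {
    "Ncsn": {"pos": "noun", "number": "sing", "case": "nom"},
    "Ncpn": {"pos": "noun", "number": "plur", "case": "nom"},
    "Vm": {"pos": "verb"},
}


def decode(tagset, tag):
    """Turn a tag string into a feature dictionary, checking the inventory."""
    features = dict(tagset[tag])
    for name, value in features.items():
        if value not in FEATURES.get(name, set()):
            raise ValueError(f"unknown feature value {name}={value}")
    return features


def encode(tagset, features):
    """Pick the tag whose features match best (agreements minus conflicts)."""
    def score(item):
        tag, tag_features = item
        agree = sum(1 for k, v in tag_features.items() if features.get(k) == v)
        clash = sum(1 for k, v in tag_features.items()
                    if k in features and features[k] != v)
        return (agree - clash, tag)
    return max(tagset.items(), key=score)[0]


# Convert a tag from tagset A to tagset B via the shared features.
converted = encode(TAGSET_B, decode(TAGSET_A, "NNS"))
```

The design point is that adding a new tagset only requires describing
its tags once in terms of the shared features, rather than writing a
direct mapping to every other tagset.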

Best,
Dan

Adam Teichert wrote:
> Hello all.
>
>
>   I've been looking for a POS tagset that is general enough to
> effectively tag "any" natural language.  (I'm looking at Linguistic
> Typology / Universal Implications so I want to compare POS taggings
> across many [possibly obscure] languages.) Does anyone know of such a
> tagset?
>
>   If anyone is interested in what I've found so far, this paper seems relevant:
>     "Induction of Fine-grained Part-of-speech Taggers via Classifier
> Combination and Crosslingual Projection" (Elliott Franco Drábek,
> David Yarowsky)

Serge Sharoff | 2 Feb 2009 14:53

Re: Universal POS Tagset

Another research project with similar goals is MTE:
http://nl.ijs.si/ME/V3/msd/html/

For a recent experiment on designing a tagset following this framework take a look at:
Serge Sharoff, Mikhail Kopotev, Tomaz Erjavec, Anna Feldman, and Dagmar Divjak. Designing and
evaluating a Russian tagset. In Proceedings of the Sixth Language Resources and Evaluation Conference,
LREC 2008, Marrakech, 2008.
http://corpus.leeds.ac.uk/mocky/lrec2008-msd.pdf

Serge

-----Original Message-----
From: corpora-bounces <at> uib.no on behalf of Adam Teichert
Sent: Fri 30/01/2009 20:53
To: corpora <at> uib.no
Subject: [Corpora-List] Universal POS Tagset

Hello all.

  I've been looking for a POS tagset that is general enough to
effectively tag "any" natural language.  (I'm looking at Linguistic
Typology / Universal Implications so I want to compare POS taggings
across many [possibly obscure] languages.) Does anyone know of such a
tagset?

  If anyone is interested in what I've found so far, this paper seems relevant:
    "Induction of Fine-grained Part-of-speech Taggers via Classifier
Combination and Crosslingual Projection" (Elliott Franco Drábek,
David Yarowsky)
    http://acl.ldc.upenn.edu/W/W05/W05-0807.pdf

Eric Atwell | 2 Feb 2009 15:55

Re: Universal POS Tagset

Adam,

thanks for your interesting references. I've looked into development
of tag sets for part-of-speech tagging for English, Urdu, Arabic and
Malay:

Atwell, E. 2008. Development of tag sets for part-of-speech tagging. 
in: Anke Ludeling & Merja Kyto (editors) Corpus Linguistics: An 
International Handbook, Volume 1, pp. 501-526, Mouton de Gruyter. 
(preprint: http://www.comp.leeds.ac.uk/eric/atwell08clih.pdf)
http://www.degruyter.de/cont/imp/mouton/detailEn.cfm?isbn=978-3-11-021142-9

Corpus linguists have not been able to agree on a single PoS-tagset for
English, let alone a cross-language tag-set. The problem is the wide
range of (sometimes conflicting) criteria used in the design of corpus
PoS-tag sets: "... mnemonic tag names; underlying linguistic theory; classification
by form or function; analysis of idiosyncratic words; categorization 
problems; tokenisation issues: defining what counts as a word; 
multi-word lexical items; target user and/or application;
availability and/or adaptability of tagger software; adherence to
standards; variations in genre, register, or type of language; 
and degree of delicacy of the tag set."

Perhaps a small PoS-tagset lacking "delicacy" or fine-grained
distinctions could apply across languages; e.g. the broad classes 
used by traditional Arabic grammarians
N (nouns) V (verbs) P (particles, i.e. others).
But arguably this is only useful to you if it reveals some syntactic
universals, and I guess dividing all words into just 3 classes 
won't tell you much.

maxwell | 2 Feb 2009 16:35

Re: Universal POS Tagset

> I've been looking for a POS tagset that is general enough to
> effectively tag "any" natural language.  (I'm looking at Linguistic
> Typology / Universal Implications so I want to compare POS taggings
> across many [possibly obscure] languages.) Does anyone know of such a
> tagset?

One of the issues is going to be at what level of detail one wants the
tags.  If it's just the standard parts of speech (noun, verb,
pre-/post-position...), it might not be hard to come up with a list,
although there would be problems in particular languages (is the 'for' of
English for-to clauses a preposition or a complementizer, and is there
really a difference?).

If on the other hand, you want to tag things like person, number etc.,
which plenty of taggers have done, then there is a very long list of
features and feature values which one might tag.  There are for example
languages which, in addition to the usual singular/ plural distinctions in
the number feature, distinguish dual, trial, paucal, etc.; and languages
which have far different gender classes than are dreamed of in most
categorizations.  And there are languages which morphologically mark verbs
for such things as agreement with ergative and absolutive arguments, and
evidential status (seen/ inferred/ reportedly etc.).

Yet another issue for standardized tag sets is that some morphosyntactic
feature values will cover a wider range in one language than they might in
another, or values will overlap in different ways in different languages. 
Case systems are notoriously like that.

I know of two efforts to come up with lists of tags (in addition to the
responses you've already gotten).  One is the ISO TC 37/SC4 effort for

Damir Ćavar | 2 Feb 2009 17:48

Re: Universal POS Tagset

maxwell <at> umiacs.umd.edu wrote:
>> I've been looking for a POS tagset that is general enough to
>> effectively tag "any" natural language.  (I'm looking at Linguistic
>> Typology / Universal Implications so I want to compare POS taggings
>> across many [possibly obscure] languages.) Does anyone know of such a
>> tagset?
>
>
>
> The other effort is the GOLD ontology,
> http://linguistics-ontology.org/gold.html.  This ontology has been
> populated by people who know about a very large variety of languages (with
> initial input from a list compiled by SIL).  It is not really intended as
> a list of tags (or of tag components), although you could use it that way,
> but rather it is intended as something that a tag list could be defined by
> reference to.  For example, it is common in Nahuatl to refer to the
> 'absolutive' form of a noun.  This has nothing to do with the ergative/
> absolutive distinction, but it is nevertheless a standard usage among
> Nahuatl (maybe even Uto-Aztecan) linguists.  The idea behind Gold is that
> a Nahuatl linguist would continue to use the standard 'absolutive' term/
> tag, but define it in terms of the categories in the Gold ontology.
>   

The GOLD ontology is missing some concepts (features and properties) for
some (maybe many) languages, but the process for extending it is
somewhat defined. There is e.g. a Google group where issues can be
discussed:

http://groups.google.hr/group/gold-ontology


Adam Przepiorkowski | 2 Feb 2009 17:46

NLP / SLP @ IIS 2009 (and workshops), Cracow, Poland, 15-18 June 2009


Please find enclosed an extract from the last CFP for Intelligent
Information Systems 2009, to be held in the historical city of Cracow,
the former capital of Poland, on 16-18 June 2009.  See
http://iis.ipipan.waw.pl/ for the full Call for Papers.

One of the main tracks of the conference is Natural Language
Processing / Spoken Language Processing, covering about 1/2 of the
programme in 2008.

*Deadline* for paper submissions: *16 February 2009*.

There will be two pre-conference NLP-related events on 15 June 2009:

2nd International Workshop on Balto-Slavonic Natural Language
Processing - BSNLP 2009 (http://erssab.u-bordeaux3.fr/BSNLP)

and

The First Polish-German Information Science Workshop, featuring NLP
and SLP talks by Prof. Grażyna Demenko, Prof. Erhard Hinrichs and
Prof. Laura Kallmeyer, as well as a presentation of the National
Corpus of Polish project.

======================================================================

                   2nd CALL FOR PAPERS:
               International Joint Conference
      INTELLIGENT INFORMATION SYSTEMS 2009 -- IIS 2009


