CE Whitehead | 1 Dec 01:06

Criteria for languages?



 
Hi.  This is a follow-up to my previous (Wednesday, Nov. 25) post (http://www.alvestrand.no/pipermail/ietf-languages/2009-November/009620.html).
1.  Latgalian and Latvian
I went through the rest of the Latgalian language links.
The problem for the first newspaper linked to, based in the heart of Latgalia, is that neither the Latgalian language newspaper supplement--"Moras zeme"--nor the Latvian language newspaper "Rezeknes Vestis" makes use of anything but a character set declaration as far as I can determine.
However the newspaper "Latgales Laiks" (http://www.latgaleslaiks.lv/lv/) does use a Latvian language tag because its language is Latvian, while the cultural supplement in Latgalian to this newspaper, "Latgalisu Gazeta," does not use a language tag (this seems to verify that tags are being used for Latvian but not for Latgalian, but read on).
 
The publishing house does provide keywords in English, Russian, and Latvian, using appropriate language tags for this section but otherwise does not use a language tag.
Two of the music groups are linked to via http://www.borowa.lv/  whose content-language is identified as Latvian; I don't know either Latgalian or Latvian so cannot say but assume for now that the content is standard Latvian. 
(I could not link to the groups' sites themselves.)
Several of the literature sites do identify that the content is Latgalian in the keywords--I assume that this is done in lieu of tagging the language.
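
The manual check described above (does a page carry a language tag or a Content-Language declaration, or only a character set and keywords?) could be automated roughly as follows. This is a minimal sketch using only Python's standard html.parser module. The class name, the function name, and the two sample pages are made up for illustration and are not taken from any of the sites mentioned.

```python
from html.parser import HTMLParser


class LanguageDeclarationFinder(HTMLParser):
    """Collect language, charset, and keyword declarations from an HTML page."""

    def __init__(self):
        super().__init__()
        self.lang_attrs = []        # values of lang / xml:lang attributes
        self.content_language = []  # <meta http-equiv="Content-Language">
        self.charsets = []          # charset declarations
        self.keywords = []          # <meta name="keywords">

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        for name in ("lang", "xml:lang"):
            if attrs.get(name):
                self.lang_attrs.append(attrs[name])
        if tag != "meta":
            return
        http_equiv = (attrs.get("http-equiv") or "").lower()
        content = attrs.get("content") or ""
        if http_equiv == "content-language":
            self.content_language.append(content)
        elif http_equiv == "content-type" and "charset=" in content.lower():
            self.charsets.append(content.lower().split("charset=")[-1].strip())
        elif attrs.get("charset"):
            self.charsets.append(attrs["charset"].lower())
        elif (attrs.get("name") or "").lower() == "keywords":
            self.keywords.append(content)


def describe(html_text):
    """Return every declaration found in html_text as a dict of lists."""
    finder = LanguageDeclarationFinder()
    finder.feed(html_text)
    finder.close()
    return {
        "lang_attrs": finder.lang_attrs,
        "content_language": finder.content_language,
        "charsets": finder.charsets,
        "keywords": finder.keywords,
    }


# Hypothetical page that declares only a charset and keywords (no language tag).
UNTAGGED_PAGE = (
    '<html><head>'
    '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
    '<meta name="keywords" content="latgalian, literature">'
    '</head><body><p>...</p></body></html>'
)

# Hypothetical page that declares its language as Latvian in two ways.
TAGGED_PAGE = (
    '<html lang="lv"><head>'
    '<meta http-equiv="Content-Language" content="lv">'
    '</head><body><p>...</p></body></html>'
)

if __name__ == "__main__":
    print(describe(UNTAGGED_PAGE))
    print(describe(TAGGED_PAGE))
```

An empty "lang_attrs" list together with a non-empty "keywords" list corresponds to the situation noted for the literature sites: the language is named in the keywords but not tagged.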
Latgalian has been developing separately from Latvian since 1621 according to Wikipedia and for a time Latgalia was under Russia until opting to join Latvia circa 1920:
"The language or dialect is called Latgalian.
From 2004 on, the Latgalian language is the subject of the biggest sociolinguistic/ethnolinguistic poll in Europe, held by the Rēzekne Augstskola and the Centre d'Étude Linguistiques Pour l'Europe."
In addition,
"Originally the territory of what is now Latgale was populated by Eastern Baltic tribes, whose language became the basis for both modern Latgalian and standard Latvian. Many Latgalians still speak the local dialect, which has a standardized written form and is therefore considered a separate language,"
according again to: http://en.wikipedia.org/wiki/Latgalia
According to www.ethnologue.com,
the language lv (Latvian) is spoken by, among others, Latgalians; see:
http://www.ethnologue.com/show_language.asp?code=lav
(I don't know if this means that they speak Latvian proper, or that their language, Latgalian, is being considered to be Latvian, but I suspect the former and maybe also the latter, since Latgalian is classified as an East Latvian dialect--is this the answer to your question??)
To me, since Latgalian has its own distinct writing system, and its own literature (thus meeting the criteria for a separate language at ethnologue; http://www.ethnologue.com/ethno_docs/introduction.asp#language_id),
I do not see any problem with its getting its own language subtag.
(Though Early Modern French is written differently than Modern French and has its own literature too and it only got a variant subtag--but this is a modern language of course.)

2. Walliser German
I have more questions about Walliser German--which does not seem to be a written language at all (though it may be in use in emails??).
My feeling is that Walliser German and Walser German are closely related enough that they can be classified together
but I may be wrong
(although Walliser German is at present neither classified nor mentioned at ethnologue--
it's sometimes classified together with Walser German as a Swiss German dialect;
however I've seen neither Walser nor Walliser German classified as a subdialect of the other but they are clearly related;
see: http://www.walser-alps.eu/dialect/walliser-german for the classification of these under Swiss German--
but Ethnologue has recently decided that these are not Swiss German). 
Once a language tag ([wts] or [wae]--I don't care) is assigned to both,
I'd be happy to see a request for a variant subtag for Walliser German.
This would prevent the Walliser German subtag's being deprecated down the road when a subtag for Walser German is approved--but neither is really written of course.
Best,
C. E. Whitehead
cewcathar <at> hotmail.com
Criteria for languages?
CE Whitehead cewcathar at hotmail.com
Wed Nov 25 01:02:21 CET 2009
 
> Hi.  I'm not sure what you are asking here, Mark.
>As far as Latgalian goes, according to Wikipedia (http://en.wikipedia.org/wiki/Latgalian_language),
> "Sometimes it is referred to as a distinct separate language, while others consider it to be a dialect of Latvian."

 
_______________________________________________
Ietf-languages mailing list
Ietf-languages <at> alvestrand.no
http://www.alvestrand.no/mailman/listinfo/ietf-languages
Doug Ewell | 1 Dec 02:19

Re: Criteria for languages?

CE Whitehead <cewcathar at hotmail dot com> wrote:

> To me, since Latgalian has its own distinct writing system, and its 
> own literature (thus meeting the criteria for a separate language at 
> ethnologue; 
> http://www.ethnologue.com/ethno_docs/introduction.asp#language_id),
> I do not see any problem with its getting its own language subtag.

For my part, at least, I have no problem with the idea that Latgalian 
should get its own language subtag if experts feel it is a distinct 
language.  My concern is with converting the existing "Latvian" to a 
macrolanguage, which implies that the term "Latvian" sometimes refers 
just to Standard Latvian and sometimes to both Standard Latvian and 
Latgalian.  I don't necessarily get the impression that the latter is 
true; it seems that when people mean Latgalian, they say "Latgalian."

--
Doug Ewell  |  Thornton, Colorado, USA  |  http://www.ewellic.org
RFC 5645, 4645, UTN #14  |  ietf-languages @ http://is.gd/2kf0s ­

Peter Constable | 1 Dec 18:02

RE: Criteria for languages?

From: ietf-languages-bounces <at> alvestrand.no [mailto:ietf-languages-bounces <at> alvestrand.no] On
Behalf Of Doug Ewell

> My concern is with converting the existing "Latvian" to a macrolanguage, 
> which implies that the term "Latvian" sometimes refers just to Standard 
> Latvian and sometimes to both Standard Latvian and Latgalian.  

What you are describing is an issue of terminology and documentation: what do people mean when using the
term "Latvian". But what we are in fact dealing with is a _coding_ issue: how has "lv" / "lav" been used in
implementations, and what existing content is tagged "lv" / "lav"? 

There is clear evidence on the coding issue: MARC has used "lav" for Latgalian for some time.

If the denotation of "lav" were changed to explicitly exclude Latgalian (which would be necessary if its
scope is not set to macrolanguage), then an unknown number of librarians will have broken data. It would be
irresponsible of the ISO 639-RA/JAC to do such a thing, IMO.
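
The coding concern can be made concrete with a small model. The following is only a sketch under stated assumptions: the identifiers "lvs" (standard Latvian) and "ltg" (Latgalian) are placeholders that do not appear in this thread, the two catalogue records are invented, and covers() is an application-level check, not something defined by ISO 639 or BCP 47.

```python
# "lav" scoped as a macrolanguage: it encompasses both languages.
MACRO_SCOPE = {"lav": frozenset({"lvs", "ltg"})}

# "lav" redefined so that it explicitly excludes Latgalian.
NARROW_SCOPE = {"lav": frozenset({"lvs"})}

# Invented records: (title, code stored in the record, actual language).
RECORDS = [
    ("Latgalian hymnal", "lav", "ltg"),
    ("Riga newspaper", "lav", "lvs"),
]


def covers(code, language, scope):
    """True if `code` is a valid (possibly imprecise) label for `language`."""
    return code == language or language in scope.get(code, frozenset())


def broken_records(records, scope):
    """Titles of records whose stored code no longer covers their language."""
    return [
        title
        for title, code, language in records
        if not covers(code, language, scope)
    ]


if __name__ == "__main__":
    print(broken_records(RECORDS, MACRO_SCOPE))   # nothing is invalidated
    print(broken_records(RECORDS, NARROW_SCOPE))  # the Latgalian record breaks
```

Under the macrolanguage scope every existing "lav" record remains a correct, if imprecise, label; under the narrowed scope the Latgalian record becomes mislabelled without anyone having touched it.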

Peter
John Cowan | 1 Dec 18:29

Re: Criteria for languages?

Peter Constable scripsit:

> If the denotation of "lav" were changed to explicitly exclude Latgalian
> (which would be necessary if its scope is not set to macrolanguage),
> then an unknown number of librarians will have broken data. It would
> be irresponsible of the ISO 639-RA/JAC to do such a thing, IMO.

Quite so.

In that case, the issue for us is: do we recommend that people continue
to use "lav" for Latvian proper, or that they adopt the new subtag?

-- 
John Cowan   cowan <at> ccil.org    http://ccil.org/~cowan
I come from under the hill, and under the hills and over the hills my paths
led. And through the air. I am he that walks unseen.  I am the clue-finder,
the web-cutter, the stinging fly. I was chosen for the lucky number.  --Bilbo
Michael Everson | 1 Dec 18:54

Re: Criteria for languages?

On 1 Dec 2009, at 17:02, Peter Constable wrote:

> There is clear evidence on the coding issue: MARC has used "lav" for  
> Latgalian for some time.

No different from someone using "ger" to include Bavarian or Swiss.

> If the denotation of "lav" were changed to explicitly exclude  
> Latgalian (which would be necessary if its scope is not set to  
> macrolanguage), then an unknown number of librarians will have  
> broken data. It would be irresponsible of the ISO 639-RA/JAC to do  
> such a thing, IMO.

I do not believe that, erm, some "Lettish" is a "macrolanguage"  
including Latvian and Latgalian.

If librarians tagged some Latgalian data with "the nearest thing" that  
still does not mean that all Latvian-language data is broken. If "lav"  
becomes a macrolanguage then to be precise all Latvian books will have  
to be re-tagged.

Michael Everson * http://www.evertype.com/
Mark Davis ☕ | 1 Dec 18:54

Re: Criteria for languages?

I never got an answer as to the relevant difference between the Latvian case and the Swiss German case that would cause one to be a macrolanguage, and the other to be simply a split. Is the key factor between them that MARC has been using lav for Latgalian, and it hasn't been using gsw for Walliserdeutsch?

This would be a criterion that at least would be understandable.

Mark


On Tue, Dec 1, 2009 at 09:02, Peter Constable <petercon <at> microsoft.com> wrote:
From: ietf-languages-bounces <at> alvestrand.no [mailto:ietf-languages-bounces <at> alvestrand.no] On Behalf Of Doug Ewell

> My concern is with converting the existing "Latvian" to a macrolanguage,
> which implies that the term "Latvian" sometimes refers just to Standard
> Latvian and sometimes to both Standard Latvian and Latgalian.

What you are describing is an issue of terminology and documentation: what do people mean when using the term "Latvian". But what we are in fact dealing with is a _coding_ issue: how has "lv" / "lav" been used in implementations, and what existing content is tagged "lv" / "lav"?

There is clear evidence on the coding issue: MARC has used "lav" for Latgalian for some time.

If the denotation of "lav" were changed to explicitly exclude Latgalian (which would be necessary if its scope is not set to macrolanguage), then an unknown number of librarians will have broken data. It would be irresponsible of the ISO 639-RA/JAC to do such a thing, IMO.



Peter
John Cowan | 1 Dec 19:41

Re: Criteria for languages?

Mark Davis ☕ scripsit:

> I never got an answer as to the relevant difference between the
> Latvian case and the Swiss German case that would cause one to be a
> macrolanguage, and the other to be simply a split. Is this the key
> factor between them, that MARC has been using lav for Latgalian,
> and it hasn't been using gsw for Walliserdeutsch.

For "MARC" read "someone whose policies we know about".

-- 
My confusion is rapidly waxing          John Cowan
For XML Schema's too taxing:            cowan <at> ccil.org
    I'd use DTDs                        http://www.ccil.org/~cowan
    If they had local trees --
I think I best switch to RELAX NG.
Michael Everson | 1 Dec 19:51

Re: Consensus Call on Latin Sharp S and Greek Final Sigma

On 1 Dec 2009, at 04:30, Vint Cerf wrote:

> one reason for making the characters PVALID is to allow registries  
> to bundle registrations if they wish to achieve the same effect as  
> mapping, but without the mapping.

From a character point of view I would favour this. It is simply the  
case that ß is not ss any more than þ is th or ð is dh.

On 1 Dec 2009, at 11:24, John C Klensin wrote:

> Please remember that, for many years, existing users needing ö
> (o-umlaut in the German context) used "oe", so carrying this
> argument very far would rapidly turn into an argument against any
> IDNs involving decorated Latin characters.

Indeed.

> Long after registrations involving "ö" started to be permitted, many  
> users continued to use the "oe" forms.  The mapping from one to the  
> other was never, as far as I know, embedded in code, but, from a  
> user standpoint, the situation is much the same: some strings  
> containing "oe" are interchangeable with the corresponding strings  
> with "ö" and, if both are not registered to the same entity and  
> either bundled or redirected, the user somehow needs to keep track  
> of what is registered and pick the correct name.

Quite so.

> And someone guessing the domain name for Herr Möller would have had  
> to know to guess "moeller" before IDNs were introduced and which one  
> to try now that they are available.

And now has "mœller" and "møller". :-)

> "ß" isn't a display form for "ss" any more than "ö" is a display  
> form for "oe".

John is quite correct here.

> They are different characters. One could plausibly consider them  
> display forms if "ß" could be substituted for every instance of "ss"  
> or "ö" could reasonably be substituted for every instance of "oe",  
> but that is not the case: Göthestraße is just wrong, not a display  
> variation of Goethestrasse.

And Eisstrasse might conceivably be Eisstraße, but it is certainly not  
Eißtrasse.

Michael Everson * http://www.evertype.com/
John Cowan | 1 Dec 19:53

Re: Criteria for languages?

Michael Everson scripsit:

> I do not believe that, erm, some "Lettish" is a "macrolanguage"  
> including Latvian and Latgalian.

Nevertheless, when people say "Latvian" (or "lav"), sometimes the
denotation includes only standard Latvian, and sometimes it includes
both standard Latvian and standard Latgalian.  

> If librarians tagged some Latgalian data with "the nearest thing" that  
> still does not mean that all Latvian-language data is broken.

No one suggests that it is.  Standard Latvian data can correctly continue
to be tagged "lav".

> If "lav" becomes a macrolanguage then to be precise all Latvian books
> will have to be re-tagged.

To be *maximally* precise, yes.  But maximal precision isn't always
desirable (or even available).  It often costs too much to achieve.
It's a lot easier to determine that the average price of a house in
Co. Mayo is about 25,000 euros than to get a figure correct down
to the last cent.

-- 
John Cowan  cowan <at> ccil.org  http://ccil.org/~cowan
Female celebrity stalker, on a hot morning in Cairo:
"Imagine, Colonel Lawrence, ninety-two already!"
El Auruns's reply:  "Many happy returns of the day!"
ISO639-3 | 1 Dec 18:47

Re: Criteria for languages?


I think you are looking at one kind of tagging. There are other applications of this code set.

There are dozens of examples in the Library of Congress cataloging records which specifically state that the content is Latgalian, but the tag in the fixed length fields (008, 041) is 'lav'. Furthermore, their documentation (the MARC Code List for Languages) states that 'lav' should be used in this field when the content is in Latgalian.
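
As an illustration of the kind of record being described, the language code of a MARC 21 bibliographic record sits in character positions 35-37 of the 40-character 008 field. The sketch below extracts it and lists records whose notes mention Latgalian while the fixed field says 'lav'. Everything in it is an assumption made for illustration: the record dictionaries, titles, and notes are invented, the helper names are hypothetical, and field 041 is not modelled.

```python
def make_008(language):
    """Build an illustrative 40-character 008 field with the given language."""
    return (
        "091201"    # 00-05: date entered on file (invented)
        + "s"       # 06: type of date
        + "2009"    # 07-10: date 1
        + " " * 4   # 11-14: date 2 (blank)
        + "lv "     # 15-17: place of publication (illustrative)
        + " " * 17  # 18-34: material-specific positions (left blank here)
        + language  # 35-37: language code
        + " "       # 38: modified record
        + "d"       # 39: cataloging source
    )


def language_of_008(field_008):
    """Return the three-letter language code from positions 35-37."""
    if len(field_008) != 40:
        raise ValueError("a bibliographic 008 field must be 40 characters long")
    return field_008[35:38]


def latgalian_coded_as_lav(records):
    """Titles of records coded 'lav' whose note says the text is Latgalian."""
    return [
        record["title"]
        for record in records
        if language_of_008(record["008"]) == "lav"
        and "latgalian" in record["note"].lower()
    ]


# Invented sample records.
SAMPLE_RECORDS = [
    {"title": "Hymnal", "008": make_008("lav"), "note": "Text in Latgalian."},
    {"title": "Almanac", "008": make_008("lav"), "note": "Text in Latvian."},
    {"title": "Primer", "008": make_008("rus"), "note": "Text in Russian."},
]

if __name__ == "__main__":
    print(latgalian_coded_as_lav(SAMPLE_RECORDS))
```

A scan of this kind over real catalogue data would give a rough count of how many existing records depend on 'lav' continuing to cover Latgalian.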

That is the primary reason that the macrolanguage has been proposed. It was the point that Peter Constable was making.

This is not the case with Walliser, as far as I can tell, and that is why the proposals are structured differently.

Joan Spanne
ISO 639-3/RA
SIL International
7500 W Camp Wisdom Rd
Dallas, TX 75236
ISO639-3 <at> sil.org


"Doug Ewell" <doug <at> ewellic.org> wrote on 2009-11-30 07:19 PM
(to <ietf-languages <at> iana.org>, subject "Re: Criteria for languages?"):

CE Whitehead <cewcathar at hotmail dot com> wrote:

> To me, since Latgalian has its own distinct writing system, and its
> own literature (thus meeting the criteria for a separate language at
> ethnologue;
> http://www.ethnologue.com/ethno_docs/introduction.asp#language_id),
> I do not see any problem with its getting its own language subtag.

For my part, at least, I have no problem with the idea that Latgalian
should get its own language subtag if experts feel it is a distinct
language.  My concern is with converting the existing "Latvian" to a
macrolanguage, which implies that the term "Latvian" sometimes refers
just to Standard Latvian and sometimes to both Standard Latvian and
Latgalian.  I don't necessarily get the impression that the latter is
true; it seems that when people mean Latgalian, they say "Latgalian."

--
Doug Ewell  |  Thornton, Colorado, USA  |  http://www.ewellic.org
RFC 5645, 4645, UTN #14  |  ietf-languages <at> http://is.gd/2kf0s ­
