Thanks to all who replied to
this question with suggestions, additional questions, and pointers.
I will try (to find the time) to
get an answer from BBC on their approach and intentions, and also to get some
feedback from someone familiar with Kinyarwanda and Kirundi. This sort of
situation is one that I think could arise with a number of languages (per some
past threads), and in such cases the idea of a clear-cut single language
definition and/or audience for page content may not hold. More information on
such situations will certainly become available as more web content in
diverse languages is created.
As for requesting macrolanguage
codes, that is another level, but obviously one to keep in mind. I think it is
viable in many circumstances, but in others it may be difficult to make the
case. The ad hoc way that ISO 639 evolved, however, means that there are similar
cases of related tongues that are sometimes given a common code (interpreted
after the fact as macrolanguage) and sometimes not. I think that developments
such as more web content in diverse languages and efforts such as the locales
sub-project of ANLoc (African Network for Localisation) have the potential to
highlight such issues.
Thanks again and all the best.
From: Peter Constable
[mailto:petercon <at> microsoft.com]
Sent: Wednesday, April 08, 2009 11:12 PM
To: Phillips, Addison; Don Osborn; 'LTRU Working Group'; 'IETF Languages
Subject: RE: [Ltru] How to handle macrolanguage when no code?
If it is content in one
linguistic variety and crafted to serve two audiences deemed in 639-3 to be
distinct languages, then that strikes me as a potential macrolanguage scenario.
One key question is how narrow a
scope of content is needed and how much deliberate effort is needed to craft
something like that. For instance, a document consisting of “Papa!” can serve
many different audiences, but that is solely because the scope of content is so
constrained, and for that reason the bar is not met for a macrolanguage. But if
it’s easy for a content provider to come up with content that serves both, then
Another key question is why that
content is functional for both audiences. Is it because it is expressed in a
variety that can really be considered common, or is it because it’s actually in
language A and 90% of speakers in language B are functionally bilingual in A?
Does the common-identity label reflect actual linguistic commonality, or is it
a logistic tool used in the repository to reflect merely a dual tasking?
Some thoughts. Discuss it with
the 639-3 RA.
From: ltru-bounces <at> ietf.org [mailto:ltru-bounces <at> ietf.org] On Behalf Of Phillips,
Sent: Wednesday, April 08, 2009 5:53 PM
To: Don Osborn; 'LTRU Working Group'; 'IETF Languages Discussion'
Subject: Re: [Ltru] How to handle macrolanguage when no code?
HTML certainly allows you to
declare that some content is applicable to more than one language audience.
Otherwise, John Cowan’s advice
seems appropriate… ISO 639-3 or ISO 639-5 would be your next stop. Note that
macrolanguages are sometimes problematical, so you might also consider a
collection code instead.
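
To illustrate the distinction drawn above, here is a minimal sketch, assuming that the multi-audience declaration in question is the HTTP Content-Language header (which accepts a comma-separated list of language tags), while the lang attribute carries a single tag for the language of the text itself. The helper name audience_markup is hypothetical and used only for illustration.

```python
def audience_markup(primary, audience):
    """Return (Content-Language header value, opening <html> tag).

    primary  -- the single language tag for the text itself (lang attribute)
    audience -- list of language tags for the intended audiences
    """
    # The lang attribute takes one language tag, so a value such as
    # "rw, rn" is rejected here rather than emitted.
    if "," in primary or " " in primary.strip():
        raise ValueError("lang takes a single language tag")
    # Content-Language may list several audiences, comma-separated.
    header = ", ".join(audience)
    html_tag = f'<html lang="{primary}">'
    return header, html_tag


header, html_tag = audience_markup("rw", ["rw", "rn"])
print("Content-Language:", header)
print(html_tag)
```

Under these assumptions, the page text stays labelled as Kinyarwanda (rw) while the header states that the content is intended for both Kinyarwanda and Kirundi audiences. Whether that labelling is linguistically accurate is a separate question for someone who knows both languages.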
Globalization Architect -- Lab126
Internationalization is not a feature.
It is an architecture.
From: ltru-bounces <at> ietf.org
[mailto:ltru-bounces <at> ietf.org] On Behalf Of Don Osborn
Sent: Wednesday, April 08, 2009 4:40 PM
To: 'LTRU Working Group'; 'IETF Languages Discussion'
Subject: [Ltru] How to handle macrolanguage when no code?
In looking at the BBC website's offerings in African
languages, one notes that they have grouped Kinyarwanda and Kirundi together.
This makes sense from a linguistic point of view since as I understand it,
the two languages are almost the same. When looking at the view (page) source,
one notes that they use lang="rw" (for Kinyarwanda). It may be that
the pages I checked are properly Kinyarwanda and an expert would know that they
are not Kirundi (rn), but it is in any event true that there is no code element
to cover both languages.
I'm curious if there is any other recommended way to handle
such a situation where web content may be deliberately and easily designed to cover
more than one language as defined by ISO 639 when there is not currently any
macrolanguage code for them. Could one for example define a whole page as
having two languages? E.g., something like lang="rw, rn"?
Thanks in advance for any feedback.