RFC: Names localization

RFC: Names localization

"Petr Morávek [Xificurk]"
Hello,

I've summarized [1] the ideas that were recently discussed on talk@
regarding names, their variants in different languages, ...

I would like to hear some comments, additional pros/cons I could not
think of myself, etc.

Although I was arguing for the "don't repeat yourself" solution, I can
see that it has its drawback in that it's not that intuitive. So right
now, personally, I'm not really sure which solution is better.


Best regards,
Petr Morávek aka Xificurk

PS: CC to talk@ because the ideas were born there, but I kindly ask you
to send any responses directly to tagging@.

[1] http://wiki.openstreetmap.org/wiki/Proposed_features/Names_localization


_______________________________________________
Tagging mailing list
[hidden email]
http://lists.openstreetmap.org/listinfo/tagging


Re: RFC: Names localization

John Sturdy
> [1] http://wiki.openstreetmap.org/wiki/Proposed_features/Names_localization

+1, generally; but I'm not keen on deprecating the bare "name=*" tag,
because for many (perhaps most) named features, there is only one
name.  For example, a minor rural road in England will probably have a
name (in English), but it won't have names in other languages, and
no-one will really describe its name as "its English name" --- it's
simply "its name".  Multiple names are really an issue for
multilingual countries and for major features (typically large cities,
rivers, and perhaps mountains) in monolingual countries, and I suspect
those are well under half of all the features that will ever be
mapped.

Having just suggested keeping it simple, I'll suggest a complication
as well: multiple scripts for the same language.  In particular, I'm
thinking of mainland China, as it opens up more to interaction with
"the West"; and, when I did an introductory course on Chinese language
and culture, my teacher said the Chinese people begin learning to read
and write using pinyin, rather than in Chinese script, so maybe we
should ask Chinese mappers whether they're interested in it being
convenient to have names in both.

__John


Re: RFC: Names localization

Glom
In reply to this post by "Petr Morávek [Xificurk]"
Petr Morávek [Xificurk] <xificurk@...> writes:
>
> [1] http://wiki.openstreetmap.org/wiki/Proposed_features/Names_localization
>

OK, so if I understand this right:

lang=<language_code> is supposed to state which languages are used in the
tag name=<place_name>.

May I propose to use lang:name=<language_code> instead of lang=<language_code>
(or is it name:lang=<language_code>)

Then the "lang:" prefix could be reused if other tags also turn out to need
their language stated.

By the way, is it only meant as an internal OSM-thing or is it supposed to
also be a mapping of official languages in the place (or official languages
expected on road signs)?



Re: RFC: Names localization

"Petr Morávek [Xificurk]"
Johan Jönsson wrote:
> lang=<language_code> is supposed to tell what languages that are used in the
> tag name=<place_name>
>
> May I propose to use lang:name=<language_code> instead of lang=<language_code>
> (or is it name:lang=<language_code>)

I don't like name:lang simply because it conflicts with the established
scheme for tagging names in different languages, e.g. name:en="London".

> Then the key "lang:" could be used even if there happens to be more tags that
> need its language stated.

lang:name="en" might make sense, but do you have an example where this
would be useful?

> By the way, is it only meant as an internal OSM-thing or is it supposed to
> also be a mapping of official languages in the place (or official languages
> expected on road signs)?

Could you provide an example where those two are different?
The proposal was primarily meant to fix the unclear meaning of the bare name
tag, but it's still just the first draft.

Best regards,
Petr Morávek aka Xificurk



Re: RFC: Names localization

Glom
Petr Morávek [Xificurk] <xificurk@...> writes:
>
> Johan Jönsson wrote:
>
> > By the way, is it only meant as an internal OSM-thing or is it supposed to
> > also be a mapping of official languages in the place (or official languages
> > expected on road signs)?
>
> Could you provide an example, where those two are different?
> The proposal was primarily meant to fix the unclear meaning of bare name
> tag, but it's still just the first draft.
>

Sorry if I am getting too theoretical on the subject of how to write tags.

I was wondering about the purpose of this tag:
* Is it to explain which languages appear in the name tag? (If there are two
names, as in your Bruxelles-Brussel example, I guess the order is important.)
* Or is it aimed at recording information from Wikipedia on the official
languages of the place (probably ordered by number of speakers, but with the
administrative language first, or something like that)?
* It could also be meant to record something that might not exist on
Wikipedia: the languages and scripts that road signs in the place usually use.
In the Greek capital, Athens, signs usually give the name in Greek letters
first and then in Roman letters (gr and gr_rom, maybe).

I am not saying that these things generally differ much, only that it would be
good to know which of them is supposed to be tagged.

p.s.
If we leave the cities, I can think of a nice example:
a pub or maybe a campsite with a sign outside telling which languages the
staff speak. I have seen these at Swedish campsites.
d.s.






Re: RFC: Names localization

Andrew Errington
In reply to this post by John Sturdy
On Wed, 01 Aug 2012 19:48:37 John Sturdy wrote:

> > [1]
> > http://wiki.openstreetmap.org/wiki/Proposed_features/Names_localization
>
> +1, generally; but I'm not keen on deprecating the bare "name=*" tag,
> because for many (perhaps most) named features, there is only one
> name.  For example, a minor rural road in England will probably have a
> name (in English), but it won't have names in other languages, and
> no-one will really describe its name as "its English name" --- it's
> simply "its name".  Multiple names are really an issue for
> multilingual countries and for major features (typically large cities,
> rivers, and perhaps mountains) in monolingual countries, and I suspect
> those are well under half of all the features that will ever be
> mapped.

I like the proposal in general, but I don't think it's necessary to introduce
a lang=* tag, and I don't think we should lose the name=* tag.

It's also not true that in a 'monolingual' country there is only one name
for something.  For example, London is 'London' to a British person,
but 'Londres' to a French person.

I still think it's simple enough to have name=* to be the 'default' name you
get if you don't specify a language, or the name you get if your selected
language is not available.

For example, for London:
name=London
name:en=London
name:fr=Londres

Then a person requesting a French version of the map would see 'Londres', but
a person requesting the German version would see 'London'.  If no language is
specified a person would see 'London'.
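The lookup described above could be sketched roughly as follows. This is only an illustration, not an existing renderer's code: the function name pick_name and the plain-dict representation of tags are assumptions made for the example.

```python
def pick_name(tags, language=None):
    """Return name:<language> if it is tagged, otherwise fall back to name=*.

    `tags` is a plain dict of OSM tags; this helper is hypothetical.
    """
    if language:
        localized = tags.get(f"name:{language}")
        if localized:
            return localized
    # No language requested, or no name in that language: use the default.
    return tags.get("name")


london = {"name": "London", "name:en": "London", "name:fr": "Londres"}

print(pick_name(london, "fr"))  # French map
print(pick_name(london, "de"))  # German map, no name:de tagged
print(pick_name(london))        # no language specified
```

With the London tags above, the French request yields 'Londres' and the other two fall back to the bare name.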

A simple algorithm can also make bilingual maps by concatenating tags,
i.e. "name:xx (name:yy)" and making sensible decisions if name:xx=* or
name:yy=* is missing, or if name=* contains name:xx=* or name:yy=* (such as
in Japan or Korea where name=* contains Japanese (or Korean) followed by
English in brackets).
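One possible reading of those "sensible decisions" is sketched below. The function name bilingual_label and the exact fallback order are assumptions for illustration; other choices would be just as defensible.

```python
def bilingual_label(tags, primary, secondary):
    """Build a "name:xx (name:yy)" label with simple fallbacks.

    Hypothetical helper; `tags` is a plain dict of OSM tags.
    """
    first = tags.get(f"name:{primary}")
    second = tags.get(f"name:{secondary}")
    default = tags.get("name")

    if first and second:
        # Avoid "London (London)" when both languages share one name.
        return first if first == second else f"{first} ({second})"

    available = first or second
    if available is None:
        # Neither language is tagged: all we have is the bare name.
        return default
    if default and available in default:
        # The bare name already contains the one name we found, so it is
        # probably a combined label already (as in Japan or Korea).
        return default
    return available


brussels = {"name:fr": "Bruxelles", "name:nl": "Brussel"}
seoul = {"name": "서울 (Seoul)", "name:en": "Seoul"}

print(bilingual_label(brussels, "fr", "nl"))
print(bilingual_label(seoul, "ko", "en"))
```

The second call shows the Korean case: name:ko is missing, but name=* already contains the English name, so the bare name is used unchanged.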

Best wishes,

Andrew


Re: RFC: Names localization

Richard Fairhurst
Andrew Errington wrote:
> It's also not true that in a 'monolingual' country that there is only one
> name for something.  For example, London is 'London' to a British person

*ahem* It's "Llundain" in one of Britain's two official languages.

cheers
Richard


Re: RFC: Names localization

John Sturdy
In reply to this post by Andrew Errington
On Wed, Aug 1, 2012 at 10:28 PM, Andrew Errington <[hidden email]> wrote:
> On Wed, 01 Aug 2012 19:48:37 John Sturdy wrote:

> It's also not true that in a 'monolingual' country that there is only one name
> for something.  For example, London is 'London' to a British person,
> but 'Londres' to a French person.

This is what I was referring to in the second part of this sentence:

>> Multiple names are really an issue for
>> multilingual countries and for major features (typically large cities,
>> rivers, and perhaps mountains) in monolingual countries,

i.e. London may be "London" to an English person and "Londres" to a
French person, but Stourport-on-Severn is "Stourport-on-Severn" to
both of them (just picking a smallish town randomly; no potential slur
intended).  And a lot of names in OSM are street names; as far as I
know, it's rare for people from another country to have different
names for a country's streets.  (I thought I had found one example,
"Via Devana" as the Latin name for Huntingdon Road, Cambridge, but
when I searched to check that, I found it's 18th-century Latin and not
actually Roman.)

> I still think it's simple enough to have name=* to be the 'default' name you
> get if you don't specify a language, or the name you get if your selected
> language is not available.

Agreed.

__John


Re: RFC: Names localization

SomeoneElse
In reply to this post by Richard Fairhurst
Richard Fairhurst wrote:
>
> *ahem* It's "Llundain" in one of Britain's two official languages.

Two?  You could make a case for both Irish and Ulster-Scots as well,
based on the Anglo-Irish Agreement:

http://www.nio.gov.uk/agreement.pdf

:)




Re: RFC: Names localization

Andrew Errington
In reply to this post by John Sturdy
On Thu, 02 Aug 2012 07:31:40 John Sturdy wrote:
<snip>
> i.e. London may be "London" to an English person and "Londres" to a
> French person, but Stourport-on-Severn is "Stourport-on-Severn" to
> both of them (just picking a smallish town randomly; no potential slur
> intended).  And a lot of names in OSM are street names; as far as I
> know, it's rare for people from another country to have different
> names for a country's streets.

Absolutely, but the point is whatever mechanism we implement for name=* (and
name:xx=*) must work for anything that can be tagged with name=*.  If it's
not needed for something, then it is simply not used.

And actually... in Korea we do have different names for streets, one in Korean
(in Hangul) and one in English.

Best wishes,

Andrew


Re: RFC: Names localization

"Petr Morávek [Xificurk]"
In reply to this post by Glom
Johan Jönsson wrote:
> Sorry if I am getting to theoretical on the subject of how to write tags.
>
> I was wondering about the reason for this tag,
> *is it to explain the languages in the tag name:
> (if, like in your bruxelles-brussel example, is two names I guess that the
> order is important)

Yes, this was the primary motivation for the proposal.

> *or is it aimed at noting information from wikipedia on the official languages
> of this place (probably ordered after number of speakers but with
> administrative language first or something).

AFAIK, official languages are usually decided at the level of
countries... even if you know the official language of the country, you
can still come across a small region where
* most people don't even speak the official language properly
* the road signs are written in the official language, supplemented by
the language of the local minority

> *It could also be meant to explain something that might not exist on
> wikipedia, in what languages and scripts the road signs usually are on the
> place. In the greece capital Athens there are usually the name in greek
> letters first and then in roman letters (gr and gr_rom maybe).

This is actually a related problem: the question of what we should
generally put in the name tag. IMHO in case of a dispute, a reasonable
solution is the "on-the-ground" rule. But if everybody in Greece agrees that
in the name tag they will put only names in the Greek alphabet (even though
you say the signs contain a Latin transcription as well), it's their call.

Best regards,
Petr Morávek aka Xificurk



Re: RFC: Names localization

kmilos
In reply to this post by Glom
Johan Jönsson <johan.j@...> writes:
> *It could also be meant to explain something that might not exist on
> wikipedia, in what languages and scripts the road signs usually are on the
> place. In the greece capital Athens there are usually the name in greek
> letters first and then in roman letters (gr and gr_rom maybe).
>

I would really like OSM to stop the practice of inventing arbitrary language
tags like this and previous ones (ko_ro - Korean as spoken in Romania, really?).

Let's please start improving the OSM i18n situation by at least following BCP 47:

http://en.wikipedia.org/wiki/IETF_language_tag
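
To illustrate the difference, here is a rough check that splits a tag into language, script and region subtags. It is a deliberately simplified sketch: the regular expression covers only the `language[-Script][-REGION]` shape, not the full BCP 47 grammar (no variants, extensions or private-use subtags), and the function name parse_language_tag is made up for the example.

```python
import re

# Simplified pattern: 2-3 letter language, optional 4-letter script,
# optional 2-letter or 3-digit region. Real BCP 47 allows much more.
_TAG = re.compile(
    r"(?P<language>[a-z]{2,3})"
    r"(?:-(?P<script>[a-z]{4}))?"
    r"(?:-(?P<region>[a-z]{2}|[0-9]{3}))?",
    re.IGNORECASE,
)


def parse_language_tag(tag):
    """Return the subtags of a simple language tag, or None if it does not fit."""
    match = _TAG.fullmatch(tag)
    if match is None:
        return None
    script = match.group("script")
    region = match.group("region")
    return {
        "language": match.group("language").lower(),
        "script": script.title() if script else None,   # e.g. "Latn"
        "region": region.upper() if region else None,   # e.g. "RO"
    }


print(parse_language_tag("el-Latn"))  # Greek in Latin script
print(parse_language_tag("ko-RO"))    # parses as Korean with region Romania
print(parse_language_tag("gr_rom"))   # underscore form does not fit the pattern
```

Under this reading a two-letter second subtag is a region, which is why an ad-hoc code like ko_ro comes across as "Korean as spoken in Romania".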

Thanks,
M



Re: RFC: Names localization

kmilos
In reply to this post by "Petr Morávek [Xificurk]"
Petr Morávek [Xificurk] <xificurk@...> writes:

> Johan Jönsson wrote:
> > *It could also be meant to explain something that might not exist on
> > wikipedia, in what languages and scripts the road signs usually are on the
> > place. In the greece capital Athens there are usually the name in greek
> > letters first and then in roman letters (gr and gr_rom maybe).
>
> This is actually a related problem - the question, what should we
> generally put in name tag? IMHO in case of a dispute, a reasonable
> solution is "on-the-ground" rule. But if everybody in Greece agrees that
> in name tag they will put only names in Greek alphabet (even though you
> say the signs contain latin transcription as well), it's their call.
>

Exactly, and it fits perfectly with the proposed model: in theory there is no
difference between multilingual and multi-script content. These should be viewed
as just separate _locales_ (sets of parameters that are needed to describe
_written_ information).

So you could have

name:el=Αθήνα
lang=el;el-Latn

and have the renderer (or whichever client) automatically transliterate the
requested 'el-Latn' content from the available 'el' one to populate 'name=Αθήνα
(Athína)'.

If you only had

name=Αθήνα

the client wouldn't have a clue that the content is in Greek and wouldn't know
how to process it further.
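
A toy version of what such a client might do is sketched below. Everything in it is an assumption for illustration: the function names, the semicolon-separated reading of lang, and above all the transliteration table, which covers only the letters of this one example and is not a real romanization standard.

```python
import unicodedata

# Toy table: only the letters needed for the example below.
_RAW_TABLE = {"Α": "A", "θ": "th", "ή": "í", "ν": "n", "α": "a"}
GREEK_TO_LATIN = {unicodedata.normalize("NFC", k): v for k, v in _RAW_TABLE.items()}


def transliterate_el(text):
    """Replace each known Greek letter; leave anything else untouched."""
    text = unicodedata.normalize("NFC", text)
    return "".join(GREEK_TO_LATIN.get(ch, ch) for ch in text)


def build_label(tags):
    """Assemble a label from the locales listed in lang, in that order."""
    locales = [c.strip() for c in tags.get("lang", "").split(";") if c.strip()]
    parts = []
    for locale in locales:
        value = tags.get(f"name:{locale}")
        # If el-Latn is requested but not tagged, derive it from name:el.
        if value is None and locale == "el-Latn" and "name:el" in tags:
            value = transliterate_el(tags["name:el"])
        if value:
            parts.append(value)
    if not parts:
        # No usable language information: fall back to the bare name.
        return tags.get("name")
    if len(parts) == 1:
        return parts[0]
    return f"{parts[0]} ({', '.join(parts[1:])})"


athens = {"name:el": "Αθήνα", "lang": "el;el-Latn"}
print(build_label(athens))
print(build_label({"name": "Αθήνα"}))  # no lang: nothing to transliterate from
```

The second call shows the problem case: without lang the client can only echo the bare name.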

name=* without any context of what language is recorded in it is one of the
biggest fallacies of OSM i18n and needs to be addressed.

The proposed scheme is one way, the other is mandating that name=* is actually
name:en=* on the _whole planet_.

Having "whatever default" in it just doesn't work any more.

M



Re: RFC: Names localization

Tordanik
On 02.08.2012 12:56, Miloš Komarčević wrote:
> name=* without any context of what language is recorded in it is one of the
> biggest fallacies of OSM i18n and needs to be addressed.

You need to realize, though, that mappers in areas where only one
language is commonly used will not want to put more effort into mapping
names than they do today. And rightly so, imo - from their perspective,
it's just more work for little or no gain.

Thus, there is a fundamental requirement for any future tagging scheme
for names: In areas with a single main language, _one_ tag needs to be
enough for a name in that language.
Preferably, the key for this case should remain "name".

Setting some additional tags at the boundary of that area for clarity
what "name" means there is fine, but there must not be any additional
effort for setting names on the individual objects.
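
As a sketch of how a data consumer might use such boundary tags: the example assumes the consumer has already worked out which boundaries enclose an object (innermost first), and it borrows the lang key from the proposal. The function name and the data layout are hypothetical.

```python
def effective_name_language(object_tags, enclosing_boundaries):
    """Guess the language of an object's bare name=* tag.

    `enclosing_boundaries` is a list of tag dicts for the areas containing
    the object, ordered from innermost to outermost. Finding those areas
    (the geometry part) is outside the scope of this sketch.
    """
    # A tag on the object itself would win, if mappers chose to set one.
    if "lang" in object_tags:
        return object_tags["lang"]
    for boundary_tags in enclosing_boundaries:
        if "lang" in boundary_tags:
            return boundary_tags["lang"]
    return None  # nothing known about the language of name=*


street = {"highway": "residential", "name": "Dlouhá"}
boundaries = [
    {"boundary": "administrative", "admin_level": "8", "name": "Praha"},
    {"boundary": "administrative", "admin_level": "2", "lang": "cs"},
]

print(effective_name_language(street, boundaries))
```

The street carries only name=*, exactly as it would today; the language comes from the country-level boundary.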

Tobias


Re: RFC: Names localization

"Petr Morávek [Xificurk]"
Tobias Knerr wrote:
> On 02.08.2012 12:56, Miloš Komarčević wrote:
>> name=* without any context of what language is recorded in it is one of the
>> biggest fallacies of OSM i18n and needs to be addressed.
>
> You need to realize, though, that mappers in areas where only one
> language is commonly used will not want to put more effort into mapping
> names than they do today. And rightly so, imo - from their perspective,
> it's just more work for little or no gain.

Yes, I agree. This is a very strong argument for Option 1 and I'm starting
to lean towards this solution.
I won't touch the page for a week or so, but unless someone comes up with
strong counter-arguments (or a completely different, better proposal), I
think we should refine Option 1 and build on top of this basic idea.

Best regards,
Petr Morávek aka Xificurk



Re: RFC: Names localization

kmilos
In reply to this post by Tordanik
Tobias Knerr <osm@...> writes:
> On 02.08.2012 12:56, Miloš Komarčević wrote:
> > name=* without any context of what language is recorded in it is one of the
> > biggest fallacies of OSM i18n and needs to be addressed.
>
> You need to realize, though, that mappers in areas where only one
> language is commonly used will not want to put more effort into mapping
> names than they do today. And rightly so, imo - from their perspective,
> it's just more work for little or no gain.

Sure. I was just stating the root of the problem, probably brought on by
architects with little i18n experience who probably assumed only one
language/script is used in an area (or what they thought of as most areas). It
might have made sense 'from their perspective', but they created a bit of a mess
for a lot of upcoming and very large and populous multicultural areas (take
India for example), not to mention smaller ones all over the world. Saying it
was for no gain is a bit short-sighted and selfish, no?
 
> Thus, there is a fundamental requirement for any future tagging scheme
> for names: In areas with a single main language, _one_ tag needs to be
> enough for a name in that language.

Agreed.

> Preferably, the key for this case should remain "name".

I don't see a problem with mandating name:xx even when only one language is
used, for added clarity, and having a bot fix up existing ones. It does break
backwards compatibility though, so it's too late to fix at this point.
 
> Setting some additional tags at the boundary of that area for clarity
> what "name" means there is fine, but there must not be any additional
> effort for setting names on the individual objects.

Totally agree, and it is probably the best way: it fixes the problem, keeps
backward compatibility and has the least impact on mappers.

M



Re: RFC: Names localization

Tordanik
In reply to this post by "Petr Morávek [Xificurk]"
"Petr Morávek [Xificurk]" wrote:
> Tobias Knerr wrote:
>> You need to realize, though, that mappers in areas where only one
>> language is commonly used will not want to put more effort into mapping
>> names than they do today. And rightly so, imo - from their perspective,
>> it's just more work for little or no gain.
>
> Yes, I agree. This is very strong argument for Option 1 and I'm starting
> to lean towards this solution.

Have you considered combining the options?

For example you could use option 2 with a single additional rule: If
lang contains only one language code, treat name as name:<lang_value>.

So if there is only one main language, lang will contain the code for
that language, and name will contain the name in that language.

But in multilingual areas, lang contains the codes for all these
languages as per option 2, and once mappers in those areas trust data
consumers to construct the labels from several name:xx reliably, they
can begin omitting the bare name tag.
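
The combined rule could be sketched like this. It is only an illustration of the idea: the semicolon-separated lang value, the " / " separator in labels, and the function names are assumptions, and whether lang sits on the object or on an enclosing boundary is left open here.

```python
def names_by_language(tags):
    """Collect name:xx values, applying the single-language rule for name=*."""
    codes = [c.strip() for c in tags.get("lang", "").split(";") if c.strip()]
    names = {
        key[len("name:"):]: value
        for key, value in tags.items()
        if key.startswith("name:")
    }
    # Additional rule: exactly one language code means that the bare name
    # is treated as name:<that code> (unless name:<code> is tagged already).
    if len(codes) == 1 and "name" in tags:
        names.setdefault(codes[0], tags["name"])
    return codes, names


def label(tags, separator=" / "):
    """Build a label from the languages listed in lang, in that order."""
    codes, names = names_by_language(tags)
    parts = [names[code] for code in codes if code in names]
    if parts:
        return separator.join(parts)
    return tags.get("name")  # no usable lang information


monolingual = {"name": "Praha", "lang": "cs"}
multilingual = {"lang": "fr;nl", "name:fr": "Bruxelles", "name:nl": "Brussel"}

print(names_by_language(monolingual)[1])  # bare name filed under "cs"
print(label(multilingual))                # label built without a bare name tag
```

In the monolingual case mappers keep tagging only name=*; in the multilingual case the label is constructed from the name:xx tags, so the bare name can be omitted.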

Tobias


Re: RFC: Names localization

Philip Barnes
In reply to this post by kmilos
On Thu, 2012-08-02 at 11:42 +0000, Miloš Komarčević wrote:

> Tobias Knerr <osm@...> writes:
> > On 02.08.2012 12:56, Miloš Komarčević wrote:
> > > name=* without any context of what language is recorded in it is one of the
> > > biggest fallacies of OSM i18n and needs to be addressed.
> >
> > You need to realize, though, that mappers in areas where only one
> > language is commonly used will not want to put more effort into mapping
> > names than they do today. And rightly so, imo - from their perspective,
> > it's just more work for little or no gain.
>
> Sure. Was just stating the root of the problem, probably brought on by
> architects with little i18n experience who probably assumed only one
> language/script is used in an area (or what they thought of as most areas). It
> might have made sense 'from their perspective', but they created a bit of mess
> for a lot of upcoming and very large and populous multicultural areas (take
> India for example), not to mention smaller ones all over the world. Saying it
> was for no gain is a bit short-sighted and selfish, no?
>  
> > Thus, there is a fundamental requirement for any future tagging scheme
> > for names: In areas with a single main language, _one_ tag needs to be
> > enough for a name in that language.
>
> Agreed.
>
> > Preferably, the key for this case should remain "name".
>
> I don't see a problem of mandating name:xx even when only one language is used
> for added clarity, and have a bot fix up existing ones. Does break backwards
> compatibility though, so too late to fix at this point.

IMO, where there is only one name on a sign, the bare name tag should remain
valid.

Please no bots on this, as a cross-border mapper I can only see this
ending in tears.

In Wales, some roads are named in Welsh, some in English. I see no problem
with that: if there is one name, then that should remain the name. A bot
really can't be applied here; it first of all has to decide which
language a name is in, and then get the tag right. How would it deal
with mixed names, such as Llangollen Road?

Welsh place names also drift over the border into Shropshire, so where
should the border be drawn?

Where multiple names exist on signs, larger towns and cities, mappers
have already tagged both names, as in name:en name:cy. If there is only
one name on the sign, then there should only be one name on the map.
This should then be tagged as name, as language cannot be assumed.

Even place names in England are not always English; there is a mix of
Anglo-Saxon, Danish and Norman in there too.

Phil



Re: RFC: Names localization

LM_1
In reply to this post by Tordanik
Let's not forget that this debate was started by naming disputes in Ukraine.
I would vote for option 2 myself, but if that turns out to be
impossible, I could agree with Tobias.
LM

2012/8/2 Tobias Knerr <[hidden email]>:

> "Petr Morávek [Xificurk]" wrote:
>> Tobias Knerr wrote:
>>> You need to realize, though, that mappers in areas where only one
>>> language is commonly used will not want to put more effort into mapping
>>> names than they do today. And rightly so, imo - from their perspective,
>>> it's just more work for little or no gain.
>>
>> Yes, I agree. This is very strong argument for Option 1 and I'm starting
>> to lean towards this solution.
>
> Have you considered combining the options?
>
> For example you could use option 2 with a single additional rule: If
> lang contains only one language code, treat name as name:<lang_value>.
>
> So if there is only one main language, lang will contain the code for
> that language, and name will contain the name in that language.
>
> But in multilingual areas, lang contains the codes for all these
> languages as per option 2, and once mappers in those areas trust data
> consumers to construct the labels from several name:xx reliably, they
> can begin omitting the bare name tag.
>
> Tobias
>

Re: RFC: Names localization

kmilos
In reply to this post by Philip Barnes
On Thu, Aug 2, 2012 at 1:23 PM, Philip Barnes <[hidden email]> wrote:
>
> In Wales, some roads are named in Welsh, some English. I see no problem
> in that, if there is one name then that should remain the name. A bot
> really can't be applied here, it first of all has to decide which
> language a name is in, and then get the tag right. how would it deal
> with mixed names? Such as Llangollen Road.
>

Perfectly illustrates the false assumption name=* was conceived on.

Now there's no way to know which is which and how to clean up and
separate the data. I'm not saying you need to in this case, but someone
might want to in the future, or in a different area where, e.g., the two
languages in question are written in different scripts.

M
