[Talk-in] Automating OSM translation into Indic languages
srik.lak at gmail.com
Wed Apr 8 02:49:08 UTC 2015
I modified the script a little and got a list of places with their translated
names. Gist and csv data files. Change the bbox / query params to get
Observations on the data:-
1. Wikidata querying doesn't honour redirects, so Bengaluru, Mysuru, et al.
don't get results from Wikidata. We probably need to use the Wikipedia API to
check whether the title is a redirect, follow it to the target page, and get
the translated name from there. I was too lazy to do this in the first pass.
2. Need for manual verification:
A. Place names can coincide with other words (verbs, for instance) in English,
so we might have got the translation of that word from Wikidata instead.
Ex: kama,en | काम,hi. Or the result might contain extra disambiguation terms
that are not required on the map. Ex: Anuradhapura,ml,അനുരാധപുരം (നഗരം)
B. A place might be known only for one thing, for which the Wikipedia article
was created and interwiki linked. This is actually Wikipedia's problem, but we
inherit it since we use their data. Ex: Kalasa,en |
Kalasa,kn,Kalgundi Sri Marulasidheshwara swami temple.
C. OSM data itself contains names in non-Latin script in name tags. I didn't
see many for Indian towns / cities, but it is the case for many Bangladesh
/ Nepal towns. Is there a discussion about which language should be used in
the name tag? Needless to say, this script cannot get Indic names for those,
as English wiki will not have pages titled in non-Latin script.
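To make observations 1 and 2 a bit more concrete, here is a rough sketch in Python. It is not the script from the gist. It assumes the standard MediaWiki query API on English Wikipedia with its `redirects` parameter, and all the function names and flag strings are made up for illustration. The first half resolves a title through a redirect given the decoded JSON response; the second half flags rows where the OSM name is not in Latin script or where the label ends in a bracketed disambiguation term.

```python
import re
import unicodedata
from urllib.parse import urlencode

# Assumption: the standard MediaWiki query endpoint on English Wikipedia.
WIKIPEDIA_API = "https://en.wikipedia.org/w/api.php"


def redirect_query_url(title):
    """Build a query URL that asks the API to follow redirects for `title`."""
    params = {"action": "query", "titles": title, "redirects": 1, "format": "json"}
    return WIKIPEDIA_API + "?" + urlencode(params)


def resolve_title(title, response):
    """Return the title that `title` resolves to in a decoded API response.

    Title normalisation and redirects are both reported as lists of
    {"from": ..., "to": ...} entries under "query".
    """
    query = response.get("query", {})
    for key in ("normalized", "redirects"):
        for step in query.get(key, []):
            if step["from"] == title:
                title = step["to"]
    return title


# Trailing bracketed qualifier, e.g. " (നഗരം)" in "അനുരാധപുരം (നഗരം)".
_DISAMBIGUATION = re.compile(r"\s*\([^()]*\)\s*$")


def strip_disambiguation(label):
    """Drop one trailing bracketed qualifier from a translated label."""
    return _DISAMBIGUATION.sub("", label)


def is_latin(text):
    """True if every alphabetic character in `text` is a Latin letter."""
    return all(
        unicodedata.name(ch, "").startswith("LATIN")
        for ch in text
        if ch.isalpha()
    )


def review_flags(osm_name, label):
    """Return hypothetical flag strings for rows that need a human look."""
    flags = []
    if not is_latin(osm_name):
        flags.append("osm-name-not-latin")
    if strip_disambiguation(label) != label:
        flags.append("disambiguation-suffix")
    return flags
```

The resolved title would then be used for the Wikidata lookup instead of the original name. Cases like kama (2A) and Kalasa (2B) cannot be caught by checks like these and would still need someone to eyeball them.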
I agree with Sajjad that this is going to be a tedious manual task (~6000
strings to look up) and we need a good web interface. I am looking at
crowdcrafting / pybossa. But if a custom webapp can be built that uploads
changes directly to OSM, nothing like it.
On Tue, Apr 7, 2015 at 11:32 AM, Sajjad Anwar <me at sajjad.in> wrote:
> This is great.
> Aruna, we can use the Wikidata to get a first pass of the translation and
> then present the spreadsheet view for someone to eyeball and add missing?
> On Tue, Apr 7, 2015 at 10:59 AM, Aruna S <safincrazy at gmail.com> wrote:
>> On Mon, Apr 6, 2015 at 2:40 PM, Srikanth Lakshmanan <srik.lak at gmail.com>
>>> Great work, I have been thinking this for sometime. I am of the opinion
>>> that place names(towns / villages etc) should be translated and not
>>> transliterated. Arun has a point about locality address as people might be
>>> so used to English, that they find translations in their own language
>>> For place names, would it be a good idea to run a script which can look
>>> up wikidata, extract names in multiple language and update OSM? Below is a
>>> sample query for 'Bangalore' in multiple languages.
>> This seems like a wonderful idea. I'll use this while working on the
>> translation. Thanks. :)
>> Talk-in mailing list
>> Talk-in at openstreetmap.org
> Sajjad Anwar http://geohacker.in <http://sajjad.in/>