[OSM-talk] Proposed mechanical edit to convert alt_name tags
andrew.r.buck at gmail.com
Mon Sep 8 18:52:17 UTC 2014
Hello everyone. Recently HOT has been responding to the Ebola
outbreak in western Africa (specifically Guinea, Liberia, and Sierra
Leone). As part of this response we have converted the GNS name
files containing populated place names from their original CSV
format to OSM format using a Python script. The
files have then been broken into tiles corresponding to task manager
tasks and are being manually checked against imagery, topo maps, and
existing OSM data before being finally merged into OSM one by one.
The task manager jobs are being carried out by a hand-picked team of
people using private task manager jobs so that the work is done
carefully and no one just "blindly" dumps a load of data in without
first checking it.
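
For illustration only, the CSV-to-OSM step could look roughly like the
sketch below. It is not the script HOT actually used. The column names
(place_id, lat, lon, name) are hypothetical and are not the real GNS
headers, and treating the first name listed for a place as the primary
name is an assumption. Extra names become alt_name, alt_name_2,
alt_name_3, and so on.

```python
import csv
import io
import xml.etree.ElementTree as ET


def rows_to_osm(csv_text):
    """Group CSV rows by place and emit one OSM node per place.

    Assumed (hypothetical) columns: place_id, lat, lon, name.
    """
    places = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        place = places.setdefault(
            row["place_id"],
            {"lat": row["lat"], "lon": row["lon"], "names": []},
        )
        # Skip exact duplicate names for the same place.
        if row["name"] not in place["names"]:
            place["names"].append(row["name"])

    root = ET.Element("osm", version="0.6", generator="gns-sketch")
    for index, place in enumerate(places.values(), start=1):
        # Negative ids mark objects that do not exist on the server yet.
        node = ET.SubElement(
            root, "node", id=str(-index), lat=place["lat"], lon=place["lon"]
        )
        for position, name in enumerate(place["names"]):
            if position == 0:
                key = "name"
            elif position == 1:
                key = "alt_name"
            else:
                key = "alt_name_%d" % position
            ET.SubElement(node, "tag", k=key, v=name)
    return ET.tostring(root, encoding="unicode")
```

The resulting file would still need the manual checks described above
before anything is merged.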

However, because the conversion scripts have changed over time and
some work was done manually by others, we ended up using two slightly
different forms for the alt_name key where a place has more than one
alternative name. Some objects use the form alt_name_2 and some use
alt_name:2 (with the 2 extending to higher numbers as well; some of
these places have as many as 6 or 7 alt_name tags).

After consultation with twain47 regarding which format would be
preferred for Nominatim, it was decided that alt_name_2 is the
preferred format (the :2 format gets confused with the language
variants, so the underscore version is preferable). Twain47 has
already written a patch for Nominatim to make this work, and the pull
request and push to the live server are pending. Once the new code is
in place and shown to be working, my plan is to do a mechanical edit
to convert all of the existing alt_name:N tags to their corresponding
alt_name_N equivalents. There are also a few similar variants, such as
alt_name2, which would be converted at the same time. All in all, it
looks like about 500 objects worldwide are tagged using one form or
another of the deprecated formats.
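
As a rough sketch of the key renaming itself (an illustration, not a
finished bot), the rule could be written in Python as below. The
function names are made up for this example. Only purely numeric
suffixes are touched, so language variants such as alt_name:fr are left
alone. Raising an error when an object already has a different value
under the target key is my assumption about how conflicts should be
handled, so that a human looks at those cases.

```python
import re

# alt_name:2, alt_name2 and alt_name_2 all carry a purely numeric suffix.
# alt_name:fr (a language variant) and plain alt_name do not match.
_NUMBERED_ALT_NAME = re.compile(r"alt_name[:_]?(\d+)")


def normalize_alt_name_key(key):
    """Return the alt_name_N form of a numbered alt_name key."""
    match = _NUMBERED_ALT_NAME.fullmatch(key)
    if match is None:
        return key
    return "alt_name_" + match.group(1)


def normalize_tags(tags):
    """Return a copy of a tag dict with numbered alt_name keys renamed."""
    result = {}
    for key, value in tags.items():
        new_key = normalize_alt_name_key(key)
        if new_key in result and result[new_key] != value:
            # Two different values would land on the same key:
            # leave this object for manual review.
            raise ValueError("conflicting values for %s" % new_key)
        result[new_key] = value
    return result
```

For example, normalize_tags({"alt_name": "A", "alt_name:2": "B"})
gives {"alt_name": "A", "alt_name_2": "B"}.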

I can either limit this conversion to the areas where HOT has been
doing this kind of work or extend it worldwide. My suspicion is that
once the tags in the HOT areas have been converted, only a few will
remain elsewhere, so it probably makes sense to do them all worldwide.

Are there any objections, or suggestions on how to do this better?
And finally, if it is decided to carry out this action, is there
someone who would be willing to carry out the change themselves with
a mechanical edit account or bot, or should I do it myself in JOSM?

Thank you for your consideration in this matter. I would ask that
we try to expedite the discussion (assuming no major objections are
raised), as there are external organizations using our data to
respond to the Ebola epidemic, and the quicker this can be sorted out
the less painful it will be for them.