<div dir="ltr"><div><div>Hi Andrew,<br><br>I agree that pairing the OSM IDs will be the simplest and most robust solution.<br>Regarding the pcodes methodology, it seems an old method (used, e.g., in Haiti) has been abandoned (or you did not mention it), which is really fortunate: concatenating admin-level IDs (each admin level being assigned an integer between 1 and x). It seems practical at first glance, but it is a real nightmare because admin boundaries can change. Basically, this system created unique IDs, but not for unique objects. And in Haiti, where the admin levels had been reorganized by the government a few years before the earthquake, GIS people spent more time fighting the pcodes than using them. <br><br></div>Sincerely,<br><br></div>Severin<br><div><div><div><br><br><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Date: Wed, 24 Sep 2014 12:54:12 -0500<br>
From: Andrew Buck <<a href="mailto:andrew.r.buck@gmail.com">andrew.r.buck@gmail.com</a>><br>
To: "Imports OpenStreetMap.org" <<a href="mailto:imports@openstreetmap.org">imports@openstreetmap.org</a>><br>
Subject: [Imports] Adding pcodes to villages in Liberia, Sierra Leone<br>
and Guinea<br>
Message-ID: <<a href="mailto:54230544.6040007@gmail.com">54230544.6040007@gmail.com</a>><br>
Content-Type: text/plain; charset=ISO-8859-1<br>
<br>
-----BEGIN PGP SIGNED MESSAGE-----<br>
Hash: SHA1<br>
<br>
Hello everyone. As you are probably aware HOT has been working to map<br>
the areas affected by Ebola in west Africa and to help humanitarian<br>
organizations better use OSM data in their efforts there. Because OSM<br>
has the best dataset of settlements (towns, villages, etc) in the area<br>
several prominent groups have chosen to standardize on OSM being their<br>
official source for place names and locations.<br>
<br>
Because many place names in Africa (and elsewhere) have multiple<br>
spellings (the local languages often do not use the Latin alphabet),<br>
it is common for these organizations to establish a standardized set<br>
of place codes (pcodes for short) which are used to refer to places<br>
in datasets and communications. The pcodes work<br>
similarly to zip codes in the US or postal codes more generally<br>
elsewhere in the world. The list of pcodes is generally held by the<br>
UN and is used by almost every large humanitarian organization to<br>
communicate place information as the numerical codes prevent confusion<br>
due to multiple places with the same name, etc (just like postal codes<br>
are used).<br>
<br>
For the three countries in question, no pcodes had yet been generated<br>
so it was decided that the way they would be created was that the OSM<br>
dataset would be exported, a unique pcode generated for each village<br>
in the dataset, and then that would be adopted by these organizations<br>
as the official pcode for that location. This was done over the past<br>
week and we now have the results in a csv file with columns containing<br>
the OSM id of the place, the version of the place at export, lat/lon,<br>
and then all of the associated name tags and finally a column for<br>
pcode which was filled in by the people generating them.<br>
<br>
Since these newly generated pcodes are now the official pcodes for<br>
these places, we plan to import them back into OSM onto each place in<br>
the pcode=* tag. This allows the dataset to be much more easily used<br>
in the future, as well as allowing us to re-export and generate pcodes<br>
for newly added villages that do not already have them (since OSM is<br>
always growing). The exact format of pcodes varies from country to<br>
country. In some countries the pcode is chosen to be identical to the<br>
already existing postal codes. For these three countries, since there<br>
was no existing system, the codes are formatted using the three-letter<br>
country code, a 2-digit number indicating the significance of the<br>
place, and then a running 5-digit number counting up from 1 to<br>
identify the specific place, so for example the capital of Liberia,<br>
Monrovia, has the code LBR0400001.<br>
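The code format described above can be sketched as a small helper (a sketch only; the function name is my own, and only the LBR0400001 example comes from the message):

```python
def make_pcode(country: str, significance: int, serial: int) -> str:
    """Build a pcode: ISO-3 country code + 2-digit significance
    class + 5-digit running serial counting up from 1."""
    return f"{country}{significance:02d}{serial:05d}"

# The quoted example: Monrovia, capital of Liberia.
assert make_pcode("LBR", 4, 1) == "LBR0400001"
```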
<br>
Since the codes were newly generated using OSM as their source, the<br>
import would be very easy to do, and there is zero risk of the data<br>
being "wrong" (it was generated exactly this way, so it is by<br>
definition correct at this point). There is also no licensing concern,<br>
as the data in the file is OSM data to begin with, with the exception<br>
of the pcode itself, and we have explicit permission to use that,<br>
since this was the plan from the start.<br>
<br>
Given that we have both the id/version pair in the dataset, as well as<br>
the lat/lon there are a couple ways the re-import of the pcodes could<br>
be accomplished. The easiest would be to use josm and the conflation<br>
plugin, with a very small match distance (like 5 meters or something)<br>
and then just manually conflate the handful of nodes that may have<br>
moved in the last week. The other possibility (which is more robust)<br>
is to write a simple script that adds the tags to the objects based on<br>
the id/version pair and then generates a list that must be manually<br>
processed for any objects which have a newer version than that at the<br>
time of export. I think either method will work, but I do think the<br>
script is a bit more robust, and we already have someone interested in<br>
working on the script to make it happen.<br>
<br>
Let me know if you foresee any potential problems with either of these<br>
two methods and how you think they could be addressed.<br></blockquote></div><br></div></div></div></div></div>