<div dir="ltr"><div><div>Hi Andrew,<br><br>I agree that pairing the OSM IDs will be the simplest and most robust solution.<br>Regarding the pcodes methodology, it seems an old method (used, e.g., in Haiti) has been abandoned (or you did not mention it), which is really fortunate: concatenating admin-level IDs (each admin level being assigned an integer between 1 and x). It seems practical at first glance, but it is a real nightmare because admin boundaries can change. Basically, this system created unique IDs, but not for unique objects. And in Haiti, where the admin levels had been reorganized by the government a few years before the earthquake, GIS people spent more time fighting the pcodes than using them. <br><br></div>Sincerely,<br><br></div>Severin<br><div><div><div><br><br><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Date: Wed, 24 Sep 2014 12:54:12 -0500<br>
From: Andrew Buck <<a href="mailto:andrew.r.buck@gmail.com">andrew.r.buck@gmail.com</a>><br>
To: "Imports OpenStreetMap.org" <<a href="mailto:imports@openstreetmap.org">imports@openstreetmap.org</a>><br>
Subject: [Imports] Adding pcodes to villages in Liberia, Sierra Leone<br>
and Guinea<br>
Message-ID: <<a href="mailto:54230544.6040007@gmail.com">54230544.6040007@gmail.com</a>><br>
Content-Type: text/plain; charset=ISO-8859-1<br>
<br>
-----BEGIN PGP SIGNED MESSAGE-----<br>
Hash: SHA1<br>
<br>
Hello everyone. As you are probably aware HOT has been working to map<br>
the areas affected by Ebola in west Africa and to help humanitarian<br>
organizations better use OSM data in their efforts there. Because OSM<br>
has the best dataset of settlements (towns, villages, etc) in the area<br>
several prominent groups have chosen to standardize on OSM being their<br>
official source for place names and locations.<br>
<br>
Because many place names in Africa (and elsewhere) have multiple<br>
spellings (the local languages often do not use the Latin alphabet),<br>
it is common for these organizations to establish a standardized set<br>
of place codes (pcodes for short) which are used to refer to places<br>
in datasets and communications. The pcodes work<br>
similarly to zip codes in the US or postal codes more generally<br>
elsewhere in the world. The list of pcodes is generally held by the<br>
UN and is used by almost every large humanitarian organization to<br>
communicate place information as the numerical codes prevent confusion<br>
due to multiple places with the same name, etc (just like postal codes<br>
are used).<br>
<br>
For the three countries in question, no pcodes had yet been generated<br>
so it was decided that the way they would be created was that the OSM<br>
dataset would be exported, a unique pcode generated for each village<br>
in the dataset, and then that would be adopted by these organizations<br>
as the official pcode for that location. This was done over the past<br>
week and we now have the results in a csv file with columns containing<br>
the OSM id of the place, the version of the place at export, lat/lon,<br>
and then all of the associated name tags and finally a column for<br>
pcode which was filled in by the people generating them.<br>
<br>
Since these newly generated pcodes are now the official pcodes for<br>
these places, we plan to import them back into OSM onto each place in<br>
the pcode=* tag. This allows the dataset to be much more easily used<br>
in the future, as well as allowing us to re-export and generate pcodes<br>
for newly added villages that do not already have them (since OSM is<br>
always growing). The exact format of pcodes varies from country to<br>
country. In some countries the pcode is chosen to be identical to the<br>
already existing postal codes. For these three countries, since there<br>
was no existing system, the codes are formatted using the three-letter<br>
country code, a 2-digit number indicating the significance of the<br>
place, and then a running 5-digit number counting up from 1 to<br>
identify the specific place, so for example the capital of Liberia,<br>
Monrovia, has the code LBR0400001.<br>
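The code format described above can be sketched as a small helper (a sketch only; the function name is my own, and only the LBR0400001 example comes from the message):

```python
def make_pcode(country: str, significance: int, serial: int) -> str:
    """Build a pcode: ISO-3 country code + 2-digit significance
    class + 5-digit running serial counting up from 1."""
    return f"{country}{significance:02d}{serial:05d}"

# The quoted example: Monrovia, capital of Liberia.
assert make_pcode("LBR", 4, 1) == "LBR0400001"
```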
<br>
Since the codes were newly generated using OSM as their source, the<br>
import would be very easy to do, and there is zero risk of the data<br>
being "wrong" (it was generated exactly this way, so it is by<br>
definition correct at this point). There is also no licensing concern,<br>
as the data in the file is OSM data to begin with, with the exception<br>
of the pcode itself, and we have explicit permission to use that,<br>
since this was the plan from the start.<br>
<br>
Given that we have both the id/version pair in the dataset, as well as<br>
the lat/lon there are a couple ways the re-import of the pcodes could<br>
be accomplished. The easiest would be to use josm and the conflation<br>
plugin, with a very small match distance (like 5 meters or something)<br>
and then just manually conflate the handful of nodes that may have<br>
moved in the last week. The other possibility (which is more robust)<br>
is to write a simple script that adds the tags to the objects based on<br>
the id/version pair and then generates a list that must be manually<br>
processed for any objects which have a newer version than that at the<br>
time of export. I think either method will work, but I do think the<br>
script is a bit more robust, and we already have someone interested in<br>
working on the script to make it happen.<br>
<br>
Let me know if you foresee any potential problems with either of these<br>
two methods and how you think they could be addressed.<br></blockquote></div><br></div></div></div></div></div>