[OSM-talk-be] OSM and SIAMU

Wed Nov 22 20:09:51 UTC 2017

Hi Nadia,

Nice to see you here!

I've played with the idea of unique identifiers for OSM objects myself
before. But it remains controversial in the international community (not so
much in Belgium). Here's an article I wrote a long time ago about it [1].
It's especially useful for the comments, which outline some of the problems
with my idea.
Also relevant for getting a feel for the issues is the discussion of a
proposal for a global reviews database [2]. Possibilities for linking were
investigated, and adding external IDs met with quite a bit of headwind.

There has been a discussion about Wikidata recently that grew so big that
I couldn't follow it at all. But at least until recently, there seemed to be
an openness towards adding Wikidata unique IDs. I don't know enough about
it to have a real opinion, but to me it sounds elegant to translate an
official source of street names into Wikidata objects, and then add that
identifier to OSM. Maybe those more versed in Wikidata can explain.

That said, I'm not sure your proposed solution is the simplest solution
to the problem. Given that street names are assigned by the government, in
theory there is one and only one possible way of writing the name. In
Flanders, that would be the CRAB name. In the very few cases where CRAB is
still wrong (or more to the point: the sign in the street says something
slightly different than what CRAB says), you could have name="Name on the
Street Sign" and something like name_official="Name in CRAB". In that
situation, the problem is different: how do you make sure all the street
names are and stay correct in OSM? By coincidence, we are actually working
towards doing something like that. In the scope of the Road Completion
project [3] we want to start "attribute/tag comparison" real soon. Glenn has
also built something that is even closer to being in production, where
we look for "close to this official road, there is no OSM road with the
same exact name".
Similar but different: we developed a website during the last Open Summer of
Code, where official cycling network data is compared to OSM data all the
time [4]. That way we can make sure our Brussels cycling network is always
at least as correct as the official data.
It's only a few more steps (not easy ones, I know) until we can work this
out further. Any difference in street names should then be fixed quite
quickly. I'd rather see you guys helping out in this effort, than starting
a cumbersome import.
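
To make the "close to this official road, there is no OSM road with the
same exact name" check a bit more concrete, here is a minimal Python sketch.
It is not Glenn's tool or the Road Completion code; the function names, the
50 m radius and the idea of representing each road by a single point are all
assumptions made for illustration.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def official_names_missing_in_osm(official, osm, radius_m=50.0):
    """Return official names with no identically named OSM road nearby.

    Both inputs are lists of (name, lat, lon) tuples; one representative
    point per road is a simplification (assumption), real data would use
    full geometries.
    """
    missing = []
    for name, lat, lon in official:
        found = any(
            osm_name == name and haversine_m(lat, lon, osm_lat, osm_lon) <= radius_m
            for osm_name, osm_lat, osm_lon in osm
        )
        if not found:
            missing.append(name)
    return missing

# Hypothetical sample data, not taken from CRAB or OSM.
official = [("Wetstraat", 50.8450, 4.3700), ("Kerkstraat", 50.8500, 4.3500)]
osm = [("Wetstraat", 50.8451, 4.3701), ("Kerkstr.", 50.8500, 4.3500)]
print(official_names_missing_in_osm(official, osm))
```

With the sample data, only the road whose OSM name differs from the
official spelling is reported, which is exactly the kind of difference that
would then need fixing by a mapper.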

As far as I know, those codes are only open data in Flanders (accidentally
through CRAB open data). One of the few rules about "what to map" is that
it should be verifiable (preferably by anyone, in the field). There are a
few exceptions, but they are rather rare. As long as the National Registry
codes are not open data, that sounds like a real problem to me. In fact,
there is no way you can import data into OSM that is not open. Because then
we would have to re-license OSM with the license of the National Registry :)

One more thing is that using this ID will give you false certainty. You
will get your results, most of the time. But someone might have corrected a
segment (it used to have the name A, but it really is street B), and they
will not know what to do with this strange ref number. So even after a
successful import, you would still need something like the constant
comparison described above to check if the streetname is still what the
unique identifier assumes it should be.

Ben and I have also spent a lot of time thinking about this problem in
general terms: "how do you keep external data synchronized to OSM". In the
case of roads it shouldn't actually be that hard. Say you start off with a
table joining the two datasets together based on the object IDs. You then
need to monitor how both datasets evolve. On the OSM side, you only have to
keep analysing segments that have changed a lot (say, the average
coordinate is too far away; the total length changed too much) or have
disappeared. Then you can have a process that finds if an object that is
similar enough is still mapped in the same place. Only when a certain
threshold is reached is there a need for manual intervention to check what
is going on.
While this sounds complicated, I do think someone experienced in the field
could build a model in a couple of days. I think the end result would
actually be more dependable than your idea, and probably less work to
implement. I've built something solving a similar problem in FME in not too
much time (a professional FME worker then re-built it in two days). Seppe
suggested that in the case of road data, a tool like OpenLR [5] might
actually already solve this problem. And Glenn seems to think this is quite
straightforward using PostGIS.
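
As a rough illustration of the monitoring step described above (a join
table, then flagging OSM segments whose average coordinate or total length
changed too much, or that disappeared), here is a minimal Python sketch. It
is not the FME model, OpenLR or a PostGIS query; the function names, the
thresholds and the assumption that coordinates are in a projected CRS with
metre units are all mine. The follow-up step of searching for a similar
object in the same place is left out.

```python
import math

def mean_coordinate(points):
    """Average (x, y) of a segment's vertices."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def total_length(points):
    """Length of a polyline, summing straight-line distances between vertices."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def segments_to_review(links, old, new, max_shift_m=25.0, max_length_ratio=0.2):
    """Flag linked segments that changed a lot between two OSM snapshots.

    links: {official_id: osm_id}, the join table between the two datasets.
    old, new: {osm_id: [(x, y), ...]} snapshots in metres (assumption).
    Thresholds are illustrative guesses, not tuned values.
    """
    flagged = {}
    for official_id, osm_id in links.items():
        before = old[osm_id]
        after = new.get(osm_id)
        if after is None:
            flagged[official_id] = "disappeared"
            continue
        shift = math.dist(mean_coordinate(before), mean_coordinate(after))
        old_len, new_len = total_length(before), total_length(after)
        if old_len > 0:
            length_change = abs(new_len - old_len) / old_len
        else:
            length_change = 0.0 if new_len == 0 else float("inf")
        if shift > max_shift_m:
            flagged[official_id] = "moved"
        elif length_change > max_length_ratio:
            flagged[official_id] = "length changed"
    return flagged

# Hypothetical example: four linked segments, three of which need a look.
links = {"A": 1, "B": 2, "C": 3, "D": 4}
old = {i: [(0, 0), (100, 0)] for i in (1, 2, 3, 4)}
new = {1: [(0, 1), (100, 1)], 2: [(0, 200), (100, 200)], 3: [(20, 0), (80, 0)]}
print(segments_to_review(links, old, new))
```

Only the flagged segments would go to the "is something similar enough
still mapped here" process, and from there to manual review when that fails.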

Just out of curiosity: what kind of information do you have that is valid
at the level of a streetname?

1: http://www.openstreetmap.org/user/joost%20schouppe/diary/34328
2: https://lists.openstreetmap.org/pipermail/talk/2016-August/076498.html
3: http://www.osm.be/2017/01/06/en-project-road-completion.html
4: https://cyclenetworks.osm.be/brumob/
5: http://www.openlr.info/