[OSM-talk] Worldwide non-surveyed tag edits

Roland Olbricht roland.olbricht at gmx.de
Wed Jun 11 21:28:06 UTC 2014


Dear John,

I'm glad that you have joined the discussion. The OpenStreetMap community
has reached some points of consensus that look like outright nonsense from
a computer scientist's or programmer's usual point of view. So it is
helpful to explain every now and then what counts as common sense here,
and to check whether those decisions are still valid.

> Consistent data is useful and typos and mistakes are common place.
> Unifying these so they are machine readable so they are useful is, in
> fact, useful.

Just some examples:

We have streets with house numbers 3, 5, 9, 7, 11, 13.
Is that an obvious mistake? No, it's on purpose, because the house numbers
sometimes are in that order on the ground.

In Germany we have cities with a street named "Cäcilienstraße" and
others with a street named "Cecilienstraße" (both with exactly the same
pronunciation, and both variants of the same surname).

The literal translation of "connecting way" into German is
"Verbindungsweg". This is also the official name of a living street in
Siegburg, Germany.

By contrast, these roads are, for good reason, not connected in the database:
http://blog.openstreetmap.de/blog/2013/05/wochennotiz-nr-147/

There was an automatic bot changing road names ending in "...strasse" to
"...straße" (both mean "... street" in German; the second is the standard
spelling). This failed both in Switzerland (where "...strasse" is the
authorized spelling) and on the name "Gleistrasse", which means "railway
track right of way" and only coincidentally contains the substring
"strasse".
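
To make that failure concrete, here is a minimal sketch of what such a
rule might look like. The function name naive_fix and the sample names
other than "Gleistrasse" are only assumptions for illustration; this is
not the code of the actual bot.

```python
import re


def naive_fix(name: str) -> str:
    """Replace a trailing 'strasse' with 'straße', with no further checks.

    Hypothetical rule for illustration only.
    """
    return re.sub(r"strasse$", "straße", name)


# A German street name: the rule does what was intended.
fixed_ok = naive_fix("Hauptstrasse")

# "Gleis" + "Trasse": not a street name at all, yet the rule still matches
# because the letters "strasse" happen to appear at the end.
fixed_wrong = naive_fix("Gleistrasse")

# A Swiss street name: "strasse" is the authorized spelling there, so the
# rule turns a correct name into a wrong one.
fixed_swiss = naive_fix("Bahnhofstrasse")
```

The rule cannot tell these three cases apart from the string alone. It
would need knowledge about the country and about the word itself.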

There are probably more examples. They don't leave much room for
"obvious corrections" that are justified beyond doubt. That's why the
rule exists that mechanical edits are accepted unless somebody complains:

If nobody complains then the edit was a posteriori a correction of the 
obvious. We have no a priori criterion for "obvious correction".

> The "rules" for mechanical edit are frankly ridiculous. Have you read them.

Our most valuable resource is not data but the people who curate their
share of data. Changing data in a way that might be considered harmful or
is unintentionally outright wrong may scare away those who keep the data
current.

Feedback that is sometimes rude has been identified as a probable cause
of OpenStreetMap having few women among its contributors.

So correcting those obvious errors requires communication with the
mappers (male, female, or otherwise) who have made these errors, in a way
that first of all encourages them to carry on mapping (hopefully with
fewer mistakes).

On the other hand, a mechanical change of data can be performed as easily
during postprocessing as in the database. This is known in programming
as "don't store information when it is easier to recompute it".

You may earn real fame if you have a good filtering ruleset that irons
out all suspect data. If you publish this as a postprocessing script, it
is useful. If you apply it to iron out the database, hitting justified
cases 99% of the time and deliberately crafted data the other 1%, then
you will earn shame instead, because that same script could be perceived
as vandalism.

It's potentially feasible to postprocess data. It's hard to collect
data. So please don't make collecting data harder. Please make
postprocessing data easier instead.
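
As a minimal sketch of what such a postprocessing step might look like,
assuming tags arrive as a plain dictionary: the ruleset RULES, the
function postprocess and the misspelled value "residental" are
hypothetical choices for illustration, not an existing tool.

```python
from typing import Dict

# Hypothetical ruleset: per key, map a suspect value to the value that was
# presumably intended. A data consumer can adjust or drop rules freely.
RULES: Dict[str, Dict[str, str]] = {
    "highway": {"residental": "residential"},
}


def postprocess(
    tags: Dict[str, str], rules: Dict[str, Dict[str, str]] = RULES
) -> Dict[str, str]:
    """Return a cleaned copy of the tags; the input is left untouched."""
    return {
        key: rules.get(key, {}).get(value, value)
        for key, value in tags.items()
    }


# The data as mapped stays as it is.
original = {"highway": "residental", "name": "Verbindungsweg"}

# Only the consumer's working copy is changed.
cleaned = postprocess(original)
```

If a rule turns out to be wrong, the consumer drops it and recomputes.
Nothing in the database has to be reverted, and no mapper sees their work
changed.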

Best regards,

Roland



