[Imports] Proposal for proper OSM import solution (OpenMetaMap)

Wed Aug 24 16:35:41 UTC 2011

On 08/18/2011 03:23 AM, Jaak Laineste wrote:
> But it is not plain "banning imports", it is to provide alternative
> tool (I call it OpenMetaMap) which enables to link various external
> data sources with OSM data, with much better flexibility than with
> current import approach.

I encourage you to take a long close look at
http://commonmap.org/

OSM was founded on a ground-up approach, at a time when open datasets 
were more rare than they are today.

Common Map appears to be founded on a "gather the world's best data 
sources then tweak them" approach.  Comparing against Common Map's goals 
can help you stake out where your particular proposal fits in, or does 
not.  Common Map also anticipates returning corrections to the 
original source.

> On 08/18/2011 05:22 AM, Serge Wroclawski wrote:
>
> Underlying this approach is an assumption that we can rely on other
> datasets accuracy. Sadly this is not the case. As I work with more
> datasets and compare them to on the ground surveying, I find that many
> government datasets are either wrong, or out of date.
>
> Take TIGER as an example. I'm going through TIGER 2010 as we speak...
TIGER is a particularly bad train wreck, and a terrible example of a 
"typical" bad dataset.

However, the overall point is valid, and even high quality datasets are 
vulnerable to budget cuts or lack of attention.  The same is true of 
data inside OSM: some enthusiastic mapper enters features then loses 
interest.  It happens.  (As an example consider website: tags.  I wrote 
a checker for http://keepright.ipax.at/ and find that a good 20% of 
website tags don't match the node (anymore).)  The current effort to map 
POIs is particularly vulnerable to on-the-ground changes invalidating 
mapped data, as there are no cross checks.

A bit back I wrote a POI conflation tool for those cases where the 
external data source is indeed authoritative.  Because of licensing 
issues there are fewer importable examples than I thought at first, but 
there are some.

As an example consider the dataset I originally wrote the tool for: 
http://www.citycarshare.org/
That's a non-profit that runs car sharing locations.  This data is fully 
cross-checked to reality: if the reservation system advertises a car 
sharing location that does not exist, the person who rents the car will 
walk to an empty parking space.  That data gets fixed in a hurry.

Radar and weather stations are another example.  If a new station starts 
sending data (but is not mapped) you know right away of a mapping 
inconsistency.
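
To make the conflation idea concrete, here is a minimal sketch in Python.  
It is not the tool mentioned above, just an illustration under stated 
assumptions: both sides are reduced to an id plus a lat/lon, matching is 
a greedy nearest-neighbour search, and the 50 m threshold and all names 
(Poi, conflate, max_dist_m) are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Poi:
    """A point of interest reduced to an id and a position (assumed schema)."""
    ident: str
    lat: float
    lon: float


def haversine_m(a: Poi, b: Poi) -> float:
    """Great-circle distance between two points, in metres."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(h))


def conflate(external, osm, max_dist_m=50.0):
    """Greedily pair each external record with the nearest unused OSM node.

    Returns three lists:
      matched  - (external, osm) pairs closer than max_dist_m
      to_add   - external records with no OSM counterpart (unmapped)
      stale    - OSM nodes with no external counterpart (possibly gone)
    The threshold is an assumption; a real tool would also compare tags.
    """
    remaining = list(osm)
    matched, to_add = [], []
    for ext in external:
        best = min(remaining, key=lambda o: haversine_m(ext, o), default=None)
        if best is not None and haversine_m(ext, best) <= max_dist_m:
            matched.append((ext, best))
            remaining.remove(best)
        else:
            to_add.append(ext)
    return matched, to_add, remaining


# Made-up coordinates for illustration only.
external = [Poi("pod-1", 37.7750, -122.4190), Poi("pod-2", 37.8000, -122.4000)]
osm = [Poi("n1", 37.7751, -122.4191), Poi("n2", 37.7000, -122.5000)]
matched, to_add, stale = conflate(external, osm)
```

The "to_add" list corresponds to the new-station case (the source knows 
about it, the map does not), and "stale" to features that may have 
disappeared on the ground.  Neither list should be applied blindly; they 
are candidates for review.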

----

For any data set ask the question: if a condition on the ground shifts, 
who notices, when do they notice, and how does the data get updated?  
For some data sets (like the car sharing data) the correction would be 
applied within hours.  For others (like a government database of parks) 
the answer may well be "nobody notices, nobody corrects it".

Community edits may result in the best data for some features,
imports for others, and live (two-way) conflation for still other 
features.