[Imports] Proposal for proper OSM import solution (OpenMetaMap)
bryce2 at obviously.com
Wed Aug 24 16:35:41 UTC 2011
On 08/18/2011 03:23 AM, Jaak Laineste wrote:
> But it is not plain "banning imports", it is to provide alternative
> tool (I call it OpenMetaMap) which enables to link various external
> data sources with OSM data, with much better flexibility than with
> current import approach.
I encourage you to take a long, close look at Common Map.
OSM was founded on a ground-up approach, at a time when open datasets
were more rare than they are today.
Common Map appears to be founded on a "gather the world's best data
sources, then tweak them" approach. By comparing your proposal with
Common Map's goals you can help stake out where it fits in, and where
it does not. Common Map also anticipates returning corrections to the
upstream data sources.
On 08/18/2011 05:22 AM, Serge Wroclawski wrote:
> Underlying this approach is an assumption that we can rely on other
> datasets accuracy. Sadly this is not the case. As I work with more
> datasets and compare them to on the ground surveying, I find that many
> government datasets are either wrong, or out of date.
> Take TIGER as an example. I'm going through TIGER 2010 as we speak...
TIGER is a particularly bad train wreck, and a terrible example of a
"typical" bad dataset.
However, the overall point is valid, and even high quality datasets are
vulnerable to budget cuts or lack of attention. The same is true of
data inside OSM: an enthusiastic mapper enters features, then loses
interest. It happens. (As an example, consider website tags: I wrote
a checker for http://keepright.ipax.at/ and found that a good 20% of
website tags no longer match their node.) The current effort to map
POIs is particularly vulnerable to on-the-ground changes invalidating
mapped data, as there are no cross-checks.
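(To make that kind of check concrete, here is a minimal sketch of a
website-tag checker. This is not the actual keepright code; the node
structure and the injected fetch function are assumptions, chosen so the
matching logic can be exercised offline.)

```python
def check_website_tags(nodes, fetch_status):
    """Flag nodes whose website tag no longer answers with HTTP 200.

    nodes: list of dicts like {"id": 1, "tags": {"website": "..."}}
    fetch_status: callable url -> HTTP status code; injected so the
    check can be tested without network access.
    """
    broken = []
    for node in nodes:
        url = node.get("tags", {}).get("website")
        if not url:
            continue  # node has no website tag; nothing to verify
        try:
            status = fetch_status(url)
        except OSError:
            status = None  # DNS failure, timeout, refused connection
        if status != 200:
            broken.append((node["id"], url, status))
    return broken
```

In real use you would pass a small wrapper around urllib as
fetch_status; a stricter checker would also compare page content against
the node's name tag rather than just the status code.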
A while back I wrote a POI conflation tool for those cases where the
external data source is indeed authoritative. Because of licensing
issues there are fewer importable examples than I first thought, but
there are some.
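(For illustration, the core of such a conflation pass can be reduced to
a nearest-neighbor match within a distance threshold. This is a sketch,
not the actual tool; the data layout and the 50 m default are
assumptions.)

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in meters."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def conflate(external, osm_nodes, max_dist_m=50):
    """Match each external POI to the nearest OSM node within max_dist_m.

    Returns (matches, unmatched) where matches is a list of
    (external_poi, osm_node, distance_m) and unmatched holds external
    POIs with no OSM node nearby -- candidates for creation or review.
    """
    matches, unmatched = [], []
    for poi in external:
        best, best_d = None, max_dist_m
        for node in osm_nodes:
            d = haversine_m(poi["lat"], poi["lon"], node["lat"], node["lon"])
            if d <= best_d:
                best, best_d = node, d
        if best is not None:
            matches.append((poi, best, round(best_d, 1)))
        else:
            unmatched.append(poi)
    return matches, unmatched
```

A production tool would also compare tags (name, operator) to avoid
matching the wrong nearby node, and use a spatial index instead of the
O(n*m) scan shown here.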
As an example, consider the dataset I originally wrote the tool for:
a non-profit that runs car sharing locations. This data is fully
cross-checked against reality: if the reservation system advertises a
car sharing location that does not exist, the person who rents the car
will walk to an empty parking space. That data gets fixed in a hurry.
Radar and weather stations are another example. If a new station starts
sending data (but is not mapped), you know right away of a gap in the
map.
For any data set, ask the question: if a condition on the ground shifts,
who notices, when do they notice, and how does the data get updated?
For some data sets (like the car sharing data) the correction would be
applied within hours. For others (like a government database of parks)
the answer may well be "nobody notices, nobody corrects it".
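(That question can even be written down as a rough triage rule. The
thresholds and category names below are purely illustrative -- they are
not from any OSM import policy.)

```python
def import_strategy(correction_latency_days, has_feedback_loop):
    """Rough triage of an external dataset by how fast errors get fixed.

    correction_latency_days: typical time until an on-the-ground change
    is reflected upstream. has_feedback_loop: True if real-world users
    of the data notice and report errors (like car sharing renters).
    """
    if has_feedback_loop and correction_latency_days <= 7:
        return "live-link"           # link/conflate; trust upstream
    if correction_latency_days <= 365:
        return "import-and-review"   # one-time import, then survey
    return "avoid"                   # nobody notices, nobody corrects
```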
Community edits may result in the best data for some features,
imports for others, and live (two-way) conflation for yet others.