[OSM-dev] [Imports] Inserting OSM data

Frederik Ramm frederik at remote.org
Sat Mar 27 19:10:35 GMT 2010


Hi,

Jaak Laineste wrote:
> There could be quite good reasons to protect some of the data at least
> temporarily. 

Let's look at them then.

> Very technical reason: to avoid accidental deletion of nodes
> during bulk import (which takes days sometimes). 

Happened to me - but to be honest, only because I was running the import 
in a stupid way. Today I would do a changeset upload which creates nodes 
and the ways using them - resulting in a transactional update on the 
server which means that nobody will see my nodes before my ways are in.
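
For illustration, a minimal sketch of what such an upload body could look 
like, assuming the osmChange format of API 0.6, where newly created 
elements carry negative placeholder IDs that the server swaps for real 
ones. The function name build_osmchange and the generator string are made 
up for this example, and the sketch only builds the XML; sending it to 
the changeset's upload call is left out.

```python
import xml.etree.ElementTree as ET


def build_osmchange(changeset_id, coords, way_tags):
    """Build one osmChange document creating nodes and a way that uses them.

    Hypothetical helper: coords is a list of (lat, lon) pairs, way_tags a
    dict of tags for the way. New elements get negative placeholder IDs.
    """
    root = ET.Element("osmChange", version="0.6", generator="import-sketch")
    create = ET.SubElement(root, "create")

    node_ids = []
    for i, (lat, lon) in enumerate(coords, start=1):
        placeholder = -i  # negative ID marks an element that does not exist yet
        ET.SubElement(
            create, "node",
            id=str(placeholder), changeset=str(changeset_id),
            lat=f"{lat:.7f}", lon=f"{lon:.7f}",
        )
        node_ids.append(placeholder)

    # The way refers to the placeholder IDs of the nodes created above.
    way = ET.SubElement(
        create, "way",
        id=str(-(len(coords) + 1)), changeset=str(changeset_id),
    )
    for node_id in node_ids:
        ET.SubElement(way, "nd", ref=str(node_id))
    for key, value in way_tags.items():
        ET.SubElement(way, "tag", k=key, v=value)

    return ET.tostring(root, encoding="unicode")
```

Because nodes and way travel in the same document, there is no window in 
which the nodes exist on the server without the way that uses them.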

> Well, maybe bulk import in
> general is not really fully compatible with the spirit of OSM after all. What is
> the more important purpose of OSM: is it the biggest outdoor mapping capturing
> tool, or does it want to be the world's largest and best community-created map
> database? 

I wanted to discuss "private/locked data in OSM", not whether or not 
bulk imports are any good. I have an outspoken opinion on this but will 
save it for another thread.

> My implicit
> assumption was that OSM wants to be as good database as possible, but I
> could also have totally missed the point of OSM. 

Your message is an example of good rhetoric, but not one of rigorous 
logic. Imports may help OSM to become a better database, or they may be 
the ruin of the community - whatever one's opinion, it has nothing to do 
with locking data.

> There are good datasources (from public sector) who have 80% of their data
> open and in principle well compatible with OSM, but 20% of them should have
> some protection. Technically splitting the data could be so complicated that
> their only option now is not to share anything, i.e. just not to use OSM. 

The example you are about to give is an example of someone wanting 
protection for 100% of his data.

> I have a particular example: a friend just called me, and he is on the board of
> the national association of museums. They have and maintain kind of official
> database of all museums in the country. They wanted to have them on web map,
> and I suggested to use OpenStreetMap, and not only as background image, but
> also insert their data as points to the OSM. This brought me several
> questions: 
> - is the only legitimate way to have one-time bulk import, and then just
> hope that community will only improve it? Or could they have a bit more
> special control (external IDs, notifications, soft locking of some tags etc)
> over the data, at least to make their data maintenance easier. To enable
> more automatic sync with their in-house data maintenance systems and
> procedures.

My view on this is very clear: I do not want data in OSM that I cannot 
edit. Un-editable data in OSM (basically a static copy of a data set 
maintained by someone else) increases the bulk of our data but doesn't 
improve the quality of OSM. It is dead easy to take such a data set and 
mix it in at rendering time, so if they want their official museums on a 
map, they need only configure Mapnik to load museum locations from their 
shape file and that's it.

In fact, if *anyone* wants the "official" rather than the community 
maintained set of museums on the map, and assuming that the official 
list is available as a shape or GPX, it is absolutely no problem for 
anyone to mix them in from such a source at render time, and we should 
indeed strive to make this even easier so that people don't get the idea 
that everything they want to display on a map has to be imported to OSM 
first!
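
As a rough sketch of the render-time approach described above, the 
following adds a shapefile layer to an existing Mapnik XML map definition. 
The helper name add_overlay_layer, the layer name and the file name 
museums.shp are invented for illustration, and the sketch assumes the 
shapefile is in plain WGS84 lon/lat; a real style would want a proper 
symbol instead of the default point marker.

```python
import xml.etree.ElementTree as ET


def add_overlay_layer(map_xml, shapefile, layer_name="official-museums"):
    """Append a point layer backed by a shapefile to a Mapnik map definition.

    Hypothetical helper: map_xml is the text of an existing <Map> document.
    """
    root = ET.fromstring(map_xml)

    # A bare style: one rule that draws each point with the default marker.
    style = ET.SubElement(root, "Style", name=layer_name)
    rule = ET.SubElement(style, "Rule")
    ET.SubElement(rule, "PointSymbolizer")

    # Assumption: the external data set is stored as unprojected WGS84.
    layer = ET.SubElement(
        root, "Layer",
        name=layer_name, srs="+proj=longlat +datum=WGS84 +no_defs",
    )
    ET.SubElement(layer, "StyleName").text = layer_name
    datasource = ET.SubElement(layer, "Datasource")
    for key, value in (("type", "shape"), ("file", shapefile)):
        ET.SubElement(datasource, "Parameter", name=key).text = value

    return ET.tostring(root, encoding="unicode")
```

The layer is appended last so that it is drawn on top of the OSM layers 
already in the map definition; nothing is written to the OSM database.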

If we allow such pseudo-imports to go ahead (i.e. where we import a copy 
of the data but we do not become the master), then we will end up with 
semi-maintained "shadows" of every geodatabase in existence - we'll be 
the geodata trash heap for the world.

For that specific case you mention I'd say either they keep their data 
out of OSM, or they set up something that helps them monitor changes to 
"their" data. If you remember, you initially raised four points, of 
which I discounted I-III as being un-OSM because they would protect the 
data from edits; IV was the idea of being able to better monitor things, 
which I said was a desire shared by many.

My message to anyone contemplating imports into OSM is clear - either 
accept that others change the data (and you may monitor and you may 
change back if you have good reason), or don't import.
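
To make the "monitor" part concrete, here is a small sketch of how an 
importer might compare their in-house records with what is currently in 
OSM. Everything in it is an assumption made for illustration: the tag 
key ref:museum as an external ID, the function name, and the shape of 
the input (OSM elements already fetched and turned into dicts with a 
"tags" entry). It only reports differences; deciding whether the OSM 
edit or the in-house record is right remains a human job.

```python
def diff_against_official(official, osm_elements, id_key="ref:museum"):
    """Report where OSM elements differ from in-house records.

    official: dict mapping external ID -> dict of expected tags.
    osm_elements: list of dicts, each with a "tags" dict.
    id_key: hypothetical tag carrying the external ID.
    """
    seen = set()
    report = {"changed": {}, "missing_in_osm": [], "unknown_in_osm": []}

    for element in osm_elements:
        tags = element["tags"]
        ext_id = tags.get(id_key)
        if ext_id is None:
            continue  # not one of the monitored objects
        if ext_id not in official:
            report["unknown_in_osm"].append(ext_id)
            continue
        seen.add(ext_id)
        # Map each differing key to (in-house value, value found in OSM).
        differences = {
            key: (expected, tags.get(key))
            for key, expected in official[ext_id].items()
            if tags.get(key) != expected
        }
        if differences:
            report["changed"][ext_id] = differences

    # Records we hold that no OSM element refers to (deleted or never added).
    report["missing_in_osm"] = sorted(set(official) - seen)
    return report
```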

> - Today the only way for them is anyway double maintenance: they maintain
> their internal/primary database, and maybe they care to copy their
> day-to-day updates manually also to OSM.

Or of course just publish their data and let the OSM community do the rest.

> Is there a way to make maintenance
> of only their specific data in OSM easy?

I think we have different pictures in mind here. OSM is not a platform 
which you can use to host "your" data. The data ceases to be "yours" 
once it is in OSM. If that fundamental idea doesn't get across, then 
further discourse is fruitless.

If they are willing to accept the paradigm shift and say: We used to be 
the keeper of the official list of museums, but now we see ourselves as 
a guardian of community museum data in OSM - checking it, refining it, 
and making sure it doesn't get damaged - then there might be common 
ground.

You are of course right in saying that it is currently beyond a mere 
"point and click" user to perform such a guardian role in OSM.

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frederik at remote.org  ##  N49°00'09" E008°23'33"



