[Imports] "readonly" tag for imported data (ask "simple" editors to not modify)?

Mon Apr 25 13:10:41 UTC 2011

On 25.04.2011, at 12:40, Frederik Ramm wrote:

> Hi,
> 
> Jaak Laineste wrote:
>> What about following approach:
>> a) ban external source imports to the main database
>> b) create new separate solution (layer, server, API, whatever fits) for all the imports.
> 
> I think that is the way forward. If your data has special requirements, then put it in a special database, not in OSM. Do not abuse OSM as a vehicle for non-crowdsourced (or even non-crowdsourcable) data.

So I see two approaches here:
a) centralized database - collect imports in one place, close enough to OSM. I was thinking about this, and there are good practical reasons for it, similar to the reasons why OSM itself is centralized. That does not mean it is the only possible solution in 2011.

b) distributed database - each database (data collection) has its own home, and there is one common directory to find them. Google has this kind of system for transit data: they define a format that everyone has to use and create a common directory, and everyone (at least Google itself) can collect data directly from the different agencies. Technically OSM itself could be distributed, e.g. each country could keep its own coverage on its own servers.

> 
>> - planet.osm it would have the data in special section. 
> 
> No, planet.osm would not even know about that data. I imagine this to be something like in the brave new world of git: You can run a rendering server and you can instruct your osm2pgsql to pull data from the OSM planet repository, and from Alan's "US Borders" repository, and from someone else's "International Maritime Lights" database, etc.etc.; an expert editor could even allow you to do the same, pull data from different sources, even edit all of it if you have the proper credentials and upload back to all the different sources.
> 
> There is of course still a case for many things to reside in OSM - e.g. it would not make much sense to have one database with major roads and one completely separate database with minor roads, as it would always be a terrible pain to extract from them a working, routable network. But especially where we're talking non-physical things like boundaries, such a de-coupling makes a lot of sense to me.

Radical separation could be used in some cases, but not in many others. A specific example: I have a database with all building shapes of a city of 0.5 million people. The number of nodes shared with roads is probably small, but OSM probably already has some 1% of similar data (buildings). I also want to get contributions from the OSM community back into my database (otherwise it would be only giving). I'd come up with the following:
1. I create an OSM API for my database.
2. I register my URL in the OSM meta-database (something remotely similar to the imports table in the wiki).
3. I integrate user authentication with OSM OAuth.
4. If the editor in use is JOSM, it first makes a "give data providers for bbox" request to the OSM API, and then makes HTTP requests to all of them.
5. In JOSM you see and can edit all the different sources (possibly in different colors, in different JOSM layers). A special action there would be "merge nodes on different layers", which would change the node id in the external database to the OSM node id and thereby create a topological link between the objects. Another special action would be creating a Relation with objects from different sources.
6. When you save data, it can go to many different databases.
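
The discovery part (steps 2 and 4) could be sketched roughly as below. This is only an illustration under assumptions: the directory contents, the "citybuildings" provider, its example.org URL and coverage box, and the function names are all made up. The only thing taken from the existing OSM API 0.6 is the shape of the map call (map?bbox=left,bottom,right,top), which every provider is assumed to mirror.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    left: float
    bottom: float
    right: float
    top: float

    def intersects(self, other: "BBox") -> bool:
        # Two boxes overlap unless one lies entirely to one side of the other.
        return not (
            self.right < other.left
            or other.right < self.left
            or self.top < other.bottom
            or other.top < self.bottom
        )


@dataclass(frozen=True)
class Provider:
    name: str
    api_base: str   # base URL of an OSM-API-compatible server
    coverage: BBox  # area for which the provider has data


# Hypothetical directory contents; only the OSM entry is a real server.
DIRECTORY = [
    Provider("osm", "https://api.openstreetmap.org/api/0.6",
             BBox(-180.0, -90.0, 180.0, 90.0)),
    Provider("citybuildings", "https://buildings.example.org/api/0.6",
             BBox(24.5, 59.3, 25.0, 59.6)),
]


def providers_for_bbox(bbox, directory=DIRECTORY):
    """The "give data providers for bbox" request, done locally."""
    return [p for p in directory if p.coverage.intersects(bbox)]


def map_request_urls(bbox, directory=DIRECTORY):
    """One map-call URL per provider that covers the bbox."""
    query = f"bbox={bbox.left},{bbox.bottom},{bbox.right},{bbox.top}"
    return {
        p.name: f"{p.api_base}/map?{query}"
        for p in providers_for_bbox(bbox, directory)
    }
```

An editor would then fetch each returned URL into its own layer; for a bbox outside the city only the OSM entry comes back.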

A data provider can also just create OSM files and give read-only HTTP access to them. But this would be very limited and would lead to manual copying of data into the OSM database, as otherwise the topological laundry could not be done.

What would be needed:
a) "OpenGeoMetaDatabase" (OGMD) - a directory of OSM API-compatible data sources (a special discovery API). Initially with one source: OSM.
b) support for OGMD in JOSM, osm2pgsql and other tools - should be quite easy.
c) some reference cases where a data provider would be ready to actually provide a live OSM API. Simplest case: set up rails_port with the full OSM data structure, then use the same import procedure as you would with OSM. In real life you'd want a live API on top of your legacy geodatabase, which means projects lasting many months or years. Luckily there are some common GIS engines (ESRI, Oracle Spatial etc.) which providers probably have.
d) some additions are probably still needed in the OSM API too. Example: when you save a relation that contains non-OSM objects to the OSM database, you must be able to refer to external object IDs. Maybe the object ID (osmid) should have a more generalized format (e.g. "corinefr-12343212").
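
To make the generalized ID idea in d) a bit more concrete, here is a minimal sketch. It assumes the format is "<source>-<number>", with a bare number meaning a plain OSM id; that rule, and the names ObjectRef, parse_ref and format_ref, are my assumptions for illustration and not part of any existing OSM API.

```python
from typing import NamedTuple

DEFAULT_SOURCE = "osm"  # assumption: a bare number refers to OSM itself


class ObjectRef(NamedTuple):
    source: str    # directory name of the data source, e.g. "corinefr"
    local_id: int  # object id inside that source


def parse_ref(text: str) -> ObjectRef:
    """Parse "corinefr-12343212" or a bare "12343212" into an ObjectRef."""
    # Split on the last "-" so that source names may contain dashes.
    source, sep, number = text.rpartition("-")
    if not sep:
        source = DEFAULT_SOURCE
    if not source or not number.isdecimal():
        raise ValueError(f"not a valid object reference: {text!r}")
    return ObjectRef(source, int(number))


def format_ref(ref: ObjectRef) -> str:
    """Inverse of parse_ref; OSM ids stay bare numbers."""
    if ref.source == DEFAULT_SOURCE:
        return str(ref.local_id)
    return f"{ref.source}-{ref.local_id}"
```

A relation member could then carry such a reference in place of a bare numeric id, and existing OSM-only data would keep its current form unchanged.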

Then the organizational question: could it be part of OSM (as Meta-OSM), or should it be something external and independent? I'd prefer the first option, but some (like Frederik) would like to keep it well apart from core OSM.

BR,
Jaak