[OSM-talk] osm2pgsql diff application with filtered OSM data

Nick Whitelegg nick.whitelegg at solent.ac.uk
Sun Nov 11 15:53:41 UTC 2018


... append mode!




________________________________
From: Nick Whitelegg
Sent: 11 November 2018 15:53:18
To: talk at openstreetmap.org
Subject: Re: [OSM-talk] osm2pgsql diff application with filtered OSM data



Thanks for all the replies.


After thinking about this, I realised that I don't really want to update _all_ the data that often. The only thing I need to update on a weekly basis is the footpaths (I'm not so bothered if, say, the roads or the pubs are a year out of date, as long as newly mapped footpaths appear quickly). So what I'm now doing is running an osmosis extract of paths weekly, deleting all data in the DB which I class as a 'path', and repopulating in amend mode.
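
A rough sketch of that weekly refresh, using append mode as per the correction at the top of this message. The table name, column and tag values below are assumptions based on the default osm2pgsql schema, and the file and database names are hypothetical. The real definition of a 'path' is not given in this thread. Whether osm2pgsql will accept re-appending objects it has already seen may depend on the version and on how the database was imported, so treat this as an illustration of the sequence only.

```python
from typing import List, Sequence

# Assumption: which highway values count as a 'path' is not stated in the thread.
PATH_VALUES = ("footway", "path", "bridleway")


def delete_paths_sql(table: str = "planet_osm_line",
                     column: str = "highway",
                     values: Sequence[str] = PATH_VALUES) -> str:
    """Build the DELETE statement that clears existing path rows."""
    quoted = ", ".join("'" + v.replace("'", "''") + "'" for v in values)
    return f"DELETE FROM {table} WHERE {column} IN ({quoted});"


def append_cmd(extract: str, database: str) -> List[str]:
    """Build the osm2pgsql call that loads the fresh path extract.

    --append needs a database that was imported with --slim.
    """
    return ["osm2pgsql", "--append", "--slim", "-d", database, extract]


if __name__ == "__main__":
    print(delete_paths_sql())
    # "paths.pbf" and "gis" are hypothetical names.
    print(" ".join(append_cmd("paths.pbf", "gis")))
```

The functions only build the SQL and the command line; running them against a real database is left to whatever scheduler does the weekly job.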


Thanks,

Nick


________________________________
From: Paul Norman <penorman at mac.com>
Sent: 08 November 2018 20:10:14
To: talk at openstreetmap.org
Subject: Re: [OSM-talk] osm2pgsql diff application with filtered OSM data

On 2018-11-08 6:34 AM, Nick Whitelegg wrote:

> At the moment I download full planet extracts about every 6 months. However, due to the limitations of my server, I filter out (with osmosis) a lot of stuff I don't need so that I am basically left with roads, footpaths, natural features, water features and selected POIs.
>
> I'd like to move towards a system which applies diffs from geofabrik instead, and applies them regularly (daily or weekly) with osm2pgsql.
>
> My question is this; given that not everything in the diff will be in my database (as I filter out what I don't need during the import process), will osm2pgsql apply the diff successfully or will it complain that not all features in the diff are in my database?

I can think of four ways to do this, all of which have a different balance of correctness, performance, and ease of use.

There are two "right" ways to do this. The first one is to re-import every week. Because imports that do not keep the slim tables (either --slim --drop or no --slim at all) are faster, this is a good option, and it needs less space than a database able to consume diffs.
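
As a small illustration of the two import variants mentioned here, the sketch below builds the osm2pgsql command line for a full re-import. The database name "gis" and the file name are hypothetical, and any other options the import needs (style file, cache size and so on) are left out.

```python
from typing import List


def reimport_cmd(pbf: str, database: str, slim_then_drop: bool) -> List[str]:
    """Build a full (non-updatable) osm2pgsql import command.

    slim_then_drop=True  -> --slim --drop: slim tables are used during the
                            import and removed afterwards.
    slim_then_drop=False -> no --slim at all.
    """
    cmd = ["osm2pgsql", "--create"]
    if slim_then_drop:
        cmd += ["--slim", "--drop"]
    cmd += ["-d", database, pbf]
    return cmd


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(" ".join(reimport_cmd("planet-filtered.pbf", "gis", True)))
    print(" ".join(reimport_cmd("planet-filtered.pbf", "gis", False)))
```

Either way the resulting database cannot consume diffs afterwards, which is the trade-off for the faster import and smaller footprint.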

The second right way involves keeping two files, one with the current full data, and one with the filtered data. Call these "planet.pbf" and "planet-filtered.pbf". Then when updating create "planet-new.pbf", filter it to get "planet-filtered-new.pbf", create a diff for the differences between "planet-filtered-new.pbf" and "planet-filtered.pbf", and apply that diff to the database. Then replace the old files with the new ones. This will keep the database correct.
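
The two-file approach can be sketched roughly as below. This only builds the command lines for one update cycle and does not run anything. The file names are the ones used above. The diff file name, the database name and the example filter arguments are assumptions for illustration. The osmosis --derive-change input order (newer file first) follows the osmosis examples as I understand them and is worth checking against the osmosis documentation before relying on it.

```python
import shlex
from typing import List, Sequence, Tuple


def filter_cmd(src: str, dst: str, filter_args: Sequence[str]) -> List[str]:
    """osmosis call that filters src into dst.

    filter_args is a placeholder for whatever osmosis filter tasks the
    existing import already uses.
    """
    return ["osmosis", "--read-pbf", f"file={src}", *filter_args,
            "--write-pbf", f"file={dst}"]


def derive_change_cmd(new_filtered: str, old_filtered: str, diff: str) -> List[str]:
    """osmosis call that writes the changes from old_filtered to new_filtered."""
    return ["osmosis",
            "--read-pbf", f"file={new_filtered}",
            "--read-pbf", f"file={old_filtered}",
            "--derive-change",
            "--write-xml-change", f"file={diff}"]


def apply_diff_cmd(diff: str, database: str) -> List[str]:
    """osm2pgsql call that applies the diff; the DB must have been imported with --slim."""
    return ["osm2pgsql", "--append", "--slim", "-d", database, diff]


def update_cycle(database: str, filter_args: Sequence[str] = (),
                 diff: str = "planet-diff.osc") -> List[List[str]]:
    """Commands for one cycle, in the order they would be run."""
    return [
        filter_cmd("planet-new.pbf", "planet-filtered-new.pbf", filter_args),
        derive_change_cmd("planet-filtered-new.pbf", "planet-filtered.pbf", diff),
        apply_diff_cmd(diff, database),
    ]


# After a successful run, the new files replace the old ones (source, destination).
ROTATIONS: List[Tuple[str, str]] = [
    ("planet-new.pbf", "planet.pbf"),
    ("planet-filtered-new.pbf", "planet-filtered.pbf"),
]

if __name__ == "__main__":
    # Hypothetical filter: keep ways with a highway tag and the nodes they use.
    example_filter = ["--tf", "accept-ways", "highway=*", "--used-node"]
    for cmd in update_cycle("gis", example_filter):
        print(shlex.join(cmd))
    for src, dst in ROTATIONS:
        print(f"mv {src} {dst}")
```

Creating "planet-new.pbf" itself (by applying upstream diffs to "planet.pbf", or by downloading a fresh extract) is deliberately left out, since it depends on where the data comes from.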

A "wrong" way to do it is to import the filtered data, apply updates directly, and periodically delete data from the DB. The problem with this is that if someone adds one of the selected POI tags to a building that you have filtered out, osm2pgsql won't have the node data to create a geometry. This might be acceptable, depending on use case.

A less wrong way would be to modify your filtering so no nodes are filtered out. There are still potential errors with relations, but these are less common. If you're doing the planet or a large extract and using flat nodes there's no storage penalty for having all the nodes.