[OSM-talk] osm2pgsql & planet: frustrations, cutoffs, and idempotence

Jochen Topf jochen at remote.org
Mon Oct 27 11:19:19 GMT 2008


On Mon, Oct 27, 2008 at 08:22:32AM +0000, Tom Hughes wrote:
> Shaun McDonald wrote:
> > On 27 Oct 2008, at 00:50, Michal Migurski wrote:
> > 
> >>> Planet dumps are not snapshots - they do not represent a consistent
> >>> view at any particular point in time because they take a number of
> >>> hours to generate, during which time new changes are constantly
> >>> being made to the contents of the database.
> >>
> >> Shouldn't it be possible to ignore any changes that happen after the
> >> cutoff, though?
> > 
> > At the moment we don't look at the time stamps when dumping the planet  
> > file.
> 
> It's not as simple as that - you also have to switch to reading the 
> history tables rather than the current tables or you won't be able to 
> see what the state of the object used to be if it has changed since the 
> snapshot time.
> 
> Which means you're reading much more data, and either having to track 
> the state of each object (in order to find the most recent valid change) 
> or you have to index scan so that you're seeing things in timestamp order.

If the planet dump plus the diff from the same day is what everybody
wants anyway, why not do this on the server side: hold the planet back
until the first diff is available, apply that diff to the planet, and
then publish the result as the planet?
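
To make the idea concrete, here is a minimal sketch of what "planet plus
diff" could look like. Everything in it is an assumption for illustration
only: the in-memory dict standing in for the planet, the Change tuple
standing in for a diff entry, and the function name apply_diff are all
hypothetical, not how the real dump or any existing tool works. It assumes
each object carries a UTC timestamp and that the diff is in time order.

```python
from typing import Dict, Iterable, NamedTuple, Optional, Tuple

# Hypothetical key: (object type, object id), e.g. ("node", 42).
Key = Tuple[str, int]
# Hypothetical planet entry: (timestamp, object data).
Entry = Tuple[str, dict]


class Change(NamedTuple):
    """One diff entry (hypothetical shape, not a real file format)."""
    action: str                  # "create", "modify" or "delete"
    key: Key
    timestamp: str               # ISO 8601 UTC, so strings sort in time order
    data: Optional[dict] = None  # object data; unused for "delete"


def apply_diff(planet: Dict[Key, Entry], diff: Iterable[Change]) -> Dict[Key, Entry]:
    """Return a copy of the planet with the diff applied on top of it."""
    result = dict(planet)
    for change in diff:
        current = result.get(change.key)
        # The dump may already contain a state newer than this change,
        # because the dump ran while edits were still coming in.
        if current is not None and current[0] > change.timestamp:
            continue
        if change.action == "delete":
            result.pop(change.key, None)
        else:
            result[change.key] = (change.timestamp, change.data or {})
    return result


planet = {
    ("node", 1): ("2008-10-26T23:00:00Z", {"lat": 1}),
    ("node", 2): ("2008-10-27T02:00:00Z", {"lat": 9}),
    ("node", 3): ("2008-10-26T20:00:00Z", {"lat": 3}),
}
diff = [
    Change("modify", ("node", 1), "2008-10-27T01:00:00Z", {"lat": 2}),
    Change("modify", ("node", 2), "2008-10-27T01:30:00Z", {"lat": 5}),
    Change("delete", ("node", 3), "2008-10-27T03:00:00Z"),
    Change("create", ("node", 4), "2008-10-27T04:00:00Z", {"lat": 4}),
]
published = apply_diff(planet, diff)
```

Because of the timestamp check, running the same diff over the result a
second time should not change anything, which is the property you would
want if the dump and the diff overlap.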

Jochen
-- 
Jochen Topf  jochen at remote.org  http://www.remote.org/jochen/  +49-721-388298




