[OSM-dev] New, faster, planet dump tool

Jon Burgess jburgess777 at googlemail.com
Tue Sep 25 14:07:33 BST 2007


On 25/09/2007, Brett Henderson <brett at bretth.com> wrote:
> Jon Burgess wrote:
> > I wrote this version because it looked like the runtime of planet.rb
> > was becoming an issue again. The code should be faster than Osmosis
> > but I have not run a specific test with this current data set. It
> > should be the bzip2 compression time which will dominate the planet
> > dump time.
> >
> > planet.c and osmosis do have slightly different aims. The osmosis code
> > is a far more generic implementation but I suspect its speed will
> > never be able to match a custom C implementation. On the other hand,
> > generating diffs directly from the DB using Osmosis would be far
> > quicker than using the current planet dump + planetdiff tools.
> >
> > I'd be happy to use either tool provided they both achieve the same
> > result within time and memory constraints.
> >
> > Another possibility is to use the planet.c code to stream a DB dump
> > into the PostgreSQL mapnik database. Avoiding the bzip2 compression
> > should allow this to be done quite rapidly. We could then update the
> > Mapnik layer more frequently than the formal weekly planet dump.
> >
> Is there any way to feed changes into the mapnik db instead of doing a
> complete import?  I seem to remember a previous email where you
> mentioned you'd need to store some translation data between the osm and
> mapnik representations which is currently held in RAM ...  From memory
> the reason for RAM is speed, but if the changes are small (relative to
> the total data set) then it may be worthwhile having a db-based intermediate
> step.
>
> If possible, it might allow mapnik to be updated in near real-time.

Yes, it is possible in theory to do incremental updates, so long as
the PostgreSQL database has the details of all nodes, segments and
ways. As you mention, the '--slim' mode does store all this
information, and it should be possible to apply the diff on top of it
and work out which geometry entries need to be updated.
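
To make the "work out which geometry entries need to be updated" step
more concrete, here is a rough sketch in Python. It is not taken from
any existing tool; the function names and the dictionary layout are
made up for illustration, and a real version would query the DB tables
rather than hold everything in memory. It assumes a way is a list of
segment ids and a segment is a pair of node ids.

```python
from collections import defaultdict


def build_reverse_index(segments, ways):
    """Build lookups from node -> segments and segment -> ways.

    segments: {segment_id: (from_node_id, to_node_id)}   (hypothetical layout)
    ways:     {way_id: [segment_id, ...]}                (hypothetical layout)
    """
    node_to_segments = defaultdict(set)
    for seg_id, (from_node, to_node) in segments.items():
        node_to_segments[from_node].add(seg_id)
        node_to_segments[to_node].add(seg_id)

    segment_to_ways = defaultdict(set)
    for way_id, seg_ids in ways.items():
        for seg_id in seg_ids:
            segment_to_ways[seg_id].add(way_id)

    return node_to_segments, segment_to_ways


def ways_to_rebuild(changed_nodes, changed_segments, changed_ways,
                    node_to_segments, segment_to_ways):
    """Return the ids of all ways whose geometry must be regenerated."""
    # A moved node dirties every segment that starts or ends at it.
    dirty_segments = set(changed_segments)
    for node_id in changed_nodes:
        dirty_segments |= node_to_segments.get(node_id, set())

    # A dirty segment dirties every way that contains it.
    dirty_ways = set(changed_ways)
    for seg_id in dirty_segments:
        dirty_ways |= segment_to_ways.get(seg_id, set())

    return dirty_ways
```

Only the ways returned by ways_to_rebuild() would need their geometry
rows deleted and regenerated; everything else could be left alone.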

Historically the RAM code has been so much faster than the DB-backed
code that even applying a week's worth of updates would probably be
slower than reloading the whole thing. If the updates were very small
near-real-time changes then it could probably be done much faster.

There are quite a few complications in implementing this in practice,
so it would probably take me some time to get the possible glitches
resolved.

-- 
    Jon



