[OSM-dev] Fwd: Script to extract bbox from history planet

Frederik Ramm frederik at remote.org
Thu Dec 16 10:16:13 GMT 2010


Hi,

On 12/16/10 10:13, Stefan de Konink wrote:
> I think Mitja did this in the GSoC program this summer (the cutout
> thing). It would be valuable to figure out a way to do 'proper' full
> ways and relations in a low memory setting, but probably I/O itself is
> always an issue too.

I'd be interested to hear from Mitja about his solution (even better, 
find it in SVN).

> I can imagine that you create a cutout of the nodes first, quicksort it
> in a mmap-file and then run it over the ways table. Then store a mmap file
> with only the way ids, quicksort it, and walk over the relations part.
>
> Now a second run would grab the missing ways, and the remaining nodes.

If you allow multiple passes, there is no need for sparse data
structures: just use bit vectors to flag used nodes, ways, and
relations. That takes 200 MB of RAM at most, with no sparse-set,
sorting, or searching overhead.
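
A minimal sketch of the bit-vector idea, assuming the planet has already
been parsed into plain tuples (the parsing itself is left out). The names
BitVector and flag_used, and the tuple layouts, are hypothetical and only
serve the illustration. One bit per id means roughly 125 MB per billion
ids. Nodes pulled in only to complete a way are kept in a separate vector
so that they do not drag in further ways.

```python
class BitVector:
    """One bit per object id, backed by a bytearray."""

    def __init__(self, max_id):
        self.bits = bytearray(max_id // 8 + 1)

    def set(self, i):
        self.bits[i >> 3] |= 1 << (i & 7)

    def __contains__(self, i):
        return bool(self.bits[i >> 3] & (1 << (i & 7)))


def flag_used(nodes, ways, bbox, max_node_id, max_way_id):
    """Flag nodes and ways needed for a bbox cutout with full ways.

    nodes: iterable of (id, lon, lat)       (hypothetical layout)
    ways:  iterable of (id, [node ids])     (hypothetical layout)
    bbox:  (min_lon, min_lat, max_lon, max_lat)
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    in_bbox = BitVector(max_node_id)     # nodes that lie inside the bbox
    used_nodes = BitVector(max_node_id)  # nodes to write out
    used_ways = BitVector(max_way_id)    # ways to write out

    # Pass over the nodes: flag everything inside the bbox.
    for nid, lon, lat in nodes:
        if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat:
            in_bbox.set(nid)
            used_nodes.set(nid)

    # Pass over the ways: a way is used if it touches the bbox; all of
    # its nodes are then flagged so the way can be written out complete.
    for wid, refs in ways:
        if any(r in in_bbox for r in refs):
            used_ways.set(wid)
            for r in refs:
                used_nodes.set(r)

    return used_nodes, used_ways
```

A further pass over the input would then write out every object whose
bit is set. Relations could be handled the same way with a third vector.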

Of course the multiple passes are what kill performance, especially if
you have to bunzip2 and XML-parse the input each time.

In addition, it would be cool for a history excerpt to always extract 
*all* versions of an object even if only some of them actually match the 
criterion. That could be trivially added if the input is sorted - simply 
start a buffer at version 1 and dump it when the next object starts.
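
The buffering could look roughly like the sketch below. It assumes the
input arrives sorted so that all versions of one object are adjacent, and
that each version is a dict with "type" and "id" keys. The function name
extract_all_versions and the matches callback are hypothetical.

```python
def extract_all_versions(objects, matches):
    """Yield every version of each object that has a matching version.

    objects: iterable of dicts with "type" and "id" keys, sorted so that
             all versions of one object are adjacent (assumed layout)
    matches: callable returning True if a single version matches
    """
    buffer, key, keep = [], None, False
    for obj in objects:
        obj_key = (obj["type"], obj["id"])
        if obj_key != key:
            # A new object starts: dump the previous buffer if any of
            # its versions matched, then start over.
            if keep:
                yield from buffer
            buffer, key, keep = [], obj_key, False
        buffer.append(obj)
        keep = keep or matches(obj)
    # Do not forget the last object in the input.
    if keep:
        yield from buffer
```

Only the versions of a single object are held in memory at any time, so
the buffer stays small even for a full history planet.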

Bye
Frederik


