[OSM-talk] First drop in planet size ?

Nic Roets nroets at gmail.com
Sat Mar 13 20:31:13 GMT 2010


Hello James,

I wanted to split the planet into overlapping bboxes like this (click
to see actual size):
http://dev.openstreetmap.de/gosmore/

On talk I described how I was dissatisfied with osmosis's memory
consumption. So I came up with this observation: most entities will
end up in one or two extracts. And when it's two, it's in a pattern
that is often repeated, say the Africa bbox and the Middle East bbox,
never Africa and Canada. So of the 2^168 possible combinations, only
around 3000 are actually used.

So bboxSplit allocates 16 bits for each entity. Those are then indexes
into the array of 'younions'. If a new node comes along, I check it
against the list of bboxes and it typically matches 1 or 2. So to find
out quickly if I already have that combination of bboxes, I also have
an STL map on the array of younions. A hashtable would have been faster.
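
A minimal sketch of the indexing idea described above, not the code from
bboxSplit.cpp. All names here (BoxSet, UnionTable, classifyNode, BBox) are
hypothetical, and it assumes 168 bboxes, each a plain lat/lon rectangle.
Every distinct combination of bboxes is stored once and each entity keeps
only a 16-bit index to it:

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Assumption: 168 bboxes, matching the 2^168 figure in the text.
constexpr std::size_t kNumBoxes = 168;
using BoxSet = std::bitset<kNumBoxes>;  // one bit per bbox

// Hypothetical rectangle type; the real program may store bboxes differently.
struct BBox {
  double minLat, minLon, maxLat, maxLon;
  bool contains(double lat, double lon) const {
    return lat >= minLat && lat <= maxLat && lon >= minLon && lon <= maxLon;
  }
};

// std::bitset has no operator<, so std::map needs an explicit ordering.
struct BoxSetLess {
  bool operator()(const BoxSet &a, const BoxSet &b) const {
    for (std::size_t i = kNumBoxes; i-- > 0;)
      if (a[i] != b[i]) return b[i];  // a < b if b has the higher differing bit
    return false;
  }
};

// Array of distinct combinations plus a map for fast "seen it already?" lookup.
class UnionTable {
 public:
  std::uint16_t intern(const BoxSet &s) {
    auto it = index_.find(s);
    if (it != index_.end()) return it->second;
    assert(sets_.size() < 65536);  // the index must fit in 16 bits
    auto id = static_cast<std::uint16_t>(sets_.size());
    sets_.push_back(s);
    index_.emplace(s, id);
    return id;
  }
  const BoxSet &get(std::uint16_t id) const { return sets_.at(id); }
  std::size_t size() const { return sets_.size(); }

 private:
  std::vector<BoxSet> sets_;
  std::map<BoxSet, std::uint16_t, BoxSetLess> index_;
};

// Test a node against every bbox and return the 16-bit index of its combination.
std::uint16_t classifyNode(UnionTable &table, const std::vector<BBox> &boxes,
                           double lat, double lon) {
  BoxSet s;
  for (std::size_t i = 0; i < boxes.size() && i < kNumBoxes; ++i)
    if (boxes[i].contains(lat, lon)) s.set(i);
  return table.intern(s);
}
```

Two nodes that fall in the same set of bboxes get the same index, so the
table only grows when a genuinely new combination appears. Swapping the
std::map for std::unordered_map with a hash on the bitset would be the
hashtable variant mentioned above.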

Ways and relations also trigger the code that merges younions.
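
The message does not say how the merge works. One plausible reading, shown
here purely as an assumption, is that a way or relation gets the union of
the bbox combinations of its members, and that union is then looked up (or
added) like any other combination. Combo, ComboTable, idOf and mergeMembers
are hypothetical names, not taken from bboxSplit.cpp:

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

constexpr std::size_t kBoxCount = 168;  // assumed number of bboxes
using Combo = std::bitset<kBoxCount>;   // one bit per bbox

struct ComboTable {
  std::vector<Combo> combos;                    // index -> combination
  std::map<std::string, std::uint16_t> lookup;  // bit string -> index

  // Return the index of a combination, adding it if it is new.
  std::uint16_t idOf(const Combo &c) {
    const std::string key = c.to_string();
    auto it = lookup.find(key);
    if (it != lookup.end()) return it->second;
    assert(combos.size() < 65536);  // indexes are 16 bits wide
    auto id = static_cast<std::uint16_t>(combos.size());
    combos.push_back(c);
    lookup.emplace(key, id);
    return id;
  }
};

// Assumed merge rule: OR together the combinations of all members
// (e.g. the nodes of a way) and return the index of the result.
std::uint16_t mergeMembers(ComboTable &table,
                           const std::vector<std::uint16_t> &memberIds) {
  Combo merged;
  for (std::uint16_t id : memberIds) merged |= table.combos.at(id);
  return table.idOf(merged);
}
```

Under this rule, a way with one node in the Africa bbox and one in the
Middle East bbox ends up in both extracts, while a way whose nodes all share
one combination simply reuses that existing index.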

bboxSplit is faster than the corresponding bunzip and any program that
uses libxml, i.e. very fast.

Regards,
Nic

On Sat, Mar 13, 2010 at 10:03 PM, jamesmikedupont at googlemail.com
<jamesmikedupont at googlemail.com> wrote:
> That is very deep c++ code!
> care to comment on how it works?
> would be very interested to understand its performance ! looks very fast.
> mike
>
> On Sat, Mar 13, 2010 at 7:06 PM, Nic Roets <nroets at gmail.com> wrote:
>>
>> My understanding is that all Xml compliant* parsers will abort at the
>> file offsets that Frederik mentions.
>> My advice is to use the egrep filter when in doubt, because you will
>> lose no more than a dozen lines in a planet file of billions of
>> lines.
>>
>> *: (My split program is not compliant and will happily ignore these
>> errors:
>>
>> http://trac.openstreetmap.org/browser/applications/rendering/gosmore/bboxSplit.cpp)
>>
>> On Sat, Mar 13, 2010 at 7:44 PM, John Mitchell <mitchelljj98 at gmail.com>
>> wrote:
>> > Will this also be a problem if you try to import via osm2pgsql into
>> > postgres?
>> >
>> > Thanks,
>> >
>> > John
>> >
>> > On 3/13/10, hbogner <hbogner at gmail.com> wrote:
>> >> Thx for help, I'll try it.
>> >>
>> >> Now I have to follow 'dev' too :D
>> >>
>> >> Nic Roets wrote:
>> >>> There's a bug in the code that generated this week's planet. You
>> >>> should either wait until next week or filter the planet with the
>> >>> following command:
>> >>> bzcat /osm/planet-10*.osm.bz2 |egrep -v '&#[0-9]*;'|...
>> >>>
>> >>> There has been a long discussion on 'dev', mentioning other remedies.
>> >>>
>> >>
>> >>
>> >> _______________________________________________
>> >> talk mailing list
>> >> talk at openstreetmap.org
>> >> http://lists.openstreetmap.org/listinfo/talk
>> >>
>> >
>> >
>> > --
>> > John J. Mitchell
>> >
>
>