[OSM-dev] visible-Flag in PBF

Sun May 8 10:15:21 BST 2011

On Sat, May 07, 2011 at 08:25:49PM +0200, Christian Vetter wrote:
> On Sat, May 7, 2011 at 5:46 AM, Scott Crosby <scott at sacrosby.com> wrote:
> > On Fri, May 6, 2011 at 9:24 PM, Christian Vetter <veaac.fdirct at gmail.com> wrote:
> >> With regard to LZMA: I have some C++ code lying around to compress /
> >> decompress LZMA... I can test how much it would affect file size /
> >> decoding speed.
> >
> > Cool. You don't need a full-fledged PBF reader&writer to test it. Just
> > enough to parse out blobs and write blobs.
> 
> I quickly hacked it into MoNav's importer and tested it on the extract
> of Germany. I used maximum compression ( dictionary size == blob size
> ):
> 
> size of zlib blobs: 849MB
> size of lzma blobs: 762MB
> time spent decoding zlib blobs: 6.986 s
> time spent decoding lzma blobs: 49.078 s
> 
> We can reduce the size a bit by using lzma ( ~10% ) and adding it
> isn't much work ( about 10 lines of code for encoding / decoding ).
> However, it doesn't seem worth it, considering that it makes parsing
> slower. Increasing the block size would most likely not increase the
> compression: I tested compressing all uncompressed blobs at once using
> a 64MB dictionary and the size only decreased to 728MB

For me that makes the decision easy then: 10% space savings are not worth
the extra time needed for decompression. 10% space savings are a) not
a big issue at current disk sizes and b) will be eaten up by a few months
of OSM growth. Time savings, on the other hand, are my biggest issue in most
applications with OSM, because I want to work with data that's as current
as possible.
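
To put the trade-off in numbers, here is a small sketch that only does the
arithmetic on the figures quoted above (Germany extract). The constant and
function names are made up for illustration; nothing here is measured anew.

```python
# Figures as quoted in the mail above (sizes in MB, decode times in seconds).
ZLIB_SIZE_MB = 849
LZMA_SIZE_MB = 762
LZMA_64MB_DICT_SIZE_MB = 728  # all blobs compressed at once, 64MB dictionary
ZLIB_DECODE_S = 6.986
LZMA_DECODE_S = 49.078


def percent_saved(baseline, candidate):
    """Size reduction of candidate relative to baseline, in percent."""
    return 100.0 * (baseline - candidate) / baseline


def slowdown(baseline_s, candidate_s):
    """How many times longer the candidate takes than the baseline."""
    return candidate_s / baseline_s


if __name__ == "__main__":
    print("lzma per blob saves %.1f%%" % percent_saved(ZLIB_SIZE_MB, LZMA_SIZE_MB))
    print("lzma 64MB dict saves %.1f%%"
          % percent_saved(ZLIB_SIZE_MB, LZMA_64MB_DICT_SIZE_MB))
    print("lzma decoding takes %.1fx as long"
          % slowdown(ZLIB_DECODE_S, LZMA_DECODE_S))
```

So the quoted numbers amount to roughly a tenth less disk space in exchange
for about seven times the decoding time.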

That being said, I would not object to adding an lzma option if others have
different priorities.
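
Anyone who wants to check the trade-off on their own data could start from
something like the following. This is only a rough sketch using Python's
standard zlib and lzma modules on an arbitrary byte string. It is not the
MoNav code used for the test above and it does not parse PBF blobs; the
function names and the sample data are assumptions for illustration.

```python
import lzma
import time
import zlib


def measure(name, compress, decompress, blob, repeats=5):
    """Compress blob once, then time the average decompression."""
    packed = compress(blob)
    start = time.perf_counter()
    for _ in range(repeats):
        unpacked = decompress(packed)
    elapsed = (time.perf_counter() - start) / repeats
    if unpacked != blob:
        raise ValueError("round trip failed for " + name)
    return {"codec": name, "size": len(packed), "decode_s": elapsed}


def compare(blob):
    """Run both codecs at their highest standard compression setting."""
    return [
        measure("zlib", lambda b: zlib.compress(b, 9), zlib.decompress, blob),
        measure("lzma", lambda b: lzma.compress(b, preset=9),
                lzma.decompress, blob),
    ]


if __name__ == "__main__":
    # Hypothetical stand-in data; real blobs will compress very differently.
    sample = b"node id=1 lat=49.0 lon=8.4 highway=residential\n" * 20000
    for row in compare(sample):
        print("%(codec)s: %(size)d bytes, %(decode_s).6f s per decode" % row)
```

Results from a script like this will not match the numbers above, since they
depend on the data, the library versions and the machine.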

Jochen
-- 
Jochen Topf  jochen at remote.org  http://www.remote.org/jochen/  +49-721-388298