[OSM-dev] visible-Flag in PBF

Sat May 7 22:43:11 BST 2011

On Sat, May 7, 2011 at 4:04 PM, Christian Vetter <veaac.fdirct at gmail.com> wrote:

>> There are about 80k blobs. If 1-byte tags are used for the counts,
>> overhead is 9 bytes each:
>>
>>   2 bytes indexdata tag&length in the BlobHeader
>>   3*1 bytes (tags for 3 fields)
>>   2*1 bytes (varint count for N==0)
>>   1*2 bytes (varint count for N < 2**14)
>>
>> I assume that few blobs contain more than one entity type. Using
>> booleans only saves one byte of overhead compared to this.
>
> I believe we can get away with 4 bytes:
> 2 bytes tag + length
> 1 + 1 byte for one field ( bool )
> We omit all fields that equal zero ( they are optional ) and the
> reader can then treat that as if it were set to zero

I think that this is a bad idea, because then you can't easily
distinguish a real count of zero from a file written by a program
that doesn't set counts at all.
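
For anyone who wants to check the byte counts quoted above, here is a
rough sketch of the arithmetic. The function names are made up for this
mail and are not part of any PBF library; it assumes 1-byte field tags
and an index message short enough that its length fits in one byte.

```python
def varint_len(n: int) -> int:
    """Bytes needed to encode a non-negative integer as a protobuf varint."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative values")
    length = 1
    while n >= 0x80:  # each varint byte carries 7 payload bits
        n >>= 7
        length += 1
    return length


def counts_overhead(counts) -> int:
    """Per-blob bytes when every count field is written, even if zero."""
    header = 2  # indexdata tag + length byte in the BlobHeader
    body = sum(1 + varint_len(c) for c in counts)  # 1-byte tag + varint value
    return header + body


def bool_overhead(present_types: int) -> int:
    """Per-blob bytes when only the non-empty types are flagged with a bool."""
    header = 2  # indexdata tag + length byte in the BlobHeader
    return header + present_types * 2  # 1-byte tag + 1-byte bool


# Hypothetical blob: 8000 entities of one type, none of the other two.
per_blob_counts = counts_overhead([8000, 0, 0])
per_blob_bools = bool_overhead(1)
total_counts = 80_000 * per_blob_counts  # about 80k blobs in the file
```

With these assumptions the counts variant costs 9 bytes per blob and the
omit-zero bool variant costs 4, so the difference over 80k blobs is a
few hundred kilobytes either way.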

>
>>> About 312s to compress all blobs for Germany. Changing the dictionary
>>> size does not change much. I lowered it all the way down to 64kb and
>>> the values stayed the same essentially.
>>>
>>
>> And deflate?
>>
>
> 185s

Thank you.

Here are the tradeoffs: LZMA takes about 1.7 times as long as deflate
to compress (312s vs. 185s) and produces output about 10% smaller.
Decompression should be a little slower than with deflate. Is that
worth adding an LZMA dependency to every PBF reader?
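
If someone wants to repeat this kind of comparison on their own data,
a minimal sketch using Python's standard zlib and lzma modules might
look like the following. The helper names and the synthetic sample are
assumptions for illustration only; this is not the tool that produced
the timings above, and synthetic data will not reproduce those numbers.

```python
import lzma
import time
import zlib


def measure(name, compress, decompress, data):
    """Time one compress/decompress round trip and report the packed size."""
    t0 = time.perf_counter()
    packed = compress(data)
    t1 = time.perf_counter()
    unpacked = decompress(packed)
    t2 = time.perf_counter()
    if unpacked != data:
        raise RuntimeError(name + " round trip did not restore the input")
    return {
        "codec": name,
        "size": len(packed),
        "compress_s": t1 - t0,
        "decompress_s": t2 - t1,
    }


def compare(data):
    """Run deflate (zlib, level 9) and legacy .lzma on the same bytes."""
    return [
        measure("deflate", lambda d: zlib.compress(d, 9), zlib.decompress, data),
        measure(
            "lzma",
            lambda d: lzma.compress(d, format=lzma.FORMAT_ALONE),
            lzma.decompress,
            data,
        ),
    ]


# Synthetic stand-in for one blob; real blobs would be read from a PBF file.
sample = b"node lat=52.5 lon=13.4 highway=residential;" * 20000
results = compare(sample)
for row in results:
    print(row)
```

To get meaningful figures, feed it the raw (uncompressed) blob payloads
from an actual extract rather than the repeated string used here.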

My verdict is no. Protobufs and deflate have extensive language
support; LZMA doesn't, and it may be superseded by XZ.

Anyone want to make a compelling case for LZMA? Stefan?

Scott