[osmosis-dev] Protobuf OSMHeader format
Scott Crosby
scrosby at cs.rice.edu
Tue Nov 23 13:16:47 GMT 2010
On Tue, Nov 23, 2010 at 12:09 AM, Rainer Kluge <rkluge50 at web.de> wrote:
> Hello,
>
> While trying to decode PBF files with a Perl script, I encounter problems
> with the OSMHeader fileblock. The uncompressed content of the block is:
>
> 0a 1a 08 fe e3 86 8a 47 10 80 ae e9 da 4d 18 80 9c f8 eb ea 02 20 80 88 8d
> 8f e7 02 22 0e 4f 73 6d 53 63 68 65 6d 61 2d 56 30 2e 36 22 0a 44 65 6e 73
> 65 4e 6f 64 65 73 82 01 04 30 2e 33 38 8a 01 24 68 74 74 70 3a 2f 2f 77 77
> 77 2e 6f 70 65 6e 73 74 72 65 65 74 6d 61 70 2e 6f 72 67 2f 61 70 69 2f 30
> 2e 36 00
>
> According to the specification at
> http://wiki.openstreetmap.org/wiki/PBF_Format the structure of the block
> is:
>
> message HeaderBlock {
> optional HeaderBBox bbox = 1;
> /* Additional tags to aid in parsing this dataset */
> repeated string required_features = 4;
> repeated string optional_features = 5;
>
> optional string writingprogram = 16;
> optional string source = 17; // From the bbox field.
> }
>
> With the above data, this results in:
>
> key: 0a -> type=length-delimited field_number=1 -> HeaderBBox bbox
> length: 26
> value: 08 fe e3 86 8a 47 10 80 ae e9 da 4d 18 80 9c f8 eb ea 02 20 80 88 8d
> 8f e7 02
>
> key: 22 -> type=length-delimited field_number=4 -> string required_features
> length: 14
> value: 4f 73 6d 53 63 68 65 6d 61 2d 56 30 2e 36 "OsmSchema-V0.6"
>
> key: 22 -> type=length-delimited field_number=4 -> string required_features
> length: 10
> value: 44 65 6e 73 65 4e 6f 64 65 73 "DenseNodes"
>
> key: 82 -> type=length-delimited field_number=16 -> string writingprogram
>
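The key (the field number combined with the wire type bits) has the value
0x82: field number 16 shifted left by 3 bits, OR'ed with wire type 2. That
value is itself VarInt encoded ahead of the field contents. A VarInt byte
carries only 7 bits of payload, so 0x82 does not fit in one byte and is
encoded as 0x82 0x01. The 0x01 you are seeing is the second byte of the key,
not part of the length.
That's also why I used 16 as a tag number: to save the 1-byte tags (field
numbers 1 to 15) for future use.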
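The arithmetic can be checked with a few lines of Python. This is only a
sketch of the VarInt rule described above, not code taken from Osmosis or
the protobuf library; the function names encode_varint and decode_varint are
made up for illustration.

```python
def encode_varint(value):
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more bytes follow: set the continuation bit
        else:
            out.append(low7)         # last byte: continuation bit stays clear
            return bytes(out)


def decode_varint(data, pos=0):
    """Decode a varint starting at data[pos]; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift  # low 7 bits are payload, least significant group first
        if not byte & 0x80:
            return value, pos
        shift += 7


# Key for field number 16 with wire type 2 (length-delimited).
key = (16 << 3) | 2
print(hex(key), encode_varint(key).hex())

# The bytes from the dump: two-byte key, one-byte length, then the string.
data = bytes.fromhex("82 01 04 30 2e 33 38")
key, pos = decode_varint(data)
length, pos = decode_varint(data, pos)
print(key >> 3, key & 0x07, length, data[pos:pos + length])
```

Read this way, 82 01 is the key for field 16, 04 is the length, and the
four bytes after it are the writingprogram string.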
> Next should follow the field length of the length-delimited string as a
> varint encoded integer, followed by the specified number of characters.
> However, there is:
>
> 01 04 30 2e 33 38 8a
>
> which would be fine *without* the leading 0x01:
>
> length: 4
> value: 30 2e 33 38 "0.38"
>
> What is this extra 01 at the beginning of the data length, a
> misunderstanding on my side, a problem in the doc or a bug?
>
Hope my explanation above helped. You're looking at protobufs in fine detail
--- I know the encoding format, but I've never looked at them at the hexdump
level; I've always assumed that the library does the right coding. A safe
assumption, given how much Google uses them internally.
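For anyone who does want to check a dump by hand, here is a rough sketch
that walks the top-level fields of the header bytes quoted above. It is an
illustration only, not the library's parser: walk_fields is a hypothetical
helper, it handles only wire types 0 and 2, and it leaves the nested bbox
message undecoded. The final 00 byte of the quoted dump is left out here,
since a key of 0 would mean field number 0, which is not a valid field.

```python
def decode_varint(data, pos):
    """Decode a base-128 varint at data[pos]; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def walk_fields(data):
    """Yield (field_number, wire_type, payload) for each top-level field.

    Only wire type 0 (varint) and wire type 2 (length-delimited) are handled.
    """
    pos = 0
    while pos < len(data):
        key, pos = decode_varint(data, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:
            value, pos = decode_varint(data, pos)
            yield field_number, wire_type, value
        elif wire_type == 2:
            length, pos = decode_varint(data, pos)
            yield field_number, wire_type, data[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"unhandled wire type {wire_type}")


# Header bytes as quoted in the message, minus the trailing 00.
HEADER_HEX = """
0a 1a 08 fe e3 86 8a 47 10 80 ae e9 da 4d 18 80 9c f8 eb ea 02 20 80 88 8d
8f e7 02 22 0e 4f 73 6d 53 63 68 65 6d 61 2d 56 30 2e 36 22 0a 44 65 6e 73
65 4e 6f 64 65 73 82 01 04 30 2e 33 38 8a 01 24 68 74 74 70 3a 2f 2f 77 77
77 2e 6f 70 65 6e 73 74 72 65 65 74 6d 61 70 2e 6f 72 67 2f 61 70 69 2f 30
2e 36
"""
header = bytes.fromhex("".join(HEADER_HEX.split()))

fields = list(walk_fields(header))
for field_number, wire_type, payload in fields:
    print(field_number, wire_type, payload)
```

Fields 16 and 17 both come out with two-byte keys (82 01 and 8a 01), while
fields 1 and 4 use a single key byte.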
Scott