[OSM-dev] pbf2osm development has started [code to test!]

VeaaC FDIRCT veaac.fdirct at gmail.com
Thu Sep 30 11:05:24 BST 2010

I implemented a PBF parser for MoNav [1]. The speedup over XML and
compressed XML is very good: it takes about 64 seconds to parse the
German extract instead of 760 seconds.

> But what I already discussed with Scott, we *need* a good 'has everything'
> PBF file. Something that can test a parser and has expected output.

I agree with this. I do not want to release a new version of MoNav with
PBF support if parts of my code remain untested. Maybe a set of
unit test files could be provided, each testing a different feature.
Some I can think of:
 - All compression types
 - All basic data types ( Nodes, Ways, Relations, DenseNodes )
 - Unsupported data types via new required features
 - Illegal / corrupted files:
  * Corrupt lzma / zlib / bzip2 data
  * Corrupt size
  * Corrupt Protobuf data
  * Empty PrimitiveGroups
  * PrimitiveGroups with more than one data type
  * Empty PrimitiveBlocks
  * Illegal string IDs
  * Arrays which should be same size have different sizes
  * etc...
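
To illustrate what a few of these checks could look like, here is a minimal
sketch in Python. It assumes a PrimitiveGroup has already been decoded into
plain dicts and lists; that representation, the function name check_group and
the field names "nodes" / "dense" / "ways" / "relations" are made up for this
example and are not the generated protobuf classes. It only covers empty
groups, groups with mixed data types, key/value arrays of different sizes, and
string IDs outside the string table.

```python
from typing import Any, Dict, List

# Hypothetical field names of a decoded PrimitiveGroup (illustration only).
ENTITY_KINDS = ("nodes", "dense", "ways", "relations")


def check_group(group: Dict[str, Any], string_count: int) -> List[str]:
    """Return a list of problems found in one decoded PrimitiveGroup."""
    problems = []

    # A group should carry exactly one kind of data.
    present = [kind for kind in ENTITY_KINDS if group.get(kind)]
    if not present:
        problems.append("empty PrimitiveGroup")
    if len(present) > 1:
        problems.append("PrimitiveGroup mixes data types: " + ", ".join(present))

    for way in group.get("ways", []):
        keys, vals = way.get("keys", []), way.get("vals", [])
        # keys and vals are parallel arrays and must have the same size.
        if len(keys) != len(vals):
            problems.append(f"way {way['id']}: {len(keys)} keys but {len(vals)} vals")
        # Every string ID must point into the string table.
        for sid in list(keys) + list(vals):
            if not 0 <= sid < string_count:
                problems.append(f"way {way['id']}: illegal string ID {sid}")
    return problems
```

A test file for one of the cases above would then be expected to produce
exactly one kind of problem, for example a way with two keys and one value
reporting a size mismatch.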

Some additions that might make parsing faster for me would be:
 a) An optional feature to have entities sorted topologically, that
is, a Node / Way / Relation is written to the file only after all
entities referencing it have been written.
 b) The possibility to tell from the BlockHeader alone what kind of data
the Blob contains, e.g., whether it contains nodes / ways / relations.
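
The ordering in a) can be checked with a single pass over the file. The
following Python sketch assumes the entities are available as a sequence of
(key, references) pairs in file order, where a key is a (kind, id) tuple; this
input format and the name order_violations are assumptions for the example,
not part of the PBF format. It reports every case where a referenced entity
was written before the entity that refers to it.

```python
from typing import Iterable, List, Tuple

Key = Tuple[str, int]  # (kind, id), e.g. ("node", 42); hypothetical format


def order_violations(
    entities: Iterable[Tuple[Key, List[Key]]]
) -> List[Tuple[Key, Key]]:
    """Return (referrer, referenced) pairs that break the ordering.

    The ordering holds if an entity appears only after all entities that
    reference it, so a reference must never point to something already seen.
    """
    written = set()
    violations = []
    for key, refs in entities:
        for ref in refs:
            if ref in written:
                violations.append((key, ref))
        written.add(key)
    return violations


# A way followed by its nodes satisfies the ordering.
good = [
    (("way", 1), [("node", 1), ("node", 2)]),
    (("node", 1), []),
    (("node", 2), []),
]
print(order_violations(good))  # prints []
```

References to entities that are missing from the file (as can happen in an
extract) are never seen and therefore do not count as violations in this
sketch.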


Christian Vetter

[1] MoNav Website: http://code.google.com/p/monav/
