[OSM-dev] Block sizes in PBF Format

Scott Crosby scrosby at cs.rice.edu
Tue Nov 30 16:30:46 GMT 2010


On Tue, Nov 30, 2010 at 3:28 AM, Jochen Topf <jochen at remote.org> wrote:
> The PBF_Format wiki page states: "The length of a Blob *should* be less than 16
> megabytes and *must* be less than 32 megabytes." But further down it says "I
> collect 8k entities to form a PrimitiveBlock, which is serialized into the
> Blob..."
>
> So what happens if the 8k entities take up more than 32 megabytes? That's 4k
> per entity, which could be reached with large relations. Well, we need quite
> a few of those large relations, but it's good to know where the limits of the
> format are and they should be clearly documented.

I chose those limits so that software could reject bad files without
crashing due to running out of RAM. However, if the limits cause
problems with storing those big relations, that is a limitation in my
osmosis implementation, not in the design of the format.
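
To illustrate the "reject bad files without crashing" idea, here is a
minimal Python sketch of a check a reader could run on a blob's
declared size before allocating any memory for it. The names
(check_declared_size, BlobTooLarge) are hypothetical and not taken from
osmosis, and reading "megabytes" as MiB is an assumption.

```python
# Assumption: "16 megabytes" and "32 megabytes" are read as MiB.
SOFT_LIMIT = 16 * 1024 * 1024  # a blob *should* be smaller than this
HARD_LIMIT = 32 * 1024 * 1024  # a blob *must* be smaller than this


class BlobTooLarge(ValueError):
    """Hypothetical error type: the declared size breaks the hard limit."""


def check_declared_size(declared_size: int) -> bool:
    """Validate a declared blob size before any buffer is allocated.

    Returns True if the size is under the soft limit, False if it is
    between the soft and hard limits, and raises BlobTooLarge otherwise.
    """
    if declared_size < 0 or declared_size >= HARD_LIMIT:
        raise BlobTooLarge(f"declared blob size {declared_size} is not allowed")
    return declared_size < SOFT_LIMIT
```

A reader would call this on the size it parsed from the file and only
then read the blob, so a corrupt or hostile size field cannot trigger a
huge allocation.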

For simplicity in my implementation, I had the osmosis serializer use
the same number of entities in each block, and made that a command
line option (for testing purposes). Nothing in the format requires a
fixed number of entities in each block, and a better implementation
could operate with a variable number of entities in a block, starting
a new block whenever it estimates that the current one is 'too big'.
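
The variable-sized approach could look roughly like the Python sketch
below. It is not the osmosis code: split_into_blocks and estimate_size
are hypothetical names, the per-entity size estimate is supplied by the
caller, and the 16 MiB threshold is an assumed reading of the "should"
limit.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")

# Assumption: close a block once it nears the 16 MiB "should" limit.
SOFT_LIMIT = 16 * 1024 * 1024


def split_into_blocks(
    entities: Iterable[T],
    estimate_size: Callable[[T], int],
    soft_limit: int = SOFT_LIMIT,
    max_entities: int = 8000,
) -> Iterator[List[T]]:
    """Group entities into blocks bounded by estimated size and count."""
    block: List[T] = []
    block_bytes = 0
    for entity in entities:
        size = estimate_size(entity)
        # Close the current block if adding this entity would make it
        # 'too big', or if it already holds max_entities entities.
        too_big = block_bytes + size > soft_limit
        if block and (too_big or len(block) >= max_entities):
            yield block
            block, block_bytes = [], 0
        # An entity larger than soft_limit ends up alone in its own block.
        block.append(entity)
        block_bytes += size
    if block:
        yield block
```

For example, with a limit of 10 and entities whose estimated sizes are
4, 4, 4, 10 and 1, this yields blocks of sizes [4, 4], [4], [10] and
[1]. A real serializer would still need to check the final serialized
size, since the estimate may be off.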

A short-term workaround might be to store only 2k relations in a block,
which allows about 16k per relation on average before the 32 megabyte
limit is reached.

Thanks for the question. I have changed the wiki to note that the 8k
entities per block is an implementation decision, and also to note
that the size limits on a blob apply to the uncompressed size.

Scott


