[OSM-dev] OSM PBF and spatial characteristics of blocks
andrew at fastmail.net
Tue Jan 5 17:56:07 UTC 2016
It’s interesting to see more demand for grabbing up-to-date chunks of PBF around the world.
Keep in mind that while OSMPBF is an essential export format (as the most efficient widely used OSM data format), it is not the only option for storing the OSM entities in your spatially indexed / geographically clustered planet-scale container.
We’ve got similar needs (storage of OSM data at planet scale, rapid extraction of up-to-date, compact binary OSM data for any location) and created the Vanilla Extract project to meet them. There’s the original C version (which is fast but doesn’t do minutely updates) and a newer MapDB-based Java version that does minutely updates but seems slower in practice.
You might be able to extend or borrow code from these projects to meet your needs:
> On 05 Jan 2016, at 18:40, Stadin, Benjamin <Benjamin.Stadin at heidelberg-mobil.com> wrote:
> Thank you. This is enough clarification for me. Then I’ll create an independent store (using the OSM PBF format, but spatially clustered) and recreate the required order for the region on export.
> From: Paul Norman <penorman at mac.com>
> Date: Tuesday, 5 January 2016 at 18:09
> To: "dev at openstreetmap.org" <dev at openstreetmap.org>
> Subject: Re: [OSM-dev] OSM PBF and spatial characteristics of blocks
> On 1/5/2016 8:32 AM, Stadin, Benjamin wrote:
>> I’m thinking about a design for an efficient storage container for OSM PBF (planet size data, minutely updates), for the purpose of TileMaker as well as for an internal application.
> Good to see Tilemaker (https://github.com/systemed/tilemaker) getting some traction.
>> One thing I stumbled on is the usage of the bounding boxes within OSM PBF. The documentation does not clarify the spatial characteristics of the individual FileBlocks. Some questions:
>> Is it correct that there is exactly one HeaderBlock in a .pbf file? If so, the BBOX defined within the HeaderBlock defines the whole region of the .pbf export?
>> What are the spatial characteristics of an individual FileBlock within the FileBlocks sequence? Is a FileBlock generated by any kind of spatial ordering? For example, is it safe to assume that all content of a block is dense and close to one region of the world? Or can this be controlled when creating a .pbf? Even a loose spatial relationship would make it possible to relate FileBlocks to map "tile" regions (a FileBlock may obviously relate to several "tiles", but that would be fine as long as most of its content relates to a certain region).
>> There is a commented-out BBOX definition within the PrimitiveBlock. What remains to be done to enable this proposed BBOX extension? My second question applies to this BBOX as well.
> PBFs are generally ordered by type then ID, so there is no guaranteed spatial clustering. There is a strong correlation between nearby IDs and objects being near each other which makes delta encoding worthwhile.
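To illustrate why the correlation between nearby IDs makes delta encoding worthwhile, here is a minimal sketch in Python. The PBF format stores DenseNodes IDs and coordinates as delta-coded sint64 values; the function names and the sample IDs below are made up for illustration and are not taken from any OSM library.

```python
from itertools import accumulate

def delta_encode(values):
    """Keep the first value as-is, then store each difference from its predecessor."""
    prev = 0
    deltas = []
    for v in values:
        deltas.append(v - prev)
        prev = v
    return deltas

def delta_decode(deltas):
    """Running sum restores the original values."""
    return list(accumulate(deltas))

def zigzag(n):
    """Map a signed 64-bit value to unsigned, as protobuf does for sint64."""
    return (n << 1) ^ (n >> 63)

def varint_len(n):
    """Number of bytes an unsigned protobuf varint needs for n."""
    length = 1
    while n >= 0x80:
        n >>= 7
        length += 1
    return length

def encoded_size(values):
    """Total bytes when each value is written as a zigzag varint."""
    return sum(varint_len(zigzag(v)) for v in values)

# Hypothetical node IDs, sorted ascending as in a typical PBF.
ids = [4100000001, 4100000002, 4100000005, 4100000009]
deltas = delta_encode(ids)
raw_bytes = encoded_size(ids)
delta_bytes = encoded_size(deltas)
```

With sorted IDs every delta after the first is a small positive number that fits in a single varint byte, so `delta_bytes` comes out well below `raw_bytes`. The same effect applies to coordinates only as long as consecutive objects are also close to each other geographically.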
> A lot of software implicitly depends on ordering. Sorting by type is often a hard requirement - doing anything with ways normally requires having parsed all the nodes for geometries. Sorting by ID may be needed depending on how storage algorithms were implemented - software can become less efficient or break if it's expecting ordered IDs and gets unordered.
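For a spatially clustered store, the order described above (nodes, then ways, then relations, each by ascending ID) has to be recreated on export. A rough sketch of that step, assuming entities are plain dicts with `type` and `id` keys and that the store hands back one list of entities per tile; this layout and the names `TYPE_RANK` and `export_order` are hypothetical, not part of any existing tool:

```python
# Rank used to sort by type first: nodes, then ways, then relations.
TYPE_RANK = {"node": 0, "way": 1, "relation": 2}

def export_order(tiles):
    """Merge per-tile entity lists into type-then-ID order, dropping duplicates.

    An entity that spans several tiles (for example a long way) may be stored
    in each of them, so it is emitted only once.
    """
    seen = set()
    entities = []
    for tile in tiles:
        for entity in tile:
            key = (entity["type"], entity["id"])
            if key not in seen:
                seen.add(key)
                entities.append(entity)
    return sorted(entities, key=lambda e: (TYPE_RANK[e["type"]], e["id"]))

# Two hypothetical tiles; way 7 crosses the tile boundary and appears in both.
tile_a = [{"type": "way", "id": 7}, {"type": "node", "id": 12}]
tile_b = [{"type": "node", "id": 3}, {"type": "way", "id": 7},
          {"type": "relation", "id": 1}]
ordered = [(e["type"], e["id"]) for e in export_order([tile_a, tile_b])]
```

Sorting everything in memory is only workable for small extracts. At larger scale a streaming merge of per-tile runs that are already sorted would be the more likely design.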