[OSM-dev] OSM PBF and spatial characteristics of blocks

Andrew Byrd andrew at fastmail.net
Tue Jan 5 18:44:56 UTC 2016


> On 05 Jan 2016, at 19:18, Paul Norman <penorman at mac.com> wrote:
> 
> Both of these seem to be software rather than file format definitions. I had a quick look around the repo, but I couldn't find a format definition except that for standard OSM PBF.

Hi Paul,

They are indeed software. Both store planet-scale OSM data sets and perform on-demand geographic extracts, and the second one also applies minutely updates. I may have misunderstood something, but I was under the impression that the original poster was looking for a way to fetch up-to-date, geographically contiguous chunks of PBF data for use in generating image tiles.


So these two links are not file format definitions; they are essentially special-purpose database systems. Of course, each one has its own internal on-disk representation of bulk OSM data, which is distinct from PBF. I would not consider PBF an optimal bulk storage format if you intend to apply a continuous stream of minutely updates (delete, move, change tags) to arbitrary OSM entities scattered around the world. Performing random-access updates inside PBF files could get awkward. Say, for example, you need to update a tag on way 142563. Where is that way located geographically, in which file, and at what position inside that file? Once you locate the file, you’d need to decompress the entire file and scan through it to find the way, and if the edit makes the file block larger than it was before, you’d need to shift and rewrite the rest of the file.
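
To make the cost of such an in-place edit concrete, here is a minimal Python sketch. It does not read real PBF: the layout (a 4-byte length prefix followed by a zlib-compressed block of text records) is a simplified stand-in I am assuming for illustration, and the names pack_blocks, read_blocks and update_record are hypothetical. It only shows the general pattern: blocks are scanned and decompressed in order until the way is found, and once the recompressed block changes size, everything after it has to be rewritten.

```python
import struct
import zlib


def pack_blocks(blocks):
    """Serialize blocks as [4-byte big-endian length][zlib payload] ... (toy layout, not PBF)."""
    out = bytearray()
    for records in blocks:
        payload = zlib.compress("\n".join(records).encode("utf-8"))
        out += struct.pack(">I", len(payload)) + payload
    return bytes(out)


def read_blocks(data):
    """Yield (offset, compressed_size, records) for each block, in file order."""
    offset = 0
    while offset < len(data):
        (size,) = struct.unpack_from(">I", data, offset)
        start = offset + 4
        records = zlib.decompress(data[start:start + size]).decode("utf-8").split("\n")
        yield offset, size, records
        offset = start + size


def update_record(data, key, new_record):
    """Replace the record whose text starts with `key`.

    Returns (new_data, blocks_scanned, bytes_shifted). There is no index,
    so blocks are decompressed one by one until the key is found.
    """
    for scanned, (offset, size, records) in enumerate(read_blocks(data), start=1):
        for i, rec in enumerate(records):
            if rec.startswith(key + " "):
                records[i] = new_record
                payload = zlib.compress("\n".join(records).encode("utf-8"))
                tail = data[offset + 4 + size:]
                new_data = data[:offset] + struct.pack(">I", len(payload)) + payload + tail
                # If the block changed size, every byte after it moves.
                shifted = len(tail) if len(payload) != size else 0
                return new_data, scanned, shifted
    raise KeyError(key)


# Made-up sample data; the tags are illustrative only.
blocks = [
    ["way 1 highway=residential", "way 2 highway=primary"],
    ["way 142563 highway=track", "way 142564 building=yes"],
    ["way 200000 waterway=stream"],
]
NEW = "way 142563 highway=track surface=gravel name=Long_Forest_Road"

data = pack_blocks(blocks)
new_data, scanned, shifted = update_record(data, "way 142563", NEW)
print(f"scanned {scanned} blocks, shifted {shifted} trailing bytes")
```

On a three-block toy file the shifted tail is tiny, but the same pattern applied to a planet-sized file means rewriting everything after the edited block for each change, which is why a purpose-built store with its own index is attractive here.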

So I was providing these repos as examples of systems that were intended from the beginning to allow minutely updates and arbitrary PBF extracts. They might need to be completed or adapted, but could provide a starting point.

The second (Java) version that I cited is, however, part of an OSM handling library, which does specify its own file format, called VEX. Like PBF, it is a data exchange format for passing around OSM regions in files. But neither PBF nor VEX is used to store the bulk, spatially indexed data internally.

Andrew

