[OSM-dev] History File Formats

Brett Henderson brett at bretth.com
Wed May 11 14:31:00 BST 2011

Hi All,

With the recent discussions around deciding on file extensions (e.g. osm,
osc, osh, pbf, etc) it seemed a good time to raise a question I've had for a
while now.

What is the relationship between osmChange (osc) files, and full history
files (osh?) with visible attributes?

When I first created the osc format for Osmosis, I also considered using the
visible attribute approach.  At the time I decided against it for a couple
of reasons:

   - I wanted to avoid confusion about what type of data is contained in a
   - Code re-use is easier and the XML schema definition is more precise
   with osc because the node/way/relation elements are always the same,
   without optional elements appearing in some formats and not others.
   - At the time I considered the visible flag attribute an aspect of our
   database implementation and not part of the OSM logical data model.
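
To make the relationship between the two representations concrete, here is a
minimal Python sketch that rewrites an osmChange document as a flat list of
element versions carrying a visible attribute.  It assumes the usual
create/modify/delete action blocks of osmChange and an <osm> root element for
the history output; the function name and the sample data are hypothetical,
and this is not how Osmosis or the full history dump actually do it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: three versions of one node in osmChange form.
SAMPLE_OSC = """<osmChange version="0.6">
  <create><node id="1" version="1" lat="1.0" lon="2.0"/></create>
  <modify><node id="1" version="2" lat="1.5" lon="2.0"/></modify>
  <delete><node id="1" version="3" lat="1.5" lon="2.0"/></delete>
</osmChange>"""


def osc_to_history(osc_xml):
    """Flatten osmChange action blocks into elements with a visible flag."""
    change = ET.fromstring(osc_xml)
    # Assumption: the history output uses an <osm> root element.
    history = ET.Element("osm", version=change.get("version", "0.6"))
    for action in change:  # <create>, <modify> or <delete>
        # A delete action maps to visible="false"; anything else is visible.
        visible = "false" if action.tag == "delete" else "true"
        for element in action:  # node, way or relation
            element.set("visible", visible)
            history.append(element)
    return ET.tostring(history, encoding="unicode")


if __name__ == "__main__":
    print(osc_to_history(SAMPLE_OSC))
```

Going the other way needs slightly more care, since create and modify both map
to visible="true" and would have to be told apart by version number.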

I then used the osc file format to create not only the commonly used
daily/hourly/minutely replication files, but also a complete dump of
database history updated daily (stored in one file per day).

I was slightly surprised then to see the creation of the new full history
files.  Don't get me wrong, I don't have an issue with it and choice is
good.  But I do wonder why we've gone back to a single massive file that is
updated rarely and requires a full download each time, when the existing
files allow incremental download of recent changes.  If required, all the
files can be merged into a single file client-side without much processing
overhead.
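
As a rough illustration of such a client-side merge, the Python sketch below
concatenates several osmChange documents into one.  It assumes the documents
are supplied oldest first and are small enough to hold in memory; the function
name is hypothetical, and a real merge over the full history would want a
streaming parser instead.

```python
import xml.etree.ElementTree as ET


def merge_osc(documents):
    """Concatenate osmChange documents, given oldest first, into one."""
    merged = ET.Element("osmChange", version="0.6")
    for doc in documents:
        # Copy each create/modify/delete block across in its original order.
        for action in ET.fromstring(doc):
            merged.append(action)
    return ET.tostring(merged, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical one-element daily files.
    day1 = ('<osmChange version="0.6"><create>'
            '<node id="1" version="1" lat="1.0" lon="2.0"/>'
            '</create></osmChange>')
    day2 = ('<osmChange version="0.6"><delete>'
            '<node id="1" version="2" lat="1.0" lon="2.0"/>'
            '</delete></osmChange>')
    print(merge_osc([day1, day2]))
```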

It leaves me with a few questions:

   - Are the Osmosis-based daily full history extracts even used?  Should I
   disable/delete them?
   - Are the history extracts in an unsuitable format and hence unused?
   - Should Osmosis switch over to using the osh format instead of osc to
   represent change data?
   - Does the full history single file contain additional useful data that
   is required?  Changesets perhaps?

What do people think?

