[OSM-dev] OSM Date Formats

Jon Burgess jburgess777 at googlemail.com
Sat Sep 29 12:23:54 BST 2007


On Sat, 2007-09-29 at 11:25 +1000, Brett Henderson wrote:
> Hi All,
> 
> Dates are currently represented in several different ways in osm files, 
> which adds complexity and hurts performance when parsing them.
> 
> JOSM writes dates in this format: "2007-07-22 13:42:29"
> Osmosis writes dates in this format: "2007-07-10T11:32:32.000Z"
> planet.rb writes dates in this format: "2007-02-12T18:43:01+00:00"

> Parsing these files is tricky.  Osmosis actually uses three separate 
> parsers: a custom parser reading the UTC format for speed, a standard 
> xml date parser as a fallback, and a customised JOSM parser as a second 
> fallback for all remaining cases.  

That's only 3 different formats. The Ruby date/format.rb code tries to
parse 11 different formats. Parsing date/time strings is a big pain.
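
To illustrate the fallback approach described above, here is a minimal Python sketch. It is not the Osmosis code (which is Java). The name `parse_osm_timestamp` and the format list are hypothetical, built only from the three example strings quoted in this thread. Treating the zone-less JOSM string as UTC is an assumption.

```python
from datetime import datetime, timezone

# Hypothetical format list, derived from the three examples in this thread.
# Formats are tried in order, so the expected common case goes first.
FORMATS = (
    "%Y-%m-%dT%H:%M:%S.%fZ",  # Osmosis:   2007-07-10T11:32:32.000Z
    "%Y-%m-%dT%H:%M:%S%z",    # planet.rb: 2007-02-12T18:43:01+00:00
    "%Y-%m-%d %H:%M:%S",      # JOSM:      2007-07-22 13:42:29
)


def parse_osm_timestamp(text):
    """Try each known format in turn and return a UTC datetime."""
    for fmt in FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue  # fall through to the next parser
        if parsed.tzinfo is None:
            # Assumption: strings without a zone (or with a literal Z) are UTC.
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed.astimezone(timezone.utc)
    raise ValueError("unrecognised date format: %r" % text)
```

Every failed attempt costs a full parse, which is one reason a single agreed format is attractive.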


> 
> I'd like to standardise on a common format.  The custom Osmosis parser 
> provides almost 10x speed improvement over the generic java xml date 
> parser but only works for a single format (currently the Osmosis one).  I 
> don't want to write custom parsers for every format combination out 
> there.  I thought that Osmosis was going to become the new planet dumper, 
> which would have made the problem go away for me, but it appears that's 
> no longer the case with planet.c stepping in.

The string generated by planet.c can be easily modified to fit any
standard. It is currently modelled on planet.rb to make it easier to
verify the consistency with the ruby script.

> I'd like to standardise on a UTC date with Z suffix similar to the 
> osmosis example.  I am willing to remove the millisecond information to 
> make it even shorter if necessary (I'll have to write my own formatter 
> but not a big job).  As a nice side effect this would noticeably reduce 
> the size of the planet.
> 
> Thoughts?

"2007-07-10T11:32:32Z" seems OK to me.
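
For reference, a minimal Python sketch of writing and reading that exact format. The names `format_utc` and `parse_utc_fixed` are hypothetical. The fixed-position reader only illustrates the idea behind a single-format custom parser; it is not the Osmosis implementation.

```python
from datetime import datetime, timezone


def format_utc(moment):
    """Format a timezone-aware datetime as UTC with a Z suffix, no milliseconds."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def parse_utc_fixed(text):
    """Parse exactly 'YYYY-MM-DDTHH:MM:SSZ' by slicing at fixed offsets."""
    if (len(text) != 20 or text[4] != "-" or text[7] != "-"
            or text[10] != "T" or text[13] != ":" or text[16] != ":"
            or text[19] != "Z"):
        raise ValueError("not in YYYY-MM-DDTHH:MM:SSZ form: %r" % text)
    # int() and datetime() raise ValueError for non-numeric or out-of-range fields.
    return datetime(int(text[0:4]), int(text[5:7]), int(text[8:10]),
                    int(text[11:13]), int(text[14:16]), int(text[17:19]),
                    tzinfo=timezone.utc)
```

The resulting string is 20 characters, against 24 for the current Osmosis example and 25 for the planet.rb one.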

If we want to avoid an enormous planetdiff file for the transition then
I'll need to enhance planetdiff to parse the dates instead of treating
them as strings.
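
A rough Python sketch of the difference between comparing dates as strings and comparing them as parsed values. The helper `same_instant` is hypothetical and says nothing about how planetdiff is actually structured. It relies on Python 3.7 or later, where `%z` accepts both `+00:00` and `Z`.

```python
from datetime import datetime

# One pattern covers both the planet.rb style and the proposed Z style,
# because %z accepts "+00:00" as well as "Z" (Python 3.7+).
STAMP = "%Y-%m-%dT%H:%M:%S%z"


def same_instant(old, new):
    """Return True when two timestamp strings denote the same moment."""
    return datetime.strptime(old, STAMP) == datetime.strptime(new, STAMP)


old = "2007-02-12T18:43:01+00:00"
new = "2007-02-12T18:43:01Z"
print(old == new)              # string comparison reports a change
print(same_instant(old, new))  # value comparison does not
```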

	Jon





