[OSM-dev] Final kinks in osmosis planet dumping

Brett Henderson brett at bretth.com
Mon Sep 10 01:41:47 BST 2007


Hi All,

I'm trying to tidy up any loose ends in osmosis that prevent it from 
being used as a planet dumping tool.  The main advantages it has over 
the existing planet.rb script are:
* Faster dumping due to fewer database queries and lower CPU usage.
* Inclusion of user attributes against entities.

However, I'd like to avoid breaking too many things downstream or 
causing any other avoidable problems.  I'm including a number of things 
here to initiate some discussion.  Feel free to chime in with any 
additional things to consider.

**** 1. Different version number.
Osmosis currently writes the following document element:
<osm version="0.4" generator="Osmosis">
This differs from planet.rb which writes:
<osm version="0.3" generator="OpenStreetMap planet.rb">
For comparison, the api returns:
<osm version="0.4" generator="OpenStreetMap server">

Given that osmosis can write osm files in a large number of different 
scenarios (i.e. it is not just used for planet dumps), I'm hoping to 
avoid having to change the generator attribute in different scenarios.
I'm also hoping that a 0.4 version attribute is acceptable in all cases.

**** 2. Lack of bound elements.
planet.rb adds this element at the top of the file:
<bound box="-90,-180,90,180" 
origin="http://www.openstreetmap.org/api/0.4" />

osmosis doesn't have the ability to generate a <bound> element at the 
top of the file.  Can we live without this?  It is not a simple problem 
to solve and will require some rework inside osmosis to allow data 
destination tasks to receive bounding box information from data 
generation tasks.  Currently the tasks at each end of a pipeline are 
fairly independent.

**** 3. Re-ordered attributes.
osmosis is writing element attributes in a different order to planet.rb.

planet.rb writes node attributes in the order id, lat, lon, timestamp.
osmosis writes node attributes in the order id, timestamp, user, lat, lon.

I initially chose the osmosis ordering because the id, timestamp and 
user attributes are common to all entity types; each entity type then 
adds its own attributes at the end.  This should be fairly simple to 
change though.

**** 4. Additional indenting whitespace.
osmosis is currently using 4 space indenting, whereas planet.rb is using 
2 space indenting.
I can change osmosis to use 2 space indenting if it helps reduce file 
sizes.  Should I drop it to 1 space indenting to further reduce file size?

**** 5. Inclusion of database password on command line.
Currently the only way to provide a database password to osmosis is on 
the command line.  Presumably this will allow other users on the same 
system to see the password (through the use of ps, top, etc).  If this 
is a problem I'll have to update the database tasks to be able to read a 
properties file containing connection information.
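
A minimal sketch of what reading connection details from a properties file 
could look like.  The class name DbAuthSketch and the property keys (host, 
database, user, password) are assumptions made up for illustration; they are 
not existing osmosis names.  Only java.util.Properties from the standard 
library is used.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical holder for connection details; not an existing osmosis class.
class DbAuthSketch {
    final String host;
    final String database;
    final String user;
    final String password;

    DbAuthSketch(String host, String database, String user, String password) {
        this.host = host;
        this.database = database;
        this.user = user;
        this.password = password;
    }

    // Reads key=value pairs in the java.util.Properties text format.
    // The key names are assumptions chosen for this sketch.
    static DbAuthSketch load(Reader in) {
        Properties props = new Properties();
        try {
            props.load(in);
        } catch (IOException e) {
            throw new IllegalStateException("Unable to read connection properties", e);
        }
        return new DbAuthSketch(
            props.getProperty("host", "localhost"), // host falls back to localhost
            required(props, "database"),
            required(props, "user"),
            required(props, "password"));
    }

    // Fails loudly when a mandatory key is absent.
    private static String required(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null) {
            throw new IllegalArgumentException("Missing property: " + key);
        }
        return value;
    }

    public static void main(String[] args) {
        // A StringReader stands in for a FileReader on the real file.
        DbAuthSketch auth = load(new StringReader(
            "host=db.example.org\ndatabase=osm\nuser=osm\npassword=secret\n"));
        System.out.println(auth.user + "@" + auth.host + "/" + auth.database);
    }
}
```

In real use the Reader would come from a file that only the dumping user can 
read, so the password never appears in the process list.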

**** 6. Minimum of Java 1.6
Dev currently has jdk1.5 installed, which prevents osmosis from running.  The 
only code requiring 1.6 is the --bounding-polygon task.
I have two options:
a. Rework osmosis to only require 1.5 (either by temporarily deleting 
the polygon task, creating a branch without the polygon task, reworking 
the polygon task to use older features, etc).
b. Upgrade dev to jdk 1.6.

My preference is obviously b, but I can look into alternatives if that 
isn't feasible or straightforward.

**** 7. Different rounding of lat/lon coordinates.
The Java DecimalFormat appears to be rounding numbers in a slightly 
different way to whatever planet.rb is using (sprintf?).  This only 
occurs when a 5 has to be rounded one way or the other.  I don't think 
this is an issue and short of writing my own decimal formatter I don't 
think there's much I can do about this one.
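
For reference, a small sketch of how the rounding of a tie depends on the 
rounding mode.  DecimalFormat defaults to RoundingMode.HALF_EVEN, and since 
Java 1.6 the mode can be changed with setRoundingMode.  The two-decimal 
pattern, the class name and the sample values are assumptions for 
illustration only, and I have not verified which mode (if any) reproduces 
the planet.rb output.

```java
import java.math.RoundingMode;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical demo class; not part of osmosis.
class CoordinateRoundingSketch {

    // Formats a value to two decimal places using the given rounding mode.
    // Locale.US pins the decimal separator to '.' regardless of platform locale.
    static String format(double value, RoundingMode mode) {
        DecimalFormat formatter =
            new DecimalFormat("0.00", new DecimalFormatSymbols(Locale.US));
        formatter.setRoundingMode(mode);
        return formatter.format(value);
    }

    public static void main(String[] args) {
        // 0.125 is exactly representable in binary, so it is a true tie
        // when rounding to two decimal places.
        double tie = 0.125;
        System.out.println(format(tie, RoundingMode.HALF_EVEN)); // 0.12
        System.out.println(format(tie, RoundingMode.HALF_UP));   // 0.13
    }
}
```

Values that are not exact ties come out the same under both modes, which 
fits the observation that only a trailing 5 is affected.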


Any feedback on these issues (or others that I've missed) would be 
appreciated.

Cheers,
Brett




