[OSM-dev] Final kinks in osmosis planet dumping
Brett Henderson
brett at bretth.com
Mon Sep 10 01:41:47 BST 2007
Hi All,
I'm trying to tidy up any loose ends in osmosis that prevent it from
being used as a planet dumping tool. The main advantages it has over
the existing planet.rb script are:
* Faster dumping due to fewer database queries and lower CPU usage.
* Inclusion of user attributes against entities.
However, I'd like to avoid breaking too many things downstream or
causing any other avoidable problems. I'm including a number of things
here to initiate some discussion. Feel free to chime in with any
additional things to consider.
**** 1. Different version number.
Osmosis currently writes the following document element:
<osm version="0.4" generator="Osmosis">
This differs from planet.rb which writes:
<osm version="0.3" generator="OpenStreetMap planet.rb">
For comparison, the api returns:
<osm version="0.4" generator="OpenStreetMap server">
Given that osmosis can write osm files in a large number of different
scenarios (i.e. it is not just used for planet dumps), I'm hoping to
avoid having to change the generator attribute in different scenarios.
I'm also hoping that a 0.4 version attribute is acceptable in all cases.
**** 2. Lack of bound elements.
planet.rb adds this element at the top of the file:
<bound box="-90,-180,90,180"
origin="http://www.openstreetmap.org/api/0.4" />
osmosis doesn't have the ability to generate a <bound> element at the
top of the file. Can we live without this? It is not a simple problem
to solve and will require some rework inside osmosis to allow data
destination tasks to receive bounding box information from data
generation tasks. Currently the tasks at each end of a pipeline are
fairly independent.
**** 3. Re-ordered attributes.
osmosis is writing element attributes in a different order to planet.rb.
planet.rb writes node attributes in the order id, lat, lon, timestamp.
osmosis writes node attributes in the order id, timestamp, user, lat, lon.
I initially chose the osmosis ordering because the id, timestamp and
user attributes are common to all entity types; each entity type then
adds its own attributes at the end. This should be fairly simple to
change though.
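
To make the two orderings concrete, here is a minimal sketch of a writer
that takes the attribute order as a parameter. It is not the osmosis XML
writer: the class name, method name and sample values are hypothetical,
and XML escaping is left out for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not part of osmosis.
class NodeAttributeOrderSketch {
    // Attribute order described for planet.rb (no user attribute).
    static final String[] PLANET_RB_ORDER = {"id", "lat", "lon", "timestamp"};
    // Attribute order described for osmosis (common attributes first).
    static final String[] OSMOSIS_ORDER = {"id", "timestamp", "user", "lat", "lon"};

    // Writes one <node/> element, emitting only the attributes named in
    // 'order', in that order. Values are assumed to need no XML escaping.
    static String nodeElement(Map<String, String> attrs, String[] order, String indent) {
        StringBuilder sb = new StringBuilder(indent).append("<node");
        for (String name : order) {
            String value = attrs.get(name);
            if (value != null) {
                sb.append(' ').append(name).append("=\"").append(value).append('"');
            }
        }
        return sb.append("/>").toString();
    }

    public static void main(String[] args) {
        // Sample values are made up for illustration.
        Map<String, String> attrs = new LinkedHashMap<String, String>();
        attrs.put("id", "1");
        attrs.put("timestamp", "2007-09-10T01:41:47Z");
        attrs.put("user", "example");
        attrs.put("lat", "51.5");
        attrs.put("lon", "-0.1");

        System.out.println(nodeElement(attrs, PLANET_RB_ORDER, "  "));
        System.out.println(nodeElement(attrs, OSMOSIS_ORDER, "    "));
    }
}
```

Keeping the order in a single array means switching between the two
layouts is a one-line change, and the indent argument covers the
whitespace question in the same place.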
**** 4. Additional indenting whitespace.
osmosis is currently using 4-space indenting, whereas planet.rb uses
2-space indenting.
I can change osmosis to use 2 space indenting if it helps reduce file
sizes. Should I drop it to 1 space indenting to further reduce file size?
**** 5. Inclusion of database password on command line.
Currently the only way to provide a database password to osmosis is on
the command line. Presumably this will allow other users on the same
system to see the password (through the use of ps, top, etc). If this
is a problem I'll have to update the database tasks to be able to read a
properties file containing connection information.
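
A minimal sketch of the properties-file approach, using only
java.util.Properties. The class name and the property keys (host,
database, user, password) are assumptions made for illustration; osmosis
does not currently define such a file format.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical holder for connection details; not an osmosis class.
class DbAuthSketch {
    final String host;
    final String database;
    final String user;
    final String password;

    DbAuthSketch(String host, String database, String user, String password) {
        this.host = host;
        this.database = database;
        this.user = user;
        this.password = password;
    }

    // Reads the four assumed keys from a properties source.
    static DbAuthSketch load(Reader reader) {
        Properties props = new Properties();
        try {
            props.load(reader);
        } catch (IOException e) {
            throw new IllegalStateException("Unable to read connection properties", e);
        }
        return new DbAuthSketch(
            require(props, "host"),
            require(props, "database"),
            require(props, "user"),
            require(props, "password"));
    }

    // Fails early with a clear message if a key is missing.
    private static String require(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null) {
            throw new IllegalArgumentException("Missing property: " + key);
        }
        return value;
    }

    public static void main(String[] args) {
        // A StringReader stands in for a file here; a real task would open
        // a FileReader on a path given on the command line.
        DbAuthSketch auth = load(new StringReader(
            "host=localhost\ndatabase=osm\nuser=osm\npassword=secret\n"));
        System.out.println(auth.user + "@" + auth.host + "/" + auth.database);
    }
}
```

With this shape only the file path appears on the command line, so the
password no longer shows up in ps or top; the file itself would still
need permissions that restrict it to the owning user.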
**** 6. Minimum of Java 1.6
Dev currently has jdk1.5 installed, preventing osmosis from running. The
only code requiring 1.6 is the --bounding-polygon task.
I have two options:
a. Rework osmosis to only require 1.5 (either by temporarily deleting
the polygon task, creating a branch without the polygon task, reworking
the polygon task to use older features, etc).
b. Upgrade dev to jdk 1.6.
My preference is obviously b, but I can look into alternatives if that
isn't feasible or straightforward.
**** 7. Different rounding of lat/lon coordinates.
The Java DecimalFormat appears to be rounding numbers in a slightly
different way to whatever planet.rb is using (sprintf?). This only
occurs when a 5 has to be rounded one way or the other. I don't think
this is an issue and short of writing my own decimal formatter I don't
think there's much I can do about this one.
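
One possible explanation, offered as an assumption rather than a
confirmed diagnosis: DecimalFormat rounds half-even by default, while
other formatters commonly round half-up. The sketch below shows how the
two modes diverge on a trailing 5 and uses
DecimalFormat.setRoundingMode, which is only available from Java 1.6.
The class name, the 7-decimal pattern and the sample value are
hypothetical, and BigDecimal is used so that the input is exactly the
decimal written; double values that are not exactly representable in
binary may behave differently.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical sketch; not the osmosis coordinate formatter.
class RoundingSketch {
    // Formats to 7 decimal places with an explicit rounding mode.
    // Locale.US pins the decimal separator to '.'.
    static String format(BigDecimal value, RoundingMode mode) {
        DecimalFormat f = new DecimalFormat(
            "0.0000000", DecimalFormatSymbols.getInstance(Locale.US));
        f.setRoundingMode(mode);
        return f.format(value);
    }

    public static void main(String[] args) {
        // The 8th decimal is exactly 5, so the rounding mode decides.
        BigDecimal lat = new BigDecimal("51.50000005");
        System.out.println(format(lat, RoundingMode.HALF_EVEN));
        System.out.println(format(lat, RoundingMode.HALF_UP));
    }
}
```

If the difference does come down to rounding mode, selecting HALF_UP
this way might avoid the need for a hand-written formatter, though that
would need checking against real planet.rb output.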
Any feedback on these issues (or others that I've missed) would be
appreciated.
Cheers,
Brett