[OSM-dev] Osmosis Replication Statistics

Brett Henderson brett at bretth.com
Sun Sep 2 15:30:22 BST 2007


Brett Henderson wrote:
> Sounds sensible to me.  I'll continue to use that format for now.  I'm 
> adding a new task (maybe --bounding-polygon to match existing 
> --bounding-box ... --extract-polygon could also work) which will look 
> something like this on the command line.
>
> osmosis --read-xml file=planet.osm --bounding-polygon file=polygon.xxx 
> --write-xml file=australia.osm
>
> If we change the format in the future it should be simple to add a new 
> task argument like fileFormat.  For example.
>
> osmosis --read-xml file=planet.osm --bounding-polygon file=polygon.xxx 
> fileFormat=maproom --write-xml file=australia.osm
>   
I've just uploaded osmosis-0.12. It includes three new tasks:
--read-api for downloading directly from the API (subject to bounding box 
size limitations)
--bounding-polygon for extracting sections of a planet
--report for generating user statistics from an OSM file containing user 
attributes (e.g. a JOSM file or a file produced by osmosis using the 
--read-api task).

The biggest downside of this release is that the minimum Java version is 
now 1.6, because the polygon code uses a 1.6 feature 
(java.awt.geom.Path2D.Double). If this causes too many problems I can 
consider reworking it to run on 1.5 as a minimum ... but I'm not keen 
unless I have to.

I've produced a few quick stats on the polygon code running against the 
latest planet. All measurements were taken with timestamp parsing 
disabled. I used polygons obtained from http://www.maproom.psu.edu/dcw/.

Baseline raw read of planet (i.e. no polygon extraction).
time osmosis --read-xml file=planet-070829.osm enableDateParsing=false 
--write-null
real 5m58.885s
user 4m48.087s
sys 0m14.386s

Extract Australia/Victoria polygon (165,369 byte polygon file resulting 
in a 12,734,709 byte osm file).
time osmosis --read-xml file=planet-070829.osm enableDateParsing=false 
--bounding-polygon file=australia_v2pts.txt --write-xml file=victoria.osm
real 6m22.995s
user 5m1.898s
sys 0m14.488s

Extract Germany polygon (241,845 byte polygon file resulting in a 
605,176,716 byte osm file).
time osmosis --read-xml file=planet-070829.osm enableDateParsing=false 
--bounding-polygon file=germany2pts.txt --write-xml file=germany.osm
real 16m31.788s
user 14m46.632s
sys 0m25.296s

Extract Australia polygon (2,438,865 byte polygon file).
This was running very slowly at the point I killed it, so there are no 
timings for it.

Relatively simple polygons are very fast. The complex complete Australia 
polygon was very slow. I can probably speed things up considerably for 
small result sets by calculating a surrounding bounding box and applying 
that first.
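
A minimal sketch of the bounding box idea, for illustration only. This is 
not the osmosis code: the class name BoundingBoxPrefilter and its methods 
are hypothetical, and it assumes the polygon is held as a single closed 
java.awt.geom.Path2D.Double in lon/lat order. The surrounding box is 
computed once, and any point outside it is rejected before the more 
expensive polygon test runs.

```java
import java.awt.geom.Path2D;
import java.awt.geom.Rectangle2D;

// Hypothetical helper, not part of osmosis: rejects points that fall
// outside the polygon's bounding box before running the full polygon test.
class BoundingBoxPrefilter {
    private final Path2D.Double polygon;
    private final Rectangle2D bounds;
    private long fullTests = 0;

    // Each entry of lonLatPoints is {lon, lat}; the ring is closed for you.
    BoundingBoxPrefilter(double[][] lonLatPoints) {
        polygon = new Path2D.Double();
        polygon.moveTo(lonLatPoints[0][0], lonLatPoints[0][1]);
        for (int i = 1; i < lonLatPoints.length; i++) {
            polygon.lineTo(lonLatPoints[i][0], lonLatPoints[i][1]);
        }
        polygon.closePath();
        // Computed once up front, so each rejected point costs only a
        // handful of comparisons.
        bounds = polygon.getBounds2D();
    }

    boolean contains(double lon, double lat) {
        if (!bounds.contains(lon, lat)) {
            return false; // cheap rejection, polygon is never consulted
        }
        fullTests++;
        return polygon.contains(lon, lat);
    }

    // Number of points that needed the full polygon test.
    long getFullTests() {
        return fullTests;
    }

    public static void main(String[] args) {
        // Made-up triangle, not a real country outline.
        BoundingBoxPrefilter filter = new BoundingBoxPrefilter(
                new double[][] {{0, 0}, {10, 0}, {0, 10}});
        System.out.println(filter.contains(1, 1));   // inside the triangle
        System.out.println(filter.contains(50, 50)); // outside the box
        System.out.println(filter.contains(9, 9));   // in the box, outside the triangle
        System.out.println("full tests: " + filter.getFullTests());
    }
}
```

For a small extract from a whole planet, nearly every node should fall 
outside the box, so only a small fraction would reach the polygon test. 
How much that helps in practice would need measuring.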

Hopefully somebody finds it useful.

Cheers,
Brett




