[OSM-dev] Osmosis Replication Statistics
Brett Henderson
brett at bretth.com
Sun Sep 2 15:30:22 BST 2007
Brett Henderson wrote:
> Sounds sensible to me. I'll continue to use that format for now. I'm
> adding a new task (maybe --bounding-polygon to match existing
> --bounding-box ... --extract-polygon could also work) which will look
> something like this on the command line.
>
> osmosis --read-xml file=planet.osm --bounding-polygon file=polygon.xxx
> --write-xml file=australia.osm
>
> If we change the format in the future it should be simple to add a new
> task argument like fileFormat. For example.
>
> osmosis --read-xml file=planet.osm --bounding-polygon file=polygon.xxx
> fileFormat=maproom --write-xml file=australia.osm
>
I've just uploaded osmosis-0.12. It includes three new tasks:

--read-api for downloading directly from the API (subject to bounding box
  size limitations)
--bounding-polygon for extracting sections of a planet
--report for generating user statistics from an OSM file containing user
  attributes (e.g. a JOSM file or a file produced by osmosis using the
  --read-api task)

The biggest downside of this release is that the minimum Java version is
now 1.6. The polygon code uses a 1.6 feature
(java.awt.geom.Path2D.Double). If this causes too many problems I can
consider reworking it to use 1.5 as a minimum ... but I'm not keen
unless I have to.
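
For readers unfamiliar with the class behind the 1.6 requirement, here is a minimal sketch of a point-in-polygon test built on java.awt.geom.Path2D.Double. It is not the osmosis code: the class name, the buildPolygon helper and the rectangular outline are hypothetical and chosen only for illustration.

```java
import java.awt.geom.Path2D;

public class PolygonContainsSketch {

    // Builds a closed polygon from {longitude, latitude} pairs.
    // Hypothetical helper, not part of osmosis.
    public static Path2D.Double buildPolygon(double[][] points) {
        Path2D.Double path = new Path2D.Double();
        path.moveTo(points[0][0], points[0][1]);
        for (int i = 1; i < points.length; i++) {
            path.lineTo(points[i][0], points[i][1]);
        }
        path.closePath();
        return path;
    }

    public static void main(String[] args) {
        // Made-up rectangle loosely surrounding Victoria, for illustration only.
        double[][] outline = {
            {141.0, -39.2}, {150.0, -39.2}, {150.0, -34.0}, {141.0, -34.0}
        };
        Path2D.Double polygon = buildPolygon(outline);

        // A point near Melbourne falls inside the rectangle.
        System.out.println(polygon.contains(144.96, -37.81));
        // A point near Sydney falls outside it.
        System.out.println(polygon.contains(151.21, -33.87));
    }
}
```

Path2D was introduced in Java 1.6, which is why code relying on it cannot run on a 1.5 runtime without being reworked.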

I've produced a few quick stats on the polygon code running against the
latest planet (planet-070829.osm). All measurements were taken with
timestamp parsing disabled (enableDateParsing=false). I used polygons
obtained from http://www.maproom.psu.edu/dcw/.

Baseline raw read of planet (i.e. no polygon extraction).

time osmosis --read-xml file=planet-070829.osm enableDateParsing=false
--write-null

real 5m58.885s
user 4m48.087s
sys 0m14.386s

Extract Australia/Victoria polygon (165,369 byte polygon file resulting
in a 12,734,709 byte OSM file).

time osmosis --read-xml file=planet-070829.osm enableDateParsing=false
--bounding-polygon file=australia_v2pts.txt --write-xml file=victoria.osm

real 6m22.995s
user 5m1.898s
sys 0m14.488s

Extract Germany polygon (241,845 byte polygon file resulting in a
605,176,716 byte OSM file).

time osmosis --read-xml file=planet-070829.osm enableDateParsing=false
--bounding-polygon file=germany2pts.txt --write-xml file=germany.osm

real 16m31.788s
user 14m46.632s
sys 0m25.296s

Extract Australia polygon (2,438,865 byte polygon file).

This was running very slowly when I killed it, so there are no timings
for it.
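
To compare the completed runs against the baseline, the "real" values can be converted to seconds and subtracted. The following is a small sketch of that arithmetic; the class and method names are hypothetical, and the only inputs are the three wall-clock figures reported above.

```java
public class TimingOverheadSketch {

    // Converts a `time`-style value such as "5m58.885s" to seconds.
    public static double toSeconds(String value) {
        int m = value.indexOf('m');
        double minutes = Double.parseDouble(value.substring(0, m));
        // Drop the trailing 's' before parsing the seconds part.
        double seconds = Double.parseDouble(value.substring(m + 1, value.length() - 1));
        return minutes * 60.0 + seconds;
    }

    public static void main(String[] args) {
        double baseline = toSeconds("5m58.885s");
        String[][] runs = { {"Victoria", "6m22.995s"}, {"Germany", "16m31.788s"} };
        for (String[] run : runs) {
            double total = toSeconds(run[1]);
            System.out.printf("%s: +%.1fs over baseline (%.2fx)%n",
                    run[0], total - baseline, total / baseline);
        }
    }
}
```

With these figures the Victoria extract adds about 24 seconds of wall-clock time to the raw read, and the Germany extract adds about 633 seconds. The Germany run also writes a much larger output file, so the difference is not purely the cost of the polygon test.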

Relatively simple polygons are very fast, but the complex complete
Australia polygon was very slow. I can probably speed things up
considerably for small result sets by calculating a surrounding bounding
box and applying that first, so that most nodes are rejected without
running the full polygon test.
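
The bounding box idea can be sketched as follows. This is an illustration under stated assumptions rather than the osmosis implementation: the class name, its methods and the triangle used in main are all hypothetical. The only library calls used are Path2D.getBounds2D(), Rectangle2D.contains() and Path2D.contains().

```java
import java.awt.geom.Path2D;
import java.awt.geom.Rectangle2D;

public class BoundingBoxPrefilterSketch {
    private final Path2D.Double polygon;
    private final Rectangle2D bounds;
    private long exactTests = 0;

    public BoundingBoxPrefilterSketch(Path2D.Double polygon) {
        this.polygon = polygon;
        // Computed once up front so the per-point check is only a few comparisons.
        this.bounds = polygon.getBounds2D();
    }

    public boolean contains(double lon, double lat) {
        // Cheap rejection: a point outside the bounding box cannot be inside the polygon.
        if (!bounds.contains(lon, lat)) {
            return false;
        }
        // Only points that survive the box check pay for the full polygon test.
        exactTests++;
        return polygon.contains(lon, lat);
    }

    // Number of full polygon tests performed so far.
    public long getExactTests() {
        return exactTests;
    }

    public static void main(String[] args) {
        // Made-up triangle standing in for a real country outline.
        Path2D.Double triangle = new Path2D.Double();
        triangle.moveTo(0, 0);
        triangle.lineTo(10, 0);
        triangle.lineTo(0, 10);
        triangle.closePath();

        BoundingBoxPrefilterSketch filter = new BoundingBoxPrefilterSketch(triangle);
        System.out.println(filter.contains(2, 2));   // inside the triangle
        System.out.println(filter.contains(8, 8));   // inside the box, outside the triangle
        System.out.println(filter.contains(20, 20)); // rejected by the box alone
        System.out.println(filter.getExactTests());  // full tests run for the first two points only
    }
}
```

The saving depends on how much of the planet falls outside the box. For a small region most nodes never reach the full test, while for a polygon whose box covers most of the input the pre-check gains little.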
Hopefully somebody finds it useful.
Cheers,
Brett