[OSM-dev] New, faster, planet dump tool

Jon Burgess jburgess777 at googlemail.com
Mon Sep 24 23:35:38 BST 2007


I've just added a C implementation of the planet.rb script to SVN. The
new code is more than 10 times faster in my tests (roughly 14 times by
wall-clock time):

$ time ./planet.rb > /tmp/ruby

real    11m32.474s
user    10m9.822s
sys     1m2.279s


$ time ./planet > /tmp/new

real    0m50.213s
user    0m29.832s
sys     0m9.879s


These tests were done on a small database with data imported from a UK
planet.osm file from a few months ago (about 1GB of uncompressed XML).

The only difference in the output between the two tools is that the
<tag> elements are occasionally in a different order. A diff between the
Ruby and C output for one object looks like:

   <node id="8583156" lat="53.1960850" lon="-2.7614882" timestamp="2006-06-24T19:23:43+01:00">
-    <tag k="place" v="village" />
     <tag k="name" v="Tarvin" />
+    <tag k="place" v="village" />
     <tag k="created_by" v="JOSM" />
   </node>

The ordering produced by the C code matches the order of the
"key=value;key=value;..." pairs in the database. The Ruby script
converts these via a hash, which does not preserve insertion order and
so sometimes changes it. The tag order is pretty much irrelevant anyway.
I've enhanced the planetdiff tool to ignore tag ordering.
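
To illustrate the point about ordering, here is a minimal sketch (not
the actual planet.c or planetdiff code) of splitting a
"key=value;key=value;..." string into tags in database order, and of
comparing two tag strings while ignoring order. The names parse_tags,
same_tags, print_tags and the fixed buffer sizes are made up for this
example, and it assumes that keys and values contain no escaped ';' or
'=' characters and need no XML escaping.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_TAGS 32   /* hypothetical limits, chosen for the sketch */
#define MAX_LEN  64

/* One key/value pair. Fixed-size buffers keep the sketch short. */
struct tag {
    char k[MAX_LEN];
    char v[MAX_LEN];
};

/* Split "k=v;k=v;..." into tags[], preserving the order in the string.
 * Returns the number of tags parsed. Pairs without '=' or with an
 * over-long key/value are skipped. Escaping is NOT handled. */
int parse_tags(const char *s, struct tag *tags, int max)
{
    int n = 0;
    while (*s && n < max) {
        size_t pair_len = strcspn(s, ";");   /* up to next ';' or end */
        const char *eq = memchr(s, '=', pair_len);
        if (eq) {
            size_t klen = (size_t)(eq - s);
            size_t vlen = pair_len - klen - 1;
            if (klen > 0 && klen < MAX_LEN && vlen < MAX_LEN) {
                memcpy(tags[n].k, s, klen);
                tags[n].k[klen] = '\0';
                memcpy(tags[n].v, eq + 1, vlen);
                tags[n].v[vlen] = '\0';
                n++;
            }
        }
        s += pair_len;
        if (*s == ';')
            s++;
    }
    return n;
}

/* Order tags by key, then by value. */
int cmp_tag(const void *a, const void *b)
{
    const struct tag *x = a, *y = b;
    int c = strcmp(x->k, y->k);
    return c ? c : strcmp(x->v, y->v);
}

/* Returns 1 if both strings hold the same tags, in any order. */
int same_tags(const char *a, const char *b)
{
    struct tag ta[MAX_TAGS], tb[MAX_TAGS];
    int na = parse_tags(a, ta, MAX_TAGS);
    int nb = parse_tags(b, tb, MAX_TAGS);
    if (na != nb)
        return 0;
    /* Sorting both sides makes the comparison order-independent. */
    qsort(ta, (size_t)na, sizeof ta[0], cmp_tag);
    qsort(tb, (size_t)nb, sizeof tb[0], cmp_tag);
    for (int i = 0; i < na; i++)
        if (cmp_tag(&ta[i], &tb[i]) != 0)
            return 0;
    return 1;
}

/* Write tags as <tag> elements in the order given (no XML escaping). */
void print_tags(FILE *out, const struct tag *tags, int n)
{
    for (int i = 0; i < n; i++)
        fprintf(out, "    <tag k=\"%s\" v=\"%s\" />\n",
                tags[i].k, tags[i].v);
}
```

Calling parse_tags() and then print_tags() on the stored string emits
the tags in the order they appear in the database, which is the
behaviour described for the C code. same_tags() treats the two
orderings in the diff above as equal, which is the kind of check an
order-insensitive diff needs.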

The rest of the output is byte-for-byte identical, apart from the
header line, which names planet.c as the generator:

<osm version="0.3" generator="OpenStreetMap planet.c">

Hopefully we can start using this for future planet dumps once it has
gone through a little more testing.

	Jon







