[OSM-dev] Optimal free compression algorithm for OSM XML data
lizard
lizard at furcon.de
Fri May 11 14:33:39 BST 2007
Hi,
since I always do stupid stuff, I tried a quick shot at creating a binary
OSM format.
With 5 simple replace commands in Perl I saved ~400 MB in the uncompressed
format.
If we build a "real" binary format for this (not just replacing tags :) )
and then run a normal compressor on that binary file, I think we can get it
even smaller. Additionally, it would be easy to use on systems with less power,
because it doesn't have as much overhead as XML.
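To make the "real" binary format idea a bit more concrete, here is a rough sketch of a fixed-size node record built with Perl's pack. The layout (a type byte, a 32-bit id, and lat/lon stored as integers in units of 1e-7 degree) and the names pack_node and unpack_node are assumptions made up for this illustration, not an agreed format.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical record layout (an assumption, not an agreed format):
#   1 byte  record type (1 = node)
#   4 bytes node id, unsigned, big-endian
#   4 bytes latitude,  signed, big-endian, in units of 1e-7 degree
#   4 bytes longitude, signed, big-endian, in units of 1e-7 degree
sub pack_node {
    my ($id, $lat, $lon) = @_;
    # sprintf rounds to the nearest integer instead of truncating
    return pack('C N l> l>', 1, $id,
                sprintf('%.0f', $lat * 1e7),
                sprintf('%.0f', $lon * 1e7));
}

sub unpack_node {
    my ($rec) = @_;
    my ($type, $id, $lat, $lon) = unpack('C N l> l>', $rec);
    return ($id, $lat / 1e7, $lon / 1e7);
}

my $rec = pack_node(123456, 51.5074, -0.1278);
my ($id, $lat, $lon) = unpack_node($rec);
printf "%d bytes: id=%d lat=%.7f lon=%.7f\n", length($rec), $id, $lat, $lon;
```

Each node takes 13 bytes in this layout, and a parser can step through the file without scanning for quotes or angle brackets. Timestamps and tags are left out here and would need their own encoding.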
Here are my 2 simple (proof of concept) scripts :)
lizard@lizard-desktop:~/osm$ cat osm2bin.pl
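#!/usr/bin/perl
use strict;
use warnings;

# raw byte handles, so the marker bytes are written unchanged
open(my $ifp, '<:raw', 'planet.osm')  or die "planet.osm: $!";
open(my $ofp, '>:raw', 'planet.bosm') or die "planet.bosm: $!";

# write a header with version info: 'OSM' followed by the bytes 0, 0, 1
print $ofp 'OSM' . chr(0) . chr(0) . chr(1);

while (my $line = <$ifp>)
{
    # replace frequent XML fragments with one-byte markers
    $line =~ s/^ \<node id=/\x01/;
    $line =~ s/ lat=/\x02/;
    $line =~ s/ lon=/\x03/;
    $line =~ s/ timestamp=/\x04/;
    print $ofp $line;
}
close($ofp) or die "planet.bosm: $!";
close($ifp);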
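#!/usr/bin/perl
# reverse direction: expand planet.bosm back into XML for verification
use strict;
use warnings;

open(my $ifp, '<:raw', 'planet.bosm')       or die "planet.bosm: $!";
open(my $ofp, '>:raw', 'planet-verify.osm') or die "planet-verify.osm: $!";

read($ifp, my $buf, 6); ## just ignore fileinfo (the 6 header bytes) :)

while (my $line = <$ifp>)
{
    # expand each one-byte marker back into its XML fragment
    $line =~ s/\x04/ timestamp=/;
    $line =~ s/\x03/ lon=/;
    $line =~ s/\x02/ lat=/;
    $line =~ s/\x01/ \<node id=/;
    print $ofp $line;
}
close($ofp) or die "planet-verify.osm: $!";
close($ifp);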
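To check the round trip without running over the whole planet file, the same four substitutions can be tried on a single line. This is only a small sketch: encode_line and decode_line are names invented here, and the sample line is made up rather than taken from a real planet.osm.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same four substitutions as in the two scripts, wrapped in subs
# (encode_line and decode_line are names made up for this sketch).
sub encode_line {
    my ($line) = @_;
    $line =~ s/^ <node id=/\x01/;
    $line =~ s/ lat=/\x02/;
    $line =~ s/ lon=/\x03/;
    $line =~ s/ timestamp=/\x04/;
    return $line;
}

sub decode_line {
    my ($line) = @_;
    $line =~ s/\x04/ timestamp=/;
    $line =~ s/\x03/ lon=/;
    $line =~ s/\x02/ lat=/;
    $line =~ s/\x01/ <node id=/;
    return $line;
}

# Made-up sample line, not taken from a real planet file.
my $xml = qq{ <node id="1" lat="51.5" lon="-0.1" timestamp="2007-05-09T12:00:00+01:00"/>\n};
my $bin = encode_line($xml);
printf "%d -> %d bytes, round trip %s\n",
    length($xml), length($bin), decode_line($bin) eq $xml ? 'ok' : 'FAILED';
```

On the full files, comparing planet.osm with planet-verify.osm (for example with cmp or diff) should show no difference, as long as the bytes \x01 to \x04 never occur in the input.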
Have fun, and let me know what you think about this :)
On Thu, 2007-05-10 at 15:53 +0100, Nick Hill wrote:
> Hello Shaun
>
> Thank you for the pointers for Mac users and 7-zip.
>
> I have uploaded a copy of the current planet.osm as 7z, where I have further
> increased compression using bigger dictionary etc.
>
> planet files are at:
> http://planet.openstreetmap.org/
>
> The URL for the current planet.osm in 7z format is:
> http://planet.openstreetmap.org/planet-070509.osm.7z
>
> The new file is 183Mb vs Bzip2 235Mb.
>