[OSM-dev] Binary OSM; the first pass encoder

Chris Browet cbro at semperpax.com
Sun Nov 9 16:11:49 GMT 2008


For Merkaartor, I also implemented a binary file format (see
http://wiki.openstreetmap.org/index.php/Osb_spec_v2).

Key points are:
- Lossless (it is editor-oriented, rather than aimed at routing or rendering)
- Avoids duplicating tag keys/values (there are 2 tables of unique strings;
features only hold pointers into them)
- Region- and tile-based, for memory efficiency
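
To illustrate the key/value tables, here is a minimal Python sketch. It is not
Merkaartor's actual code or the on-disk layout from the spec; the names
StringTable, encode_tags and decode_tags are made up for the example, and
plain list indexes stand in for the "pointers".

```python
class StringTable:
    """Stores each distinct string once and hands out integer indexes."""

    def __init__(self):
        self.strings = []   # index -> string
        self._index = {}    # string -> index

    def intern(self, s):
        """Return the index of s, adding it to the table if it is new."""
        idx = self._index.get(s)
        if idx is None:
            idx = len(self.strings)
            self.strings.append(s)
            self._index[s] = idx
        return idx


def encode_tags(tags, keys, values):
    """Replace each key/value string by its index in the shared tables."""
    return [(keys.intern(k), values.intern(v)) for k, v in tags.items()]


def decode_tags(pairs, keys, values):
    """Rebuild the original tag dictionary from index pairs."""
    return {keys.strings[k]: values.strings[v] for k, v in pairs}


keys, values = StringTable(), StringTable()
way_a = encode_tags({"highway": "residential", "name": "Main Street"}, keys, values)
way_b = encode_tags({"highway": "residential"}, keys, values)
```

Both features end up sharing the same index pair for highway=residential, so
the strings themselves are stored only once however many features use them.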

There is currently no provision for updates/diffs, although the region-based
paradigm would allow updating specific regions.

Note also that, internally, Merkaartor stores coordinates as signed int32.
Unless I'm wrong, 40,000 km / 2^32 is about 9.3 mm (call it 10 mm), which
seems good enough for GPS units with a 10 m precision ;-)
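
The arithmetic can be checked with a short Python sketch. The scaling used
here (the full 360 degrees spread over 2^32 steps) is an assumption for the
example, not necessarily the mapping Merkaartor uses, and the function names
are hypothetical.

```python
INT32_STEPS = 2 ** 32               # distinct values in a signed 32-bit int
EARTH_CIRCUMFERENCE_M = 40_000_000  # roughly 40,000 km


def resolution_m():
    """Ground distance covered by one integer step along the equator."""
    return EARTH_CIRCUMFERENCE_M / INT32_STEPS


def deg_to_int32(deg):
    """Map degrees in [-180, 180] to a signed int32 (assumed scaling)."""
    v = round(deg / 360.0 * INT32_STEPS)
    # +180 degrees would land on 2^31, one past the int32 maximum, so clamp.
    return max(-2 ** 31, min(2 ** 31 - 1, v))


def int32_to_deg(v):
    """Inverse of deg_to_int32, up to rounding."""
    return v * 360.0 / INT32_STEPS


lon = 4.8952
stored = deg_to_int32(lon)
error_deg = abs(int32_to_deg(stored) - lon)
```

resolution_m() comes out at roughly 0.0093 m, and the round-trip error stays
within half a step, so the quantisation is far below GPS noise.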

Just my 2 cents.

- Chris -

2008/11/9 Stefan de Konink <stefan at konink.de>

> Marcus Wolschon wrote:
> > Stefan de Konink schrieb:
> >> The two problems I see; - In order to allow updates there should be
> >> some form of 'updatable space'. - If this space is not present it
> >> might be good to have one file that contains *all* strings, another
> >> one that contains the rest of the data, and maybe a final
> >> client-side generated index on both of them.
> > I do this with large-strings(Strings with <32 characters are inline)
> > and separate files for nodes, ways, indexes, ...
> > I thought about storing all strings externally but ended up noting
> > that most tags are not very long.
>
> The point is not that tags are long but that all keys are duplicates :)
>
> > My purpose is not minimal bandwidth for transmission but a good
> > on-disk format, so my metrics are different
> > from yours.
>
> True. Don't forget that my first pass encoding, is like you want it to
> be. Maybe you could skip 'users' and 'timestamps' from the data. This
> would significantly reduce the amount of data.
>
> > No. Attributes that are longer than 32 bytes are stored in an
> > external file.
> > Nearly all attributes fit in here, so most accesses require no
> > additional seek.
>
> Ok :) Sounds like Paradox :) :) :) Good!
>
> >>> I am trying something similar but with fixed length records and
> >>> back-links from node to way to allow updates to be applied to the
> >>> file. 1.4GB-135MB is nice but you still don't want to download
> >>> 135MB every day to have an up-to-date netherlands-file (let alone
> >>> to do this for the planet).
> >> 135MB (gzip XML) -> 78MB (bzip2 bin)
> >>
> >> We are just looking at the possibilities to binary diff the files,
> >> just to allow partial updates. By XORing them on the source present
> >>  at the user.
> >
> > Interesting. It could make a good download-format.
> > I'm looking forward to seeing this happen. :)
> > How do you intend to handle the boundary-rectangles for diffs
> > if a user does not store all the world on e.g. his small nettop?
> > Use binary-files and diffs per country?
>
> Binary files per country sounds the most reasonable thing to do. The
> other problem that the file only will grow bigger, and get fragmentation
> problems, is something else. So we might have to implement a search and
> reorder every 3 months.
>
>
> >>> I was quite occupied with another open-source-project of mine and
> >>>  switching jobs but now I should have the time to finish
> >>> implementing my own proposal in code and test its performance.
> >> :) good luck :) if you want to team up to write the ultimate code,
> >> just send a private mail.
> > I just started implementing the memory-mapped io-code for  my nodes.obm.
> > First I want to get this brainchild of mine going and then  implement
> > a Java-parser
> > for your format for binary-downloads.
> > This could get really fast. :)
>
> At #osm-nl we are discussing the float -> long thing. I used floats
> because it (obviously) allows more precision, but I agree on some points
> mentioned before.
>
>
> Stefan