[osmosis-dev] Announcement of the OSMbin file-format

Wed Feb 4 20:54:10 GMT 2009

Karl Newman wrote:
> On Wed, Feb 4, 2009 at 5:16 AM, <marcus.wolschon at googlemail.com>
> wrote:
>
>     On Thu, 05 Feb 2009 00:02:37 +1100, Brett Henderson
>     <brett at bretth.com> wrote:
>     > Good stuff!
>     >
>     > I'd like to play with the dataset support in particular.  A
>     > persistence mechanism that doesn't rely on a full database would
>     > be very useful.  From your docs on the Osmosis/DetailedUsage page
>     > it looks like you don't have full dataset support yet.  Is that
>     > something you plan to add?
>
>     Yes I do.
>     Random access is already working but streaming access is not
>     implemented yet.
>     I am still thinking about how to create a ReleasableIterator that
>     can return all nodes AND ways AND relations.
>
>     Also, reading a bounding box is still memory-bound, as I need to
>     know all nodes to know which ways and relations to return. And I
>     need to keep track of which ways/relations I have already returned
>     so that I do not return them twice.
>     Any hints on where in osmosis I might look? I guess you have had to
>     do similar bookkeeping in other tasks already.
>
>
> Look at the IdTracker interface (core.filter.common) and see how it's 
> used in the AreaFilter task (core.filter.v0_6).
Another class to check out is 
org.openstreetmap.osmosis.core.customdb.v0_6.impl.DatasetStoreReader 
which provides the dataset implementation for my customdb storage.  It 
inherits much of its functionality from 
org.openstreetmap.osmosis.core.filter.v0_6.impl.BaseDatasetReader (I 
don't know why I put it in that package ...).  As Karl suggests it uses 
IdTracker instances to keep track of all the ids in the bbox.  The 
customdb storage uses a quadtile mechanism for identifying nodes, and 
either quadtiles or node_way relations to identify ways.  As far as I 
recall, that side worked reasonably well.  The real performance problems with 
it came when trying to stream out the data associated with the ids.  
Pulling out the data associated with each id required a lot of disk 
seeking which never performed well.  I'm hoping that the osmbin 
implementation will work better there.
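
To make the bookkeeping concrete, here is a rough sketch of the idea: 
record the node ids inside the bbox, then return each way that 
references one of them, at most once.  Note that SimpleIdTracker and 
BoundingBoxSketch are hypothetical names made up for this example; they 
are not the Osmosis IdTracker interface or BaseDatasetReader, and the 
long[] way layout (way id first, node refs after) is just an assumption 
to keep it short.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical stand-in for an id tracker; NOT the Osmosis IdTracker interface. */
class SimpleIdTracker {
    private final BitSet ids = new BitSet();

    /** Marks an id as seen. Sketch only: assumes small, non-negative ids. */
    void set(long id) {
        ids.set(Math.toIntExact(id));
    }

    /** Returns true if the id has been marked before. */
    boolean get(long id) {
        return id >= 0 && id <= Integer.MAX_VALUE && ids.get((int) id);
    }
}

class BoundingBoxSketch {
    /**
     * Returns the ids of ways that reference at least one tracked node.
     * Each way is assumed to be {wayId, nodeRef, nodeRef, ...}.
     */
    static List<Long> selectWays(SimpleIdTracker nodesInBox, long[][] ways) {
        SimpleIdTracker returnedWays = new SimpleIdTracker();
        List<Long> result = new ArrayList<>();
        for (long[] way : ways) {
            long wayId = way[0];
            if (returnedWays.get(wayId)) {
                continue; // already returned, do not return it twice
            }
            for (int i = 1; i < way.length; i++) {
                if (nodesInBox.get(way[i])) {
                    returnedWays.set(wayId);
                    result.add(wayId);
                    break; // one matching node is enough to select the way
                }
            }
        }
        return result;
    }
}
```

A bit set keeps the memory cost at roughly one bit per possible id, 
which is why this kind of tracking stays manageable even for large 
extracts; it does nothing about the disk seeking needed to fetch the 
entities afterwards.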

Brett