[OSM-dev] Relations without members?

Brett Henderson brett at bretth.com
Fri Oct 12 08:13:01 BST 2007

The Osmosis task (at least the working 0.4 version :-) ) was originally 
written to support reporting on users editing within an area.  As a 
result I didn't take much care over how I mangled ways at the edges; I 
went for the simplest approach that maintained referential integrity 
while still supporting fast streaming.  However, if we're looking to do 
something smarter it will need to be revisited.
I think I've fixed the 0.5 version but have done zero testing, so it may 
not work yet.  It will work in a similar manner to 0.4, maintaining 
referential integrity.  It will have some issues with relations though, 
potentially dropping required data, and could do with some more work.

As for your new requirements, could you define what you would like done 
with each entity type at the area edges?  (I say area because this 
applies to more general polygon extraction also.)  We can then figure 
out a way of building a tool to support them.

Perhaps one way of supporting problem ways is to keep the stray nodes 
when only one or two nodes in a row fall outside the box, and for longer 
excursions to keep just the first and last stray node, dropping all the 
nodes in between.  This would make the ways run off and back onto the 
map along the correct line.  I'm not sure how that affects any routing 
algorithms.
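
A rough Java sketch of that trimming idea, untested against Osmosis 
itself.  Node and Box here are made-up stand-ins rather than the Osmosis 
classes, and I'm assuming the way has already been selected because at 
least one of its nodes is inside the box.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: trims the outside portions of a way as described above. */
final class WayEdgeTrimmer {

    /** Hypothetical minimal node (not the Osmosis Node class). */
    static final class Node {
        final long id;
        final double lat;
        final double lon;

        Node(long id, double lat, double lon) {
            this.id = id;
            this.lat = lat;
            this.lon = lon;
        }
    }

    /** Hypothetical bounding box in degrees. */
    static final class Box {
        final double left, bottom, right, top;

        Box(double left, double bottom, double right, double top) {
            this.left = left;
            this.bottom = bottom;
            this.right = right;
            this.top = top;
        }

        boolean contains(Node n) {
            return n.lon >= left && n.lon <= right
                && n.lat >= bottom && n.lat <= top;
        }
    }

    /**
     * Keeps every node inside the box.  For each consecutive run of
     * outside nodes, keeps only the first and last node of the run, so a
     * run of one or two stray nodes survives unchanged.
     */
    static List<Node> trim(List<Node> way, Box box) {
        List<Node> result = new ArrayList<>();
        int i = 0;
        while (i < way.size()) {
            Node current = way.get(i);
            if (box.contains(current)) {
                result.add(current);
                i++;
                continue;
            }
            // Find the index of the last node in this outside run.
            int runEnd = i;
            while (runEnd + 1 < way.size() && !box.contains(way.get(runEnd + 1))) {
                runEnd++;
            }
            result.add(way.get(i));           // first stray node
            if (runEnd > i) {
                result.add(way.get(runEnd));  // last stray node
            }
            i = runEnd + 1;
        }
        return result;
    }
}
```

This works in a single pass over the way's node list, but it needs the 
node coordinates to hand when the way is processed, which is exactly the 
lookup problem for streaming.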

As for supporting streaming, that is more difficult.  Classes such as 
SimpleObjectStore will help; they are what I used for file-based merge 
sorting of planet-sized files, where I need to revisit objects many 
times during a sort.  SimpleObjectStore is very simple and is not 
indexed in any way, but other classes such as ChunkedObjectStore go some 
way towards solving the problem.  A fully indexed object store with 
random lookups will require some more work though.  It may even be worth 
thinking about a proper database like Berkeley DB for this, although 
that is not necessary if you're prepared to get your hands dirty and 
roll your own.  Another problem you'll face when doing this is speed.  
My current object stores use Java object serialisation, which is very 
slow.  I'd like to implement a custom serialiser and deserialiser for 
OSM data types, which should provide huge performance speedups.
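
To illustrate the custom serialiser idea, here is a minimal sketch that 
writes a node's fields directly with DataOutputStream instead of going 
through ObjectOutputStream.  The Node class and the NodeCodec name are 
invented for the example and are not the existing object store code; the 
field layout (id, lat, lon, then tag pairs) is an assumption too.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch only: hand-rolled binary encoding for a hypothetical node type. */
final class NodeCodec {

    /** Hypothetical node with tags (not the Osmosis Node class). */
    static final class Node {
        final long id;
        final double lat;
        final double lon;
        final Map<String, String> tags;

        Node(long id, double lat, double lon, Map<String, String> tags) {
            this.id = id;
            this.lat = lat;
            this.lon = lon;
            this.tags = tags;
        }
    }

    /** Writes fixed-width fields first, then a tag count and the tag pairs. */
    static void write(Node node, DataOutputStream out) throws IOException {
        out.writeLong(node.id);
        out.writeDouble(node.lat);
        out.writeDouble(node.lon);
        out.writeInt(node.tags.size());
        for (Map.Entry<String, String> tag : node.tags.entrySet()) {
            out.writeUTF(tag.getKey());
            out.writeUTF(tag.getValue());
        }
    }

    /** Reads the fields back in exactly the order write() produced them. */
    static Node read(DataInputStream in) throws IOException {
        long id = in.readLong();
        double lat = in.readDouble();
        double lon = in.readDouble();
        int tagCount = in.readInt();
        Map<String, String> tags = new LinkedHashMap<>();
        for (int i = 0; i < tagCount; i++) {
            String key = in.readUTF();
            tags.put(key, in.readUTF());
        }
        return new Node(id, lat, lon, tags);
    }

    /** Convenience wrapper for in-memory use. */
    static byte[] toBytes(Node node) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            write(node, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Convenience wrapper for in-memory use. */
    static Node fromBytes(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return read(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Unlike default Java serialisation, this writes no class descriptors, 
only the field values, so each record should be smaller and cheaper to 
read back.  I haven't measured it, and note that writeUTF limits each 
string to 65535 encoded bytes.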

Karl Newman wrote:
> On 10/11/07, Frederik Ramm <frederik at remote.org> wrote:
> <snip>
>>> Ways are selected if one or more of its nodes are in the bitset.  The
>>> way is modified to only include nodes that are inside the area.  The
>>> selected way ids are added to a bitset.
>> This is something that the extract-polygon script cannot do - it
>> either gives you the full way or no way. I would actually like to
>> implement that functionality, plus an extra hardcore function that
>> will not only drop some nodes off the way, but insert one border node
>> exactly where the way intersects with the bounding box... ;-)
> <snip>
> I'll throw my thoughts in, too.
> This is an issue I've been investigating actively the last few days.
> I'm looking into creating a program to create auto-routing maps for
> Garmin GPSrs. One of the problems that will need to be solved is a way
> to slice up the planet into sections which will stitch back together
> when they're back on the device. The idea that I had was to use
> bounding boxes, and have a rule such as: ways that extend beyond the
> top or left boundary will retain one node past the top or left
> boundary, and ways that extend beyond the bottom or right boundary
> will not retain any nodes past the boundary. When the sections are
> reunited after conversion to Garmin map format, the ways will then
> link up when adjacent sections are loaded. Frederik, your "synthetic
> node" approach (creating new nodes on the boundary) would work, too,
> but I'm afraid it might create a node too close to another node, which
> could cause a routing error. Plus, how do we choose the id's to give
> those nodes (if we care to use the output for another tool which
> expects normal osm data)?
> Another thing to consider is if a way ventures outside the boundary
> and then back inside the boundary, the way will have to be split into
> at least 2 parts in order to cut out the "outside" portion. This has
> the same id number problem as the synthetic nodes above.
> The problem with the current Osmosis bounding box task (aside from the
> fact that it doesn't work correctly on 0.5) is that it is a strictly
> inclusive filter--any nodes which don't pass the boundary test are
> discarded, then the ways are reconstructed using only nodes inside the
> boundary. This distorts the way data by creating "shortcuts" between
> nodes. Maybe Osmosis needs another bounding box task that will create
> complete ways (similar to what the API returns). I understand that
> will require another pass through the nodes, and then through the
> relations again, so it may be difficult to implement in a "streamy"
> manner. Although, it seems that the required bits are there
> (SimpleObjectStore) so it might not be too bad. If I end up using the
> Osmosis architecture for the basis of my auto-routing GPS map
> generator, I'll have to make at least 2 passes through the data too.
> Karl Newman
> P.S. I have created a Garmin GPS auto-routing map XSLT template based
> on the osmgarminmap template (it creates .mp files suitable for input
> into cGPSMapper). It works great for small OSM files (i.e., 200k) but
> it fails miserably for even 12MB of data--xsltproc just stalls for
> hours. XSL is the wrong tool for this job, anyway, but if anyone wants
> to check it out, the files are at
> http://www.migratingcoconuts.com/pub/osmgarminmap/
