[Routing] Tools for getting from OSM to a routable network

Tue Sep 23 00:23:39 UTC 2014

Thanks Nic and Christian,

On 22/09/2014, at 8:24 pm, Nic Roets <nroets at gmail.com> wrote:

> I agree with Christian that nodes should ordered with the 2D Hilbert curve (In Gosmore I made the mistake of using a hybrid between Hilbert curve and hashing). I think the improvement is much more than factor 3 when you're routing from New York to LA. But it's certainly dependent on the implementation details.

I’ll definitely have a look an Hilbert curves then.  Christian mentioned them being a bit difficult to implement, but from last time I looked at them (some time ago) and a quick look at Wikipedia, the transform doesn’t seem too tricky.  Am I missing something?  Does special care have to be taken in non-square regions?

> Obviously you should order the nodes according to the Hilbert curve. But you should also order the arcs according to the Hilbert curve (the "start_node" field that was optimized away). Then the last_arc of a node will be the same as first_arc-1 of the next node.

The arcs are ordered that way for the current node ordering - I didn’t make that very clear in the last email - and so would follow an improved node ordering.

> In fact, if you put special delimiting entries in your arc array, you can get rid of your node array. Then your nodes are subsequences of the arc array.

That’s a nice improvement and simplification.  Thanks.  However, as you note later, you lose a compact index of nodes (and as you note, you can work around that!)

> That's what I did in Gosmore. But I kept the geometry and calculate the costs on the fly. So my arc array contains almost everything:
> 
> struct doubleArc { // The actual structure has a different name and the fields are all int32_t
>  int32_t lat, lon;
>  doubleArc *forward_end_node, *backward_end_node;
>  wayType *wayDetails;
> }

Any reason for calculating costs on the fly?  I guess you’re letting users customise their vehicle speed profiles?

I have int32s for what I guess is the same reason: I mmap the data, and I don’t want to require that everything be mapped to the same address every time, so I store offsets from a base address.  I do have the arc geometry available, but in a separate array in the same order.

What, though, are your nodes?  Some ways are longer than a single arc, so I’m guessing (but not assuming) that you split ways at each intersection.  Likewise, some ways end in the middle of what I think of as an arc (for example if there’s a speed limit change), and I stitch these together.  Do you also, or do you accept nodes that have only 2 incident ways?  (I know that “only 2 incident ways” becomes ambiguous if all your arcs are one-way, but I hope my meaning is clear)

> Also note that there a binary(ish) search algorithm to find all the nodes in a given bbox. So there is no need for quad trees.

That’s handy, though I have a kdtree implementation that’s adequate [1].  It needs a bit of work, but does the job.

> To implement Dijkstra, I need some temporary work space. Specifically I need 2 structs for every doubleArc (resembling the states of arriving at the node using either of the arcs), which is allocated as an array. This array is indexed using a direct mapping, so it's entries are also in Hilbert curve order. The virtual size of this array is several GB, but it's an mmap() of /dev/zero, so it's physical size is typically much smaller.

Especially if you’re using the Hilbert curve!  I do much the same - I allocate one array of int32s for node labels, and a second array to maintain predecessor information.  When you’re generating a distance matrix, you don’t care much about extracting the paths, so I don’t bother recording predecessors.  I haven’t profiled that, so I’m not sure it’s worthwhile (I know, profiling should have come first).

> My heaps are arrays of pointers. My experiments showed that the heap stays quite small for long route. It seldom exceeds 200 entries.

Thanks, that’s handy to know.  Again, some time ago I did some measuring of the sizes of heaps while generating routes, but I can’t find the notes on what I learned…

Cheers,
Geoff

[1]  https://github.com/geoffleyland/lua-kdtree