[Routing] Tools for getting from OSM to a routable network

Sun Sep 21 22:16:02 UTC 2014

On 21/09/2014, at 10:58 pm, Nic Roets <nroets at gmail.com> wrote:

> You realize that "various transformations" mean that the routable network looks much different from the original OSM data. That make live updating even harder.

Yes, there’s definitely a trade-off there, and some of the transforms I do (removing duplicate points and Douglas-Peucker simplification for example) are probably pointless - they don’t save much space, and they’re probably better managed by verification than by transformations.

But I can’t see a way around having to, at some stage, get from a “vector picture” representation of the road network to some kind of node-arc representation.  You can do that work in advance, or, as I suspect you’re advocating, in the routing algorithm.

That said, it wasn’t me who started with the live updating idea, I think it’s a very interesting challenge, but not actually the one I have in front of me.

> Few of them realize that they need to be mindful of cache misses. Accessing data in L1 cache is 60 times faster than accessing main memory. Accessing main memory sequentially is 30 times faster than accessing it in random order.
> http://www.elitebastards.com/index.php?option=com_content&id=705&Itemid=27&limitstart=3
> 
> Gosmore avoids "various transformations" and achieves perfectly adequate performance by reducing cache misses. But there is certainly room for big improvements.

Can you expand on where you’ve worked on cache misses?

I store arcs and nodes fairly compactly, and without geometry.  Arcs are a large array of:

struct maps_centrelines_arc
{
  int32_t end_node;
  int32_t length_m, time_s;
};

ordered so that nodes can be:

struct maps_centrelines_node
{
  int32_t first_arc;
  int32_t last_arc;
};

So I just store the forward star and no node locations (no two-way searches or A-star here, sorry).  I could do a little better and only store one cost.  I’m not aware of a good way of ordering the nodes (some kind of space-filling algorithm perhaps?)  I order them so that the intersection arcs are near the feeder road arc, but beyond that it’s just sorting by some hand-wavy criteria so that there’s some diff stability.

With simple intersections, this is compact enough that a node-arc representation of NZ fits into 5.8MB (might squeeze into *L3* cache if there’s nothing else going on :-)).
Unfortunately, once I do one of those annoying transformations and expand all the intersections it takes up more like 25MB.  My intersection expander could probably be a bit more conservative, but I’m not sure that would save much.

Next is i-cache.  Thanks to the relatively straightforward data layout, Dijkstra is also straightforward.  I haven’t measured, but I’d bet the compiled code is fairly small…

…except that I’ve done no work whatsoever on the heap.  It’s a GC-allocated binary heap of non-contiguous objects.  I bet there’s improvements to be made there.