[OSM-dev] Simplifying data for routing
Nolan Darilek
nolan at thewordnerd.info
Sat Jul 28 15:55:23 BST 2007
Hopefully this doesn't turn into a spin-off of the "dropping
segments" discussion. :) I'm trying to work with the data model as it
is now, and will adapt to changes if and when they happen.
Since writing my last post, I've tweaked my GPS navigation app to use
SQL geometric data types. I'm now storing everything as POIs (single
points not connected to segments and not tagged as a road), paths
(all segments on a way combined into a single linestring) and areas
(which I don't support yet, but which are just shorthand for closed
ways, so I can quickly determine which areas a given point falls
inside.)
I've been reading about shortest path algorithms, and as they all
seem to work on graphs, I've been thinking about the graph of
unprocessed OSM data and wondering how much post-processing I need to
do to make routing possible. I'm better at grokking concepts than
code, which is why I've not yet looked at the recently-announced JOSM
navigator plugin.
Since I'm dropping most of the OSM nodes, I'm thinking that the
vertices on the graph should be intersections. Part of my current
post-processing involves a brute-force check of each newly saved
path against all existing paths for intersections. I then create an
intersection feature as a point, linking to all the "branches." As
such, I can now find
and describe nearby intersections in terms of how many "ways" they
have and render that description as something like "5-way
intersection: E 51st St. with Duval St. and Guadalupe St."
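The brute-force pass boils down to testing each segment of the new
path against each segment of every other path. A minimal sketch of
the per-segment test (names are mine, not anything in my schema;
this only detects proper crossings, not shared endpoints or
collinear overlap):

```ruby
# Sketch: do two line segments properly cross? Each Seg holds the
# endpoints (ax, ay)-(bx, by) of one segment of a path.
Seg = Struct.new(:ax, :ay, :bx, :by)

# Signed cross product: which side of the line (ox,oy)->(ax,ay)
# does (bx,by) lie on?
def cross(ox, oy, ax, ay, bx, by)
  (ax - ox) * (by - oy) - (ay - oy) * (bx - ox)
end

def segments_intersect?(s, t)
  d1 = cross(t.ax, t.ay, t.bx, t.by, s.ax, s.ay)
  d2 = cross(t.ax, t.ay, t.bx, t.by, s.bx, s.by)
  d3 = cross(s.ax, s.ay, s.bx, s.by, t.ax, t.ay)
  d4 = cross(s.ax, s.ay, s.bx, s.by, t.bx, t.by)
  # Each segment's endpoints straddle the other segment's line.
  (d1 * d2 < 0) && (d3 * d4 < 0)
end

a = Seg.new(0.0, 0.0, 4.0, 4.0)
b = Seg.new(0.0, 4.0, 4.0, 0.0)
segments_intersect?(a, b)  # => true
```

Each crossing found this way becomes an intersection point linking
its branches.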
I have several questions. First, is using intersections for the
vertices of the routing graph the right way (no pun intended) to do
it? I thought about using nodes, but if I understand the model,
there's no reason to expect nodes to be placed at intersections.
Second, I'm still trying to grok the A* and Dijkstra shortest-path
algorithms. Several routing apps I've checked out seem to prefer
Dijkstra, but is there any reason for picking it over A*? My
understanding is that A* stops investigating paths that aren't
promising, so it wouldn't move toward a point to the north by going
south. What am I missing? Is A* impractical for some reason?
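As I understand it, Dijkstra is just A* with a heuristic of zero, so
the two can share one implementation. Here's a toy sketch the way I
picture it (the graph, weights, and vertex names are made up; a real
heuristic would be straight-line distance between intersections):

```ruby
# Sketch of A*: `graph` maps each vertex to { neighbor => edge
# weight }, `h` is the heuristic (admissible estimate of remaining
# cost). With h = 0 everywhere, this is exactly Dijkstra.
def a_star(graph, start, goal, h)
  g = { start => 0.0 }   # best known cost from start to each vertex
  open_set = [start]
  closed = {}
  until open_set.empty?
    # Expand the vertex with the lowest f = g + h.
    current = open_set.min_by { |v| g[v] + h.call(v) }
    return g[current] if current == goal
    open_set.delete(current)
    closed[current] = true
    (graph[current] || {}).each do |n, w|
      next if closed[n]
      ng = g[current] + w
      if g[n].nil? || ng < g[n]
        g[n] = ng
        open_set << n unless open_set.include?(n)
      end
    end
  end
  nil  # goal unreachable
end

graph = {
  "A" => { "B" => 1.0, "C" => 4.0 },
  "B" => { "C" => 2.0, "D" => 5.0 },
  "C" => { "D" => 1.0 },
  "D" => {}
}
a_star(graph, "A", "D", ->(_v) { 0.0 })  # => 4.0 (A -> B -> C -> D)
```

A good heuristic just means fewer vertices get expanded before the
goal is reached; the returned cost is the same.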
Third, I'm not quite sure how costly these algorithms are, and
whether it would be better to calculate intersections on the fly or
do so ahead of time. Ahead of time seems to make the import take much
longer (the TIGER data for Travis County, just over 50 megs, takes
about an hour for me to import sans intersection calculation and
clocks in at just over 20 megs in the database, while intersection
calculation adds another 10 megs and 2 hours.)
Finally, can anyone suggest any database optimizations I might make
to help PostgreSQL cope with a bunch of inserts? I'm using Ruby/
ActiveRecord, so there really isn't a way to dump a bunch of SQL
commands and do an import that way. I'd actually like to pre-
calculate the intersection data as it makes finding intersections
much easier and more syntactically beautiful
(Intersection.find(:all, :conditions =>
"distance(geometry, '#{self.geometry.as_wkt}') < 0.0003") vs. a
larger function finding all nearby paths, checking intersection
status, creating an intersection geometry, then checking again to
handle intersections with more than 4 ways.) I just don't
want users spending three hours importing data first, and since my
initial target was a high-end embedded system (though I'm starting to
think I'll need a tiny subnotebook now) smaller storage requirements
are key. It seems like I should be able to optimize *something* out
of PostgreSQL for this type of environment since it's only being used
by one user and for a specific domain, but I can't find what. Or
would MySQL be a better fit for a lower-spec environment? It'd
definitely simplify installation, not needing to install postgis, but
I couldn't get 5.0 to support the distance function. Is it that
"distance" is under a different name, or is it truly not supported?
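One thing I plan to try is wrapping the whole batch of inserts in a
single transaction, since by default ActiveRecord commits (and the
database syncs to disk) once per save. Something like this, where
`paths` is just a placeholder for whatever collection is being
imported:

```ruby
# Sketch only: batch all inserts into one commit instead of one
# commit per row. `paths` stands in for the records being imported.
ActiveRecord::Base.transaction do
  paths.each { |p| p.save! }
end
```

On the server side, raising checkpoint_segments in postgresql.conf
(and, for a throwaway bulk load only, temporarily setting fsync =
off) should also speed up a big import, but I'd be curious whether
anyone has measured it.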
Thanks a bunch.