[Strategic] Routing
Frederik Ramm
frederik.ramm at geofabrik.de
Sun Mar 6 13:18:12 GMT 2011
Strategic,
after having read the minutes of the latest IRC discussion, I feel
that a few basic points about routing have to be said as not everyone
seems to be clear about them.
1. What do we mean by "doing routing"?
--------------------------------------
* Some people read this as "we run a routing engine on OSMF infrastructure";
* some people read this as "we offer a 'find directions' service on the
www.openstreetmap.org web site".
It is important to note that these two are independent of each other; we
have the following options:
* neither operate a routing engine nor offer directions on the web site
(instead possibly place a link on the web site that says "go to this
excellent MapQuest site if you want directions")
* offer directions on the web site but not operate a routing engine
(e.g. by using MapQuest API, or even allowing the user to choose from
different external offers)
* operate a routing engine but not offer directions on the web site
(e.g. because we view the routing engine as an internal service we run
for our mappers so that they may find bugs better)
* offer directions on the web site *and* operate a routing engine
(likely the web site directions would then be powered by that engine,
but it does not have to be so).
2. Reasons for and against
--------------------------
The following are some obvious reasons for and against each of the
options listed above:
* Directions on web site
- for: makes web site more attractive; shows third parties that OSM is
more than just maps
- against: unclear if we want to be in the business of building cool
end-user facing applications; on its own offers little value-add for
mappers; need to invest work to integrate; danger of attracting
low-quality bug reports
* Operating own routing server
- for: allows the OSM community to define the rules used to build the
routing graph; allows different uses than just route from A to B
- against: needs work to set up and maintain, needs server
infrastructure that costs real money
I am sure there are more reasons than these.
3. Routing for quality improvement
----------------------------------
I have the impression that few members of strategic understand what is
meant by the concept of "having a routing engine could help us improve
quality".
I can think of at least three things here.
a) Just by "playing" around with a trivial routing interface like the
one I set up on routingdemo.geofabrik.de (only fastest automobile
routes; no turn restrictions; no textual result descriptions) people can
and will find and fix errors in the map. To prove this point, see
yesterday's postings on the talk-ca mailing list after I had enabled
routing for Canada on that web site. That simple and playful process
already finds wrong oneway bits, connectivity problems and so on.
We do not need to operate our own routing server to have these benefits,
but it certainly helps; to my knowledge, none of the existing servers
that offer world-wide routing based on OSM data is open source (so we'll
never know how the router computes what it does, and can only guess what
idiosyncrasies in our data lead to a certain result), and since we don't
operate them we're some degrees removed from when and how often data
gets updated etc. In the concrete Canada example, by contrast, I could
simply run a manual data update after some problems had been fixed so
that mappers could see the results of their work quickly.
b) A very important bit of any routing engine is the extraction of the
routing graph from OSM data, i.e. the interpretation of OSM data for
routing. This starts with simple questions like "how fast can we assume
to travel on a motorway in Bolivia", but includes also grey areas like
"can the routing engine instruct a motor vehicle to make a 165 degree
turn", or "can the routing engine assume that a pedestrian can cross a
road of type <X> at will", and "what kind of instructions need to be
issued for an intersection with lots of little connecting lanes".
At the moment, everyone who writes a routing engine has to think about
these cases and write code for them, and everyone does it differently.
This means that there are no clear rules on how to interpret OSM data,
and in consequence mappers are not clear about what a router will do if
they map something in a certain way. We can assume (or hope?) that if we
have a routing engine that we control, then that will aid in
quasi-standards forming, just like the common map that we have aids in
forming tagging standards. People aren't expected to "tag for the
router" just as they don't "tag for the renderer" now, but still the
existence of a project-wide routing engine that interprets our data in a
way that can be influenced by the community would probably help a lot in
shaping up our data for routing *and* developing a basic standard for
interpreting it.
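To make the kind of interpretation rules discussed in b) concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the speed table, the 150 degree turn limit, and the function names are hypothetical and are not taken from Gosmore, the Uni Karlsruhe code, or any agreed OSM convention. The `highway` and `maxspeed` keys are ordinary OSM tags; only plain numeric `maxspeed` values are handled.

```python
import math

# Hypothetical default speeds in km/h per highway type. These values are
# placeholders for illustration, not agreed OSM defaults.
DEFAULT_SPEED_KMH = {"motorway": 110, "primary": 80, "residential": 30}

# Hypothetical rule: a motor vehicle is never instructed to change its
# heading by more than this many degrees at a single node.
MAX_TURN_DEG = 150.0


def edge_speed_kmh(tags):
    """Return the assumed travel speed for a way, or None if it is not routable."""
    maxspeed = tags.get("maxspeed", "")
    if maxspeed.isdigit():  # an explicit numeric tag overrides the default
        return float(maxspeed)
    return DEFAULT_SPEED_KMH.get(tags.get("highway"))


def turn_angle_deg(a, b, c):
    """Change of heading in degrees when travelling a -> b -> c (planar x, y)."""
    heading_in = math.atan2(b[1] - a[1], b[0] - a[0])
    heading_out = math.atan2(c[1] - b[1], c[0] - b[0])
    diff = math.degrees(heading_out - heading_in)
    # Normalise to the range 0..180 so left and right turns compare equally.
    return abs((diff + 180.0) % 360.0 - 180.0)


def turn_allowed(a, b, c):
    """Apply the hypothetical turn limit to the manoeuvre a -> b -> c."""
    return turn_angle_deg(a, b, c) <= MAX_TURN_DEG
```

With these made-up numbers, a 165 degree turn would be rejected and a 90 degree turn accepted. The point is only that each such threshold is a decision somebody has to make, and a shared engine would make that decision visible to mappers.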
c) I can also think of a number of automated quality checks run on top
of a routing engine (for example creating distance matrices). These
could use a standard API (and as such also use a service operated by
someone else), but having our own routing engine would allow us to make
bulk queries more efficient. For example, the routing engine on
routingdemo.geofabrik.de takes less than a millisecond to compute a
route, but the result takes more than 100 milliseconds to arrive at the
client through the API (because of the creation and parsing of XML
messages, gzip compression, and network latency). So if you have the
option of accessing the routing engine directly (which we could at least
in theory provide were we running our own), such things can be done much
more efficiently.
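As an illustration of the distance-matrix idea in c), the following Python sketch computes a matrix over a toy directed graph and reports pairs of places that can be routed in one direction only, which is a typical symptom of a wrong oneway tag or a connectivity gap. The graph, the place names, and the function names are all hypothetical, and plain Dijkstra stands in for whatever algorithm a real engine would use.

```python
import heapq


def shortest_costs(graph, source):
    """Dijkstra: cost from source to every node reachable from it."""
    costs = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > costs.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for neighbour, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < costs.get(neighbour, float("inf")):
                costs[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour))
    return costs


def distance_matrix(graph, places):
    """Map (origin, destination) to a cost, or to None where no route exists."""
    matrix = {}
    for origin in places:
        costs = shortest_costs(graph, origin)
        for destination in places:
            matrix[(origin, destination)] = costs.get(destination)
    return matrix


def one_way_traps(matrix):
    """Pairs that are routable from a to b but not from b back to a."""
    return sorted(
        (a, b)
        for (a, b), cost in matrix.items()
        if cost is not None and matrix.get((b, a)) is None
    )


# Toy data: A and B are connected both ways, but C can only be entered.
toy_graph = {"A": {"B": 1.0}, "B": {"A": 1.0, "C": 2.0}, "C": {}}
suspects = one_way_traps(distance_matrix(toy_graph, ["A", "B", "C"]))
```

Here `suspects` holds the pairs `("A", "C")` and `("B", "C")`, pointing a mapper at the ways leading into C. On cost: taking the figures above at face value, a matrix over 1,000 places queried pair by pair means 1,000,000 requests, which is roughly 28 hours at 100 milliseconds each through the API versus under 17 minutes at one millisecond each with direct access.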
Personally I find these points pretty convincing and reason enough to
want a project-wide proper routing server but then again it is not
something that OSMF necessarily has to do - it could also be done for us
by someone else, a sponsor perhaps.
4. Choice of routing engine
---------------------------
If we are after directions on our web site and do not want to run our
own routing engine then we could simply add any and all routing engines
to our web site; no need for us to spend time choosing a specific one.
If we want to run our own server then, as far as I can see, there are
currently only two Open Source products that can be run on the whole
planet. One is Nic Roets' Gosmore engine (PD license), and the other is
the Contraction Hierarchies implementation from Karlsruhe University (AGPL
license).
Nic Roets is a bright guy and a genius programmer. He created Gosmore at
a time when few of us thought routing on OSM was even possible, and he
deserves credit for that. The Uni Karlsruhe algorithm has won
competitions, is actively maintained by staff, and is the subject of a
number of academic papers. Although both are Open Source, neither has
until now attracted much contribution from the OSM community. I'll try
to spare you the technical details but because of the algorithm used,
Gosmore can offer more flexibility (fastest/shortest/bike/pedestrian/HGV
etc) than the CH implementation; on the other hand the CH implementation
is very fast (can do something like 1000 queries a minute) and Gosmore
is slower by _at least_ an order of magnitude. Both algorithms also
require preprocessing the data, a step that takes time and resources.
For any sort of mass-processing, my money would be on the Uni Karlsruhe
software because it is so much faster. (It is also the software on
routingdemo.geofabrik.de, and an earlier version of the same code has
been used by Cloudmade when they launched their routing.) I also believe
that if we were to offer directions on the web site, Gosmore alone
wouldn't be fast enough to handle our number of visitors. However we
might choose to run a multi-backend routing where we have one user
interface (included in the rails port) that can talk to different
backends, e.g. it uses the fast Uni Karlsruhe server as long as you just
want the fastest route from A to B, and switches to Gosmore when you ask
for the scenic wheelchair-suitable route or so.
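The multi-backend idea can be sketched as a small dispatcher. This is a minimal illustration in Python under stated assumptions: the request fields, the backend labels `"ch"` and `"gosmore"`, and the rule that the fast backend only handles fastest car routes are all hypothetical, and no real API of either engine is used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RouteRequest:
    """A hypothetical request as the shared user interface might build it."""
    start: tuple
    end: tuple
    profile: str = "car"       # e.g. "car", "bike", "wheelchair"
    optimise: str = "fastest"  # e.g. "fastest", "shortest", "scenic"


# Assumption: the fast backend was preprocessed for exactly this combination.
FAST_BACKEND_SUPPORTS = {("car", "fastest")}


def choose_backend(request):
    """Return the label of the backend that should answer this request."""
    if (request.profile, request.optimise) in FAST_BACKEND_SUPPORTS:
        return "ch"       # fast, but limited to what was preprocessed
    return "gosmore"      # slower, but flexible in profile and metric
```

A plain A to B request would go to the fast backend, while `RouteRequest((0, 0), (1, 1), profile="wheelchair", optimise="scenic")` would fall through to the flexible one. Keeping the rule in one table means that adding another preprocessed profile later only changes `FAST_BACKEND_SUPPORTS`.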
Any such decision would however require a thorough analysis of
infrastructure required, to answer questions like "how many servers of
what configuration do we need", "how often could we update the data",
"how many queries can we process". Running two routing engines would not
necessarily double the amount of hardware required because preprocessing
for both could be done alternately on the same machine.
5. Offering an API
------------------
Should we run our own routing server, we can choose to offer a public
API or not. Not offering an API at all would reduce the positive effects
we get from our community members using the server in "new and
unexpected ways"; offering a free-for-all API would probably attract
lots of freeloaders who code it into their iPhone apps and so on which
we don't want either. We could probably find technical measures to make
the API usable only for people with an OSM account, or simply write a
policy that clearly asks people to use MapQuest etc. for anything not
OSM-related.
My own position in all this is a slight "I think we should have our own
routing server" and I'm ambivalent on whether or not we should put it on
the web site. But I can see the reasons against (not a core service,
costs money). The one thing I would have difficulty understanding is if
we went for routing on the web site but no engine of our own - that
would be all show and no substance in my eyes.
Bye
Frederik
--
Frederik Ramm www.geofabrik.de
Geofabrik GmbH Handelsregister: HRB Mannheim 703657
Scheffelstr. 17a Geschaeftsfuehrung: Frederik Ramm
76135 Karlsruhe Tel: 0721-1803560-0
ramm at geofabrik.de Fax: 0721-1803560-9