[Talk-us] Proposal: Sunset ref=* on ways in favor of relations

Kevin Kenny kkenny2 at nycap.rr.com
Thu Nov 19 14:51:58 UTC 2015

On 11/19/2015 07:13 AM, Paul Johnson wrote:
> Obviously there needs to be a sunrise for consumers to catch up, but 
> eventually the dinosaur needs to be killed.  There's some places where 
> the way has a ref unique to the route, as noted on the ground, that 
> currently isn't easy to map thanks to the associated tag already being 
> used to identify the ref of an entirely different entity (the route 
> relation).  I'm also suggesting that the quicker we do this, the more 
> painless it'll be.
We should probably bear in mind how we got here. There's a natural 
tension between the different uses of refs, and it should be made 
explicit.

Those who gather data in the field map what they see. I get on the 
freeway near home, and I see signs "Interstate 890 East/NY 7 West." 
(Newcomers get confused all the time about the fact that "east" on the 
one route is "west" on the concurrent one.) At that strictly local view, 
"ref" is a cluster of values -  it's the multiple things that are on the 
signs. That's what a nearsighted mapper sees. In this view, the topology 
of the highway grid is an emergent phenomenon.

Similarly, to the renderer, the question to be asked is, "what shield or 
cluster of shields shall I put on this way?" That's again a purely 
local, nearsighted view, and comes down to "what's on the sign?"

If collecting mapping data from mappers or rendering maps were all that 
we cared about, we'd stop here.  Collect what's on the signs, render 
what's on the signs, we're done.

And that's how OSM got started, and for a while everyone was happy. Some 
people still are, I suppose. But with refs existing just on the ways, we 
lose important capabilities.

We want to be able to route, and generally speaking, I imagine that 
routers will favour staying on the same numbered route. (I could be 
mistaken, they may simply disfavour roads of lower quality or getting on 
and off freeways unnecessarily.) In any case, the traffic engineers 
generally number routes with the idea that they link destinations. A 
driver heading from Schenectady, New York to Bennington, Vermont can get 
fairly simple directions of "follow NY 7 east all the way to the state 
line, where it becomes Vermont 9. Take Vermont 9 the rest of the way 
into Bennington." The fact that NY 7 makes several twists and turns over 
local streets, and is briefly concurrent with NY 2, or I-87, or NY 22 
(and so on) might be noted as "watch out, the highway turns here," but a 
driver could follow the directions just as I gave them. That's why we 
have numbered routes.

And it's entirely sensible to ask questions like, "what cities does NY 7 
visit?" or "what route does the Northville-Placid Trail take for the 
section where it's following the paved roads in Piseco, NY?" or "what are 
the mileages between exits on I-87?" If all we have is the clustered 
text on the signs, then these are the sort of questions that most 
database managers are very poor at answering, because there's really no 
alternative to parsing the text of each sign, separating it out, and 
discarding those ways that don't match. Furthermore, such a question now 
comes back with a disconnected bucket of ways, without topology, so the 
program answering the question has to reassemble the ways into a route - 
and it may not even be possible to do so.
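
To make that concrete, here is a rough Python sketch of the query when 
refs live only on the ways. The way ids, the sample tag values, and the 
helper name ways_with_ref are all made up for illustration, and I'm 
assuming the usual convention of semicolon-separated values in ref.

```python
# Hypothetical sample data: way id -> value of the ref tag on that way.
WAY_REFS = {
    101: "I 890;NY 7",
    102: "NY 7",
    103: "I 87;NY 7",
    104: "NY 77",
    105: "I 87",
}


def ways_with_ref(way_refs, wanted):
    """Return the ids of ways whose ref tag lists `wanted` exactly."""
    matches = []
    for way_id, ref in way_refs.items():
        # Every tag has to be split and compared piece by piece; a plain
        # substring test would wrongly match "NY 77" when asked for "NY 7".
        parts = [part.strip() for part in ref.split(";")]
        if wanted in parts:
            matches.append(way_id)
    return sorted(matches)


print(ways_with_ref(WAY_REFS, "NY 7"))
```

What comes back is only a list of way ids. Nothing in it says which way 
connects to which, or in what order, so the route still has to be 
reassembled from the geometry.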

With route relations, it becomes simpler. "Troy Road between Crosstown 
Blvd and the Northway interchange is bannered NY 7," "The Adirondack 
Northway is bannered I-87 for its entire length," and "The Adirondack 
Northway between exits 6 and 7 is bannered NY 7" become discrete facts 
that are easily queried either by route or by way.
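
As a small illustration of "queried either by route or by way," here is 
a Python sketch that models a route relation as an ordered list of 
member ways. The class, the helper names, and the way ids (201 for Troy 
Road, 202 for the Northway between exits 6 and 7, 203 for the rest of 
the Northway) are hypothetical; only the network/ref tagging follows 
what we already use on route relations.

```python
from dataclasses import dataclass, field


@dataclass
class RouteRelation:
    network: str
    ref: str
    # Member way ids, kept in order along the route.
    members: list = field(default_factory=list)


# Hypothetical way ids standing in for the three facts in the text.
ROUTES = [
    RouteRelation("US:NY", "7", [201, 202]),
    RouteRelation("US:I", "87", [202, 203]),
]


def ways_of_route(routes, network, ref):
    """Query by route: the ordered member ways of one numbered route."""
    for route in routes:
        if (route.network, route.ref) == (network, ref):
            return list(route.members)
    return []


def routes_of_way(routes, way_id):
    """Query by way: every route that is bannered on this way."""
    return [(r.network, r.ref) for r in routes if way_id in r.members]


print(ways_of_route(ROUTES, "US:NY", "7"))
print(routes_of_way(ROUTES, 202))
```

Each ref is stored once, on its route, and the concurrency on way 202 
falls out of the membership lists without any text parsing.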

The disadvantage is that life becomes a trifle more complicated for the 
renderer. Instead of dealing simply with ways that have all the refs in 
one place, there's a subquery to get the cluster of refs that belong to 
a given way. This isn't all that hard, but Mapnik doesn't do it yet. 
Rather than try to get functionality like that into Mapnik, what Phil! 
Gold did with his clustered-shield proof of concept was to update the 
bannered ways with clusters of refs when importing the data, and then 
have the renderer deal with those clusters (with pre-rendered graphics 
for the shields). This approach had the advantage that the changes to 
the Mapnik symbolizers were minimal, but came at the expense of some 
truly weird database logic (made weirder by the fact that it also has to 
incorporate refs on ways).  I use his code, because it was too much work 
to reimplement it, but it's far from ideal.

In any case, any database designer wants to have a given piece of 
information in only one place. If refs are on both routes and ways, 
that's a potential for their becoming inconsistent between the two. 
Starting from the "mapper and renderer" perspective, refs on ways made 
sense, but moving forward, it makes things insane for routers, route 
queries and the like. Assembling the refs for a way from its route 
relations, by contrast, is fairly straightforward to do in the 
database. A database that can 
actually support GIS (such as PostGIS, SpatiaLite, or Oracle Spatial) 
can surely support subqueries to retrieve the refs. Failing that, 
auxiliary tables that organize refs by way could be invented, but I 
prefer not to go there.
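
For what it's worth, the subquery I have in mind looks roughly like the 
sketch below. It uses Python's sqlite3 module with an in-memory database 
purely as a stand-in; the three-table schema, the sample rows, and the 
function name shield_cluster are my own assumptions for illustration, 
not the layout of any real import tool or of Phil's code.

```python
import sqlite3

# In-memory stand-in for the rendering database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ways (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE routes (id INTEGER PRIMARY KEY, network TEXT, ref TEXT);
    CREATE TABLE route_members (route_id INTEGER, way_id INTEGER);
    INSERT INTO ways VALUES (201, 'Troy Road'), (202, 'Adirondack Northway');
    INSERT INTO routes VALUES (1, 'US:NY', '7'), (2, 'US:I', '87');
    INSERT INTO route_members VALUES (1, 201), (1, 202), (2, 202);
""")


def shield_cluster(conn, way_id):
    """Return the (network, ref) pairs of all routes containing the way."""
    return conn.execute(
        """
        SELECT r.network, r.ref
        FROM routes AS r
        WHERE r.id IN (
            SELECT m.route_id FROM route_members AS m WHERE m.way_id = ?
        )
        ORDER BY r.network, r.ref
        """,
        (way_id,),
    ).fetchall()


print(shield_cluster(conn, 202))
```

Each ref lives in exactly one row of the routes table, so there is 
nothing to fall out of sync, and the cluster for a way is computed on 
demand rather than stored.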

I don't think any formal decision has been made to deprecate refs on 
ways, but all of us looking at the problem agree that they're not a 
sensible approach. Someone who's familiar with database design would 
identify them as not being in even "first normal form," which is among 
the weakest sets of constraints without which a relational 
database cannot function.

Now we need to get organized on the task list to make it happen. We have 
the right brains here to do it "bottom up." I have some thoughts on it, 
but don't have time to set them down right now. Maybe I'll follow up 
tonight. They break down into, "help the mapper," "handle the data 
import," "help the renderer process concurrent routes," "provide for 
legacy renderers," and "get the existing mess retagged."

73 de ke9tv/2, Kevin
