[Talk-us] Proposal: Sunset ref=* on ways in favor of relations

Paul Johnson baloo at ursamundi.org
Fri Nov 20 14:26:42 UTC 2015

On Thu, Nov 19, 2015 at 8:51 AM, Kevin Kenny <kkenny2 at nycap.rr.com> wrote:

> On 11/19/2015 07:13 AM, Paul Johnson wrote:
>> Obviously there needs to be a sunrise for consumers to catch up, but
>> eventually the dinosaur needs to be killed.  There's some places where the
>> way has a ref unique to the route, as noted on the ground, that currently
>> isn't easy to map thanks to the associated tag already being used to
>> identify the ref of an entirely different entity (the route relation).  I'm
>> also suggesting that the quicker we do this, the more painless it'll be.
> We should probably bear in mind how we got here. There's a natural tension
> here between the different uses of refs, that should be made explicit.
> Those who gather data in the field map what they see. I get on the freeway
> near home, and I see signs "Interstate 890 East/NY 7 West." (Newcomers get
> confused all the time about the fact that "east" on the one route is "west"
> on the concurrent one.) At that strictly local view, "ref" is a cluster of
> values -  it's the multiple things that are on the signs. That's what a
> nearsighted mapper sees. In this view, the topology of the highway grid is
> an emergent phenomenon.
> Similarly, to the renderer, the question to be asked is, "what shield or
> cluster of shields shall I put on this way?" That's again a purely local,
> nearsighted view, and comes down to "what's on the sign?"
> If collecting mapping data from mappers or rendering maps were all that we
> cared about, we'd stop here.  Collect what's on the signs, render what's on
> the signs, we're done.

Which also misses some important details, since I 444 is not signed, yet
OklaDOT frequently mentions it on its own state maps and in traffic
reports.  Most reporters are quick to catch this and report what it's
signed as ("75 through the IDL"), or just say "the south and east legs of
the IDL" or "The BA Expressway and the Paul Harvey Expressway downtown",
but occasionally there will be a newbie or intern reporting who will read
it off as 444 and express confusion about it on the air (usually prompting
a correction later in the broadcast as people tweet or text the station
with "everyone knows it's...").

> We want to be able to route, and generally speaking, I imagine that routers
> will favour staying on the same numbered route. (I could be mistaken, they
> may simply disfavour roads of lower quality or getting on and off freeways
> unnecessarily.) In any case, the traffic engineers generally number routes
> with the idea that they link destinations. A driver heading from
> Schenectady, New York to Bennington, Vermont can get fairly simple
> directions of "follow NY 7 east all the way to the state line, where it
> becomes Vermont 9. Take Vermont 9 the rest of the way into Bennington." The
> fact that NY 7 makes several twists and turns over local streets, and is
> briefly concurrent with NY 2, or I-87, or NY 22 (and so on) might be noted
> as "watch out, the highway turns here," but a driver could follow the
> directions just as I gave them. That's why we have numbered routes.

This is generally the case, and in some situations, legally the case (it's
not uncommon for towns to require trucks to stay on numbered routes except
for local delivery, for example).

> And it's entirely sensible to ask questions like, "what cities does NY 7
> visit?" or "what route does the Northville-Placid Trail take for the
> section where it's following the paved roads in Piseco, NY?" or "what are
> the mileages between exits on I-87?" If all we have is the clustered text
> on the signs, then these are the sort of questions that most database
> managers are very poor at answering, because there's really no alternative
> to parsing the text of each sign, separating it out, and discarding those
> ways that don't match. Furthermore, such a question now comes back with a
> disconnected bucket of ways, without topology, so the program answering the
> question has to reassemble the ways into a route - and it may not even be
> possible to do so.
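
To make the parsing burden above concrete, here is a minimal sketch. It
assumes the cluster of refs is stored as a semicolon-delimited ref=* string
on each way; the way ids and ref values are hypothetical sample data, not
taken from the database.

```python
# Hypothetical sample data: way id -> value of its ref=* tag.
WAY_REFS = {
    101: "I 890;NY 7",
    102: "NY 7",
    103: "I 87;NY 7",
    104: "NY 70",
    105: "I 87",
}


def split_refs(ref_value):
    """Split a semicolon-delimited ref string into individual refs."""
    return [part.strip() for part in ref_value.split(";") if part.strip()]


def ways_with_ref(way_refs, wanted):
    """Return the set of way ids whose ref cluster contains `wanted`."""
    return {
        way_id
        for way_id, value in way_refs.items()
        if wanted in split_refs(value)
    }


# Every way has to be parsed and tested; a plain substring match would
# also pick up way 104, because "NY 70" contains "NY 7".
ny7_ways = ways_with_ref(WAY_REFS, "NY 7")
```

The result is an unordered set of way ids. Nothing in it says how the ways
connect, so the route still has to be reassembled from geometry afterwards.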

And even going by that philosophy of just checking control points fails in
a big way due to shortcomings in the ground truth.  I 95 does not list
Philadelphia(!) as a control city, skipping right over it to New York
northbound and Columbia, Maryland southbound, IIRC.

> With route relations, it becomes simpler. "Troy Road between Crosstown
> Blvd and the Northway interchange is bannered NY 7," "The Adirondack
> Northway is bannered I-87 for its entire length," and "The Adirondack
> Northway between exits 6 and 7 is bannered NY 7" become discrete
> facts that are easily queried either by route or by way.
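
As a rough illustration of "queried either by route or by way", the sketch
below models each route relation as an ordered list of member ways and
derives the reverse lookup from it. The route refs, way ids, and function
name are hypothetical; real relations carry more than this (roles,
direction, network tags).

```python
from collections import defaultdict

# Hypothetical route relations: route ref -> ordered list of member way ids.
ROUTES = {
    "NY 7": [201, 202, 203],
    "I 87": [202, 204],
}


def build_way_index(routes):
    """Invert the route -> ways mapping into way -> list of route refs."""
    index = defaultdict(list)
    for ref, way_ids in routes.items():
        for way_id in way_ids:
            index[way_id].append(ref)
    return index


WAY_INDEX = build_way_index(ROUTES)

ways_on_ny7 = ROUTES["NY 7"]    # query by route; member order is preserved
routes_on_202 = WAY_INDEX[202]  # query by way; both routes share this way
```

Each membership is stored once as its own fact, so neither query needs to
parse text, and the concurrency on way 202 falls out of the data.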

And can be easily determined by what boundaries are crossed by the
relation, rather than expecting control cities to be all inclusive or

> The disadvantage is that life becomes a trifle more complicated for the
> renderer. Instead of dealing simply with ways that have all the refs in one
> place, there's a subquery to get the cluster of refs that belong to a given
> way. This isn't all that hard, but Mapnik doesn't do it yet. Rather than
> try to get functionality like that into Mapnik, what Phil! Gold did with
> his clustered-shield proof of concept was to update the bannered ways with
> clusters of refs when importing the data, and then have the renderer deal
> with those clusters (with pre-rendered graphics for the shields). This
> approach had the advantage that the changes to the Mapnik symbolizers were
> minimal, but came at the expense of some truly weird database logic (made
> weirder by the fact that it also has to incorporate refs on ways).  I use
> his code, because it was too much work to reimplement it, but it's far from
> ideal.
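
The import-time approach described above can be sketched roughly as
follows. This is not Phil! Gold's code; it is a simplified illustration
that assumes route relations have already been reduced to a mapping of ref
to member ways, and that legacy refs on ways are semicolon-delimited. All
names and ids are hypothetical.

```python
def shield_clusters(routes, legacy_way_refs):
    """Build way id -> ordered list of refs to draw as shields.

    routes: route ref -> list of member way ids (from route relations)
    legacy_way_refs: way id -> semicolon-delimited ref=* string
    """
    clusters = {}
    # Refs that come from route relations are added first.
    for ref, way_ids in routes.items():
        for way_id in way_ids:
            refs = clusters.setdefault(way_id, [])
            if ref not in refs:
                refs.append(ref)
    # Then fold in refs still tagged directly on ways, skipping duplicates.
    for way_id, value in legacy_way_refs.items():
        refs = clusters.setdefault(way_id, [])
        for part in value.split(";"):
            part = part.strip()
            if part and part not in refs:
                refs.append(part)
    return clusters


example = shield_clusters(
    {"I 87": [301, 302], "NY 7": [302]},
    {302: "NY 7", 303: "NY 2"},
)
```

The renderer then only sees one precomputed cluster per way. The second
loop exists solely because refs on ways still have to be honoured, which is
the extra logic the quoted paragraph complains about.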

Still, it proves that it's something that's possible to do.

> I don't think any formal decision has been made to deprecate refs on ways,
> but all of us looking at the problem agree that they're not a sensible
> approach. Someone who's familiar with database design would identify them
> as not being in even "first normal form," which is among the weakest
> sets of constraints without which a relational database cannot function.

Right now I'm trying to get some feedback before I go through a formal
draft and proposal for the wiki.  This is still in the early stages, but
I'm still of the opinion that this is overdue and that we should move
quickly towards making route management rational.  That said, I get that
this is something of a flag day
<https://en.wikipedia.org/wiki/Flag_day_(computing)> event, which is why
I'm trying to get people to start thinking about this now rather than just
punt and let the problem get worse as time continues.