[Talk-us] Admin boundaries tied to roads

Alan Mintz Alan_Mintz+OSM at Earthlink.Net
Tue Apr 27 00:31:07 BST 2010


At 2010-04-25 21:14, andrzej zaborowski wrote:
>Hi Alan,
>
>On 24 April 2010 06:33, Alan Mintz <Alan_Mintz+OSM at earthlink.net> wrote:
> > At 2010-04-22 13:09, andrzej zaborowski wrote:
> > >On 22 April 2010 04:24, Alan Mintz <Alan_Mintz+OSM at earthlink.net> wrote:
> > >> At 2010-04-21 17:12, andrzej zaborowski wrote:
> > >>>On 22 April 2010 01:18, Apollinaris Schoell <aschoell at gmail.com> wrote:
> > >>> > On Wed, Apr 21, 2010 at 3:36 PM, andrzej zaborowski <balrogg at gmail.com>
> > >>> > wrote:
> > >>> >> Where's damage in that -- is it in that you can now read the name out
> > >>> >> without checking the documentation for what that funny string means in
> > >>> >> that particular database that is TIGER?
> > >>
> > >> I just had a machine crash as I was trying to find stats, but I'll bet that
> > >> at least 90% of the cases are "St", "Ave"/"Av", and "Blvd"/"Bl", with the
> > >> occasional "Ln" and "Cir"/"Cr" thrown in. When there's a lone N, S, E, or W
> > >> as a prefix to a street name, it's clear to everyone what that means. These
> > >> are the same abbreviations that _everyone_ uses every day - children,
> > >> adults, businesses, governments, etc.
> > >
> > >Well, you just gave examples of the obvious ones, I'm not claiming any of
> > >these are not known.  But the list has 672 different forms.
> >
> > My point, though, was that we were going to a lot of trouble for a small
> > percentage of real-world cases that _might_ (see below) present a problem
> > for someone to understand.
>
>Right, but we don't want to be inconsistent or we again have to keep
>lists of exception to the "normal" rules in every tool.  Even if we
>just wanted to document that on the wiki (or elsewhere, really doesn't
>need to be wiki) for new mappers, then it would have to say something
>like "Don't use abbreviations in name=, except final St in English
>speaking countries and Foo in Bar speaking countries and... and.. and
>so on..".  Let's just avoid this area completely.

I'm saying that abbreviations are part of everyday life, and locals know
what to abbreviate and what not to. The only tool I can see having an issue
is text-to-speech in routing, which has to know how to expand some
abbreviations, just like existing commercial routing software does.
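
As a rough sketch of what such a tool would need (the table and the
function name speak_name are made up for illustration; this is not taken
from TIGER or from any commercial routing product), a lookup limited to
the first and last word of the name covers the common cases listed above:

```python
# Illustrative table only: the common cases named in this thread,
# not an official TIGER or postal list.
SUFFIXES = {
    "St": "Street",
    "Ave": "Avenue",
    "Av": "Avenue",
    "Blvd": "Boulevard",
    "Bl": "Boulevard",
    "Ln": "Lane",
    "Cir": "Circle",
    "Cr": "Circle",
}
DIRECTIONS = {"N": "North", "S": "South", "E": "East", "W": "West"}


def speak_name(name):
    """Expand a leading direction letter and a trailing street-type abbreviation."""
    words = name.split()
    # Only touch the first word when something follows it, so a street
    # literally named "N" is left alone.
    if len(words) > 1 and words[0] in DIRECTIONS:
        words[0] = DIRECTIONS[words[0]]
    # Only touch the last word, so a leading "St" (as in a saint's name)
    # is never rewritten.
    if len(words) > 1 and words[-1] in SUFFIXES:
        words[-1] = SUFFIXES[words[-1]]
    return " ".join(words)


print(speak_name("N Main St"))
print(speak_name("St Marks Ave"))
```

Anything not in the table is passed through unchanged, so an unusual
abbreviation is read out literally rather than guessed at. A name like
"N St" still needs local knowledge, since the N may be the name itself.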


> > >(but even the easy ones are hard for non-human consumers because St has
> > >at least three possible meanings, all three quite popular across the db).
> >
> > I'm sorry, but as a suffix (i.e. for the regex / St$/), what else does St
> > mean but Street?
>
>Sure you can have a regex for every allowed abbreviation, perhaps a
>few regexes for some of the more complicated ones like St before names
>of saints, and then for every language and every source of data, at
>which point you start having to look at the source= tag or other tags
>before you can fully interpret name=, because in TIGER data "Stra" at
>the end is for "Stravenue" while in other places (nominatim's current
>list of abbreviations) "Stra" at the end is for "Straight".

How does commercial text-to-speech handle this? I think that forcing
unabbreviated names just to handle the few cases of unusual names like these
is unnecessary, not to mention that, if I were to review a street with such
an unusual abbreviation, I'd probably expand it myself to avoid confusion.

I have to look at the wiki all the time to find out what (sometimes 
peculiar) tag name has been used for a particular feature. Is it really 
that much to expect mappers in a country to define and/or look at the wiki 
very occasionally for just a handful of the most common abbreviations that 
cover >90% of the cases - abbreviations that they already know? They don't 
even have to do that if they don't want to - they can still use the 
unabbreviated form.


>Well I can provide you a list of the original names before I touched
>them with the script along with their id's and versions so you can
>check if the name has been edited afterwards, if you need to revert
>these edits.

Good. We also need to settle on a set of component tags to make the best use
of the information present in those edits - particularly to separate true
cardinal-direction prefixes from letters that are really part of the name.
Can we agree for now that, with appropriate local knowledge, it will be
acceptable to strip just these prefixes out of the name tag into another
tag? Should I propose a set of component tags for a (hopefully quick) vote?
The suffix and root tags could then be populated at the same time (without
stripping them from the name).
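
To make the idea concrete, here is a minimal sketch. The tag keys
name_prefix, name_root and name_suffix are placeholders I made up for
illustration (no key names have been agreed), and prefix_is_direction
stands in for the mapper's local knowledge:

```python
# Hypothetical component keys: placeholders until a real set is agreed.
PREFIX_KEY = "name_prefix"
ROOT_KEY = "name_root"
SUFFIX_KEY = "name_suffix"

DIRECTION_PREFIXES = {"N", "S", "E", "W"}
# Only the common suffixes mentioned in this thread.
SUFFIXES = {"St", "Ave", "Av", "Blvd", "Bl", "Ln", "Cir", "Cr"}


def propose_components(name, prefix_is_direction=True):
    """Return proposed tags for one street name.

    prefix_is_direction is the mapper's call: pass False when the leading
    letter is really part of the name.
    """
    words = name.split()
    tags = {"name": name}
    # The direction prefix is the only part stripped out of name=.
    if prefix_is_direction and len(words) > 1 and words[0] in DIRECTION_PREFIXES:
        tags[PREFIX_KEY] = words[0]
        words = words[1:]
        tags["name"] = " ".join(words)
    # Suffix and root are copied into their own tags but stay in name=.
    if len(words) > 1 and words[-1] in SUFFIXES:
        tags[SUFFIX_KEY] = words[-1]
        tags[ROOT_KEY] = " ".join(words[:-1])
    return tags


print(propose_components("N Main St"))
print(propose_components("E Street", prefix_is_direction=False))
```

The output is only a proposal for a mapper to review; nothing here
decides on its own whether a leading letter is a direction.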


>   Note the edits also contain hundreds if not thousands of
>my manual fixes for some frequent typos in TIGER and for some cases of
>wrong segmentation into "direction_prefix", "base_name" etc.

Yup - me too.


> > I don't understand. Why do I have to remember them? Am I not capable of
> > inferring their meaning? Do I have to infer anything anyway, since they are
> > likely to be similar/identical to signage?
>
>You have to if you want to give the name to somebody on the phone or
>find a name someone gave you on the phone.

...maybe, in the small percentage of cases where the meaning is not known. 
Even still, if you can't expand it, and simply give it literally (e.g. 
"Something X Y Z"), it would still match the OSM printed map, and likely 
match the street sign, or be familiar to a local cabbie, or at least match 
a street sign of "Something", which will get you to the right place most of 
the time. There would have to be both a "Something XYZ" and a "Something 
ABC" in the same general area for you to get lost. Multiply this by the 
already small percentage of both ABC and XYZ being uncommon abbreviations, 
and you have a really small set.

--
Alan Mintz <Alan_Mintz+OSM at Earthlink.net>




