[Talk-us] Admin boundaries tied to roads
balrogg at gmail.com
Mon Apr 26 05:14:31 BST 2010
On 24 April 2010 06:33, Alan Mintz <Alan_Mintz+OSM at earthlink.net> wrote:
> At 2010-04-22 13:09, andrzej zaborowski wrote:
> >On 22 April 2010 04:24, Alan Mintz <Alan_Mintz+OSM at earthlink.net> wrote:
> >> At 2010-04-21 17:12, andrzej zaborowski wrote:
> >>>On 22 April 2010 01:18, Apollinaris Schoell <aschoell at gmail.com> wrote:
> >>> > On Wed, Apr 21, 2010 at 3:36 PM, andrzej zaborowski <balrogg at gmail.com>
> >>> > wrote:
> >>> >> Where's damage in that -- is it in that you can now read the name out
> >>> >> without checking the documentation for what that funny string means in
> >>> >> that particular database that is TIGER?
> >> I just had a machine crash as I was trying to find stats, but I'll bet that
> >> at least 90% of the cases are "St", "Ave"/"Av", and "Blvd"/"Bl", with the
> >> occasional "Ln" and "Cir"/"Cr" thrown in. When there's a lone N, S, E, or W
> >> as a prefix to a street name, it's clear to everyone what that means. These
> >> are the same abbreviations that _everyone_ uses every day - children,
> >> adults, businesses, governments, etc.
> >Well, you just gave examples of the obvious ones, I'm not claiming any of
> >these are not known. But the list has 672 different forms.
> My point, though, was that we were going to a lot of trouble for a small
> percentage of real-world cases that _might_ (see below) present a problem
> for someone to understand.
Right, but we don't want to be inconsistent or we again have to keep
lists of exceptions to the "normal" rules in every tool. Even if we
just wanted to document that on the wiki (or elsewhere, really doesn't
need to be wiki) for new mappers, then it would have to say something
like "Don't use abbreviations in name=, except final St in
English-speaking countries and Foo in Bar-speaking countries and...
and so on". Let's just avoid this area completely.
> >(but even the easy ones are hard for non-human consumers because St has
> >at least three possible meanings, all three quite popular across the db).
> I'm sorry, but as a suffix (i.e. for the regex / St$/), what else does St
> mean but Street?
Sure, you can have a regex for every allowed abbreviation, perhaps a
few regexes for some of the more complicated ones like St before names
of saints, and then repeat that for every language and every source of
data. At that point you have to look at the source= tag or other tags
before you can fully interpret name=, because in TIGER data "Stra" at
the end is for "Stravenue" while in other places (Nominatim's current
list of abbreviations) "Stra" at the end is for "Straight".
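
A minimal sketch of the problem described above, assuming one suffix table
per data source. The table contents, the source keys and the function name
are hypothetical; only the two readings of "Stra" come from this thread.

```python
import re

# Hypothetical per-source suffix tables. Only the two "Stra" readings are
# taken from the discussion; the "St" entries are illustrative.
SUFFIXES_BY_SOURCE = {
    "tiger": {"St": "Street", "Stra": "Stravenue"},
    "other": {"St": "Street", "Stra": "Straight"},
}


def expand_suffix(name, source):
    """Expand a trailing abbreviation using the table for the given source."""
    table = SUFFIXES_BY_SOURCE[source]
    # Only the last word counts as a suffix, so a leading "St" (as in a
    # saint's name) is never touched by this rule.
    match = re.search(r" (\w+)$", name)
    if match and match.group(1) in table:
        return name[:match.start(1)] + table[match.group(1)]
    return name
```

The same name expands differently depending on the source argument, so a
consumer cannot interpret name= on its own once abbreviations are allowed.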
> >> And I will do so again. My problem is mostly that this was done without a
> >> safety net. You clobbered existing data with no easy way to "walk it
> >Well, the way to "walk it back" is pretty easy, all the names can be
> >taken from version-1 or reassembled from the tiger tags, so no worries there.
> This doesn't work for streets that were edited by users. Again, my problem
> is that, in thousands of edits, I specifically only expanded, for example,
> the prefix "N" to "North" when it is logically part of the root name. When
> it is logically a housenumber suffix, as it is in the majority of southern
> CA, I left the prefix alone. The road name may have been otherwise edited,
> though (to correct spelling, rename completely, etc.) This was to be used
> in the future when we could agree on a way to correctly separate these
> component parts of the name, as they are and must be in any database to be
> used with routing and street addressing in the real world. To "walk it
> back", we will have to query the history of the way and find the version
> before the bot, to see what was done. It's not just v1, or TIGER, because
> it may have been otherwise edited. It's not even v[last-1] any more because
> there may have been other edits since the bot (I've done many myself).
Well, I can provide you with a list of the original names before I
touched them with the script, along with their IDs and versions, so you
can check whether the name has been edited afterwards, if you need to
revert these edits. Note the edits also contain hundreds if not thousands of
my manual fixes for some frequent typos in TIGER and for some cases of
wrong segmentation into "direction_prefix", "base_name" etc.
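
A rough sketch of how such a list could be used, assuming each entry holds
the way ID, the version the script produced, and the name before and after
the script. The record layout, field names and return values are all
hypothetical, and fetching the current version and name of a way is left out.

```python
from dataclasses import dataclass


@dataclass
class BotEdit:
    way_id: int
    bot_version: int     # version number created by the script's edit
    original_name: str   # name before the script touched the way
    expanded_name: str   # name the script wrote


def touched_since_bot(edit, current_version):
    """True if anyone has edited the way after the script did."""
    return current_version > edit.bot_version


def revert_action(edit, current_name):
    """Classify one way by comparing its current name with the record."""
    if current_name == edit.original_name:
        return "nothing to do"   # already carries the pre-script name
    if current_name == edit.expanded_name:
        return "restore"         # name unchanged since the script ran
    return "review"              # a mapper changed the name afterwards
```

Ways classified as "review" are the ones described above, where the name was
edited after the script and no automatic choice is safe.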
> I don't understand. Why do I have to remember them? Am I not capable of
> inferring their meaning? Do I have to infer anything anyway, since they are
> likely to be similar/identical to signage?
You have to if you want to give the name to somebody on the phone or
find a name someone gave you on the phone.