[Talk-us] Admin boundaries tied to roads
andrzej zaborowski
balrogg at gmail.com
Thu Apr 22 21:09:10 BST 2010
On 22 April 2010 04:24, Alan Mintz <Alan_Mintz+OSM at earthlink.net> wrote:
> At 2010-04-21 17:12, andrzej zaborowski wrote:
>>On 22 April 2010 01:18, Apollinaris Schoell <aschoell at gmail.com> wrote:
>> > On Wed, Apr 21, 2010 at 3:36 PM, andrzej zaborowski <balrogg at gmail.com>
>> > wrote:
>> >> Where's damage in that -- is it in that you can now read the name out
>> >> without checking the documentation for what that funny string means in
>> >> that particular database that is TIGER?
>
> I just had a machine crash as I was trying to find stats, but I'll bet that
> at least 90% of the cases are "St", "Ave"/"Av", and "Blvd"/"Bl", with the
> occasional "Ln" and "Cir"/"Cr" thrown in. When there's a lone N, S, E, or W
> as a prefix to a street name, it's clear to everyone what that means. These
> are the same abbreviations that _everyone_ uses every day - children,
> adults, businesses, governments, etc.
Well, you just gave examples of the obvious ones; I'm not claiming any
of these are unknown. But the list has 672 different forms
(and even the easy ones are hard for non-human consumers, because St
has at least three possible meanings, all three quite popular across
the db).
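
To illustrate the ambiguity for a non-human consumer, here is a minimal sketch, not the script that was actually run. The table is hypothetical and tiny, and the two readings given for "St" (Street, Saint) are only examples chosen for the sketch; the real list of forms is far longer.

```python
# Hypothetical, deliberately tiny expansion table. The readings listed
# for "St" are illustrative assumptions, not the actual TIGER list.
EXPANSIONS = {
    "St": ["Street", "Saint"],
    "Ave": ["Avenue"],
    "Blvd": ["Boulevard"],
}


def candidate_expansions(name):
    """Return every full-form reading of an abbreviated street name."""
    results = [[]]
    for token in name.split():
        # Tokens missing from the table are kept unchanged.
        options = EXPANSIONS.get(token, [token])
        results = [prefix + [opt] for prefix in results for opt in options]
    return [" ".join(words) for words in results]


if __name__ == "__main__":
    for reading in candidate_expansions("St James St"):
        print(reading)
```

A parser that sees only the abbreviated form has to choose among all of these readings, while a human picks the right one from position and context.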
> And I will do so again. My problem is mostly that this was done without a
> safety net. You clobbered existing data with no easy way to "walk it back".
> The existing name value should have been put in a foo_name tag so we could
> at least see what used to be. I would at least encourage that a bot be run
> to find these edits, find the previous version in history, and do this, if
> we can't soon agree on a better schema to split the name up into components
> at the same time.
Well, the way to "walk it back" is pretty easy, all the names can be
taken from version-1 or reassembled from the tiger tags, so no worries
there.
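
As a rough sketch of the "reassembled from the tiger tags" route: the function below joins the name components that the TIGER import left on a way. The four tag keys are assumed to be the ones present on the way, the helper name is made up for illustration, and ways with several names or hand-edited tiger tags are not handled.

```python
# Tag keys assumed to hold the original TIGER name components, in the
# order they appear in a street name.
TIGER_NAME_KEYS = (
    "tiger:name_direction_prefix",
    "tiger:name_base",
    "tiger:name_type",
    "tiger:name_direction_suffix",
)


def abbreviated_name_from_tiger(tags):
    """Rebuild the abbreviated name from a way's tiger:* tags.

    `tags` is a plain dict of tag key to value. Missing or empty
    components are skipped.
    """
    parts = [tags.get(key, "").strip() for key in TIGER_NAME_KEYS]
    return " ".join(part for part in parts if part)


if __name__ == "__main__":
    way_tags = {
        "name": "North Main Street",
        "tiger:name_direction_prefix": "N",
        "tiger:name_base": "Main",
        "tiger:name_type": "St",
    }
    print(abbreviated_name_from_tiger(way_tags))
```

The rebuilt value could then be written to a separate tag if people want to keep the abbreviated form around.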
>
>>I don't know who defined the ones used in TIGER but this is not the
>>only way to abbreviate the names, that is proven by USPS having their
>>own list that is not identical. The most popular words will be the
>>same in both lists but some are really cryptic and arbitrary, could as
>>well be numeric codes. Then TIGER also includes Spanish names and the
>>list has abbreviations for those too, which rarely anyone in US can
>>read, while they can cope with unabbreviated ok.
>
> I don't agree. Much of the US speaks Spanish. Many more possess the
> tremendous brainpower and enough grade-school Spanish required to know that
> Cl. in front of a street name might mean Calle or Cam. might mean Camino,
> or that S means Sur and N means Norte.
But do you remember the 600 abbreviations used in tiger? They're neither
practical nor useful, and they don't help anyone; they're much like
numerical codes. The one single thing they may be good for is rendering
at lower zoom levels.
>
>
>
> name: The pre-balrog name
In 99% of the cases this was an arbitrary version of the name, taken
from a database which was chosen only on the basis of its license, not
because it was more correct or anything. So I don't see any reason to
hang on to it.
>
>> >> The reason it was done with a script is that doing it manually was
>> >> taking a lot of time and mappers were spending that time doing this
>> >> instead of going out mapping. And it's always been on the wiki about
>> >> not using abbreviated names, even when the original import was done,
>> >> ignoring this.
>
> So what most newbies, including myself, did, was to follow the style of the
> majority of the data, instead of the often-outdated, incomplete, and
> inaccurate wiki, which is often not even self-consistent.
The "majority of the data" in this case was an imported dataset that
hasn't even been fully reviewed by a human. So while I agree that
learning by example is a good way to make a quick start, following the
example doesn't make it the only correct way to go.
I'm not using wiki as an argument to tell you what you should do, but
I think it's a good way to see what others were thinking. I have
never edited the Key:name page, and I had never read it before
noticing that using abbreviations in a dataset that is supposed to be
parseable is a recipe for problems.
>
>
> In the Los Angeles area, I rarely saw expanded names (which is why I
> continue to abbreviate), except for those rare instances where someone drew
> a street from scratch before TIGER (apparently), and not even all of those.
>
>
>>You could surely change the wiki but it's a conclusion that a lot of
>>people individually seem to come to so I'm sure you wouldn't even need
>>a bot before someone would add a phrase to that effect.
>
> I don't know about "a lot". I mostly just hear people regurgitate the
> "don't abbreviate" mantra without justification. Admittedly, maybe it's
> because it's already been hashed out to death and I'm late to the party.
> Regardless, maybe I'm not alone, and it deserves some re-thinking.
>
> Do people that are actually mapping (not bulk-importers) really want to
> type in "North Martin Luther King, Junior Boulevard Southwest" and then
> proofread that to make sure they didn't typo anything?
It completely depends on what quality they expect from the resulting
map. In the same way, you could argue that if a road is zigzagging then
people should map it as straight, because it takes fewer clicks and at
the same time you're not affected by your GPS inaccuracies... but it's
a shortcut.
Cheers