[Talk-us] Street Naming Conventions
Matthias Julius
lists at julius-net.net
Mon Apr 12 20:50:31 BST 2010
andrzej zaborowski <balrogg at gmail.com> writes:
> On 9 April 2010 15:30, Matthias Julius <lists at julius-net.net> wrote:
>> Val Kartchner <val42k at gmail.com> writes:
>>
>>> 3) Prefix, body, suffix is available from the TIGER data, but what about
>>> streets that have already been added (or corrected) by users? As we've
>>> seen, a bot won't always be able to correctly make these separations (as
>>> in the example of "Southbay" vs. "South Bay" given previously). How do
>>> we make it so that it meets the goals I've given?
>>
>> I would say:
>> - assemble the name out of the tiger:name_* tags
>> - if that matches the name tag re-assemble the name while expanding
>> tiger:name_direction_prefix and tiger:name_direction_suffix and
>> replace the name tag.
>
> Ok, added the check in r20882 although I'd say the script is useful
> for data from sources other than TIGER too.
That may be. But I would start with the easy stuff. Parsing only name
tags requires a lot more scrutiny and special-case handling. All I am
arguing is that if you have the components separate, as in the TIGER
data, then you should simply use them. Name suffixes in TIGER are a
limited set. In a free-form name tag, who knows what 'St' at the end
of a street name can possibly mean?
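
To make the idea concrete, here is a rough Python sketch of the check
described above. It is not the script from r20882. The function names
and the direction table are made up for illustration, and I am assuming
the name components live in tiger:name_direction_prefix,
tiger:name_base, tiger:name_type and tiger:name_direction_suffix. Only
the direction parts are expanded; everything else is left alone.

```python
# Hypothetical sketch, not the actual bot code.
# Assumed abbreviations for the TIGER direction components.
DIRECTIONS = {
    "N": "North", "S": "South", "E": "East", "W": "West",
    "NE": "Northeast", "NW": "Northwest",
    "SE": "Southeast", "SW": "Southwest",
}

# Assumed tag keys, in the order the parts appear in a street name.
PARTS = (
    "tiger:name_direction_prefix",
    "tiger:name_base",
    "tiger:name_type",
    "tiger:name_direction_suffix",
)
DIRECTION_KEYS = {
    "tiger:name_direction_prefix",
    "tiger:name_direction_suffix",
}


def assemble(tags, expand=False):
    """Join the tiger:name_* components, optionally expanding directions."""
    words = []
    for key in PARTS:
        value = tags.get(key, "").strip()
        if not value:
            continue
        if expand and key in DIRECTION_KEYS:
            # Unknown values are passed through unchanged.
            value = DIRECTIONS.get(value.upper(), value)
        words.append(value)
    return " ".join(words)


def expanded_name(tags):
    """Return the new name tag, or None to leave the way to a human."""
    if "tiger:name_base" not in tags:
        return None  # no separate components to work from
    if assemble(tags) != tags.get("name"):
        return None  # name no longer matches TIGER, so do not touch it
    return assemble(tags, expand=True)
```

A way tagged name="E Main St" with matching components would come back
as "East Main St", while a way whose name tag a user has already
changed returns None and is skipped.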
>
> I don't think that only the direction_prefix/suffix should be
> expanded; basically the whole name should be written the way it is
> pronounced, to avoid ambiguity.
>
> The "East Doctor Martin Luther King, Junior Boulevard" is an example
> that I think shows that the direction parts of the name are the least
> of the problems. On the signage the name appears as E DR MLKjr BLVD
> or similar.
The question is how it appears in TIGER or other imported datasets. It is
probably impossible to write a script that handles every possible case
correctly. That's why I think the script should stick to the things
that are unambiguous and leave the rest to humans. It is far better
to leave a couple of cases untreated than to screw up good data.
Matthias