[Talk-us] Street Naming Conventions

Matthias Julius lists at julius-net.net
Mon Apr 12 20:50:31 BST 2010


andrzej zaborowski <balrogg at gmail.com> writes:

> On 9 April 2010 15:30, Matthias Julius <lists at julius-net.net> wrote:
>> Val Kartchner <val42k at gmail.com> writes:
>>
>>> 3) Prefix, body, suffix is available from the TIGER data, but what about
>>> streets that have already been added (or corrected) by users?  As we've
>>> seen, a bot won't always be able to correctly make these separations (as
>>> in the example of "Southbay" vs. "South Bay" given previously)  How do
>>> we make it so that it meets the goals I've given?
>>
>> I would say:
>> - assemble the name out of the tiger:name_* tags
>> - if that matches the name tag re-assemble the name while expanding
>> tiger:name_direction_prefix and tiger:name_direction_suffix and
>> replace the name tag.
>
> Ok, added the check in r20882 although I'd say the script is useful
> for data from sources other than TIGER too.

That may be, but I would start with the easy stuff.  Parsing only the
name tag requires a lot more scrutiny and special-case handling.  All I
am arguing is that if you have the components separate, as in the TIGER
data, then you should simply use them.  Name suffixes in TIGER are a
limited set.  Who knows what 'St' at the end of a street name can
possibly mean.
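
As a minimal sketch of the check described above (not the actual script
from r20882), something like the following Python would do.  The keys
tiger:name_base and tiger:name_type, the DIRECTIONS table and the
function names are assumptions made for illustration; only the
direction prefix/suffix idea comes from this thread.

```python
# Hypothetical abbreviation table; extend as needed.
DIRECTIONS = {
    "N": "North", "S": "South", "E": "East", "W": "West",
    "NE": "Northeast", "NW": "Northwest",
    "SE": "Southeast", "SW": "Southwest",
}

# Assumed component order; name_base and name_type are assumed key names.
PARTS = (
    "tiger:name_direction_prefix",
    "tiger:name_base",
    "tiger:name_type",
    "tiger:name_direction_suffix",
)
DIRECTION_KEYS = {PARTS[0], PARTS[3]}


def assemble(tags, expand=False):
    """Join the tiger:name_* components, optionally expanding directions."""
    words = []
    for key in PARTS:
        value = tags.get(key, "").strip()
        if not value:
            continue
        if expand and key in DIRECTION_KEYS:
            # Unknown values are kept unchanged rather than guessed at.
            value = DIRECTIONS.get(value.upper(), value)
        words.append(value)
    return " ".join(words)


def expanded_name(tags):
    """Return the expanded name, or None if the way should be left alone."""
    name = tags.get("name")
    plain = assemble(tags)
    if not name or not plain or name != plain:
        # The name was edited by a user or has no components: leave it to humans.
        return None
    return assemble(tags, expand=True)
```

A way whose name tag is still "S Bay St" with matching components would
get "South Bay St", while one a user changed to "Southbay St" returns
None and stays untouched.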

>
> I don't think that only the direction_prefix/suffix should be
> expanded; basically the whole name should be written the way it is
> pronounced, to avoid ambiguity.
>
> The "East Doctor Martin Luther King, Junior Boulevard" is an example
> that I think shows that the direction parts of the name are the least
> of the problems.  On the signage the name appears as E DR MLKjr BLVD
> or similar.

The question is how it appears in TIGER or other imported datasets.  It is
probably impossible to write a script that handles every possible case
correctly.  That's why I think the script should stick to the things
that are unambiguous and leave the rest to humans.  It is far better
to leave a couple of cases untreated than to screw up good data.

Matthias



