[Talk-us] Address Standard
val42k at gmail.com
Fri Aug 13 06:35:20 BST 2010
On Thu, 2010-08-12 at 14:54 -0600, Kevin Atkinson wrote:
> I think that these components should be automatically separated by parsing
> the street name somehow, and only require manual entry when there is
> ambiguity. When there is ambiguity, I think just entering in the Street
> Name (base type in tiger) will be enough.
However, the directional prefixes can't be automatically parsed out.
Just as some examples, from your home of Salt Lake City, there is North
Temple Street, South Temple Street and West Temple Street. These are
the actual names. A complete address would be something like 150 West
North Temple Street.
Further north, around Ogden, there are actually two streets named North
Street, one South Street, and (one I found today) a South Pointe Street.
There is also an East Crest and West Crest. In each of these cases, the
seeming directional prefix is part of the street name and not a prefix.
One North Street could have East or West directional prefixes (for
addresses), but for various reasons the others wouldn't.
Names like "South 3300 West" could automatically be parsed. There would
be other patterns that we could find that would always work. There
would be many others that couldn't automatically be parsed. We will
need a solution that is easy to enter.
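As a rough sketch of what one "always works" pattern might look like, here is a Python check for grid-style names such as "South 3300 West". The pattern, the direction list and the function name parse_grid_name are my own assumptions for illustration, not an agreed rule. Anything that doesn't match falls through to manual entry.

```python
import re

# Assumed direction vocabulary; abbreviations are deliberately left out.
DIRECTIONS = ("North", "South", "East", "West")

# Hypothetical "always works" pattern: direction, all-digit name, direction.
GRID_NAME = re.compile(
    r"^(?P<prefix>{d})\s+(?P<name>\d+)\s+(?P<suffix>{d})$".format(
        d="|".join(DIRECTIONS)
    )
)


def parse_grid_name(name):
    """Return the parts of a grid-style name, or None if it doesn't match."""
    match = GRID_NAME.match(name.strip())
    return match.groupdict() if match else None


print(parse_grid_name("South 3300 West"))
print(parse_grid_name("North Temple Street"))  # None: needs manual entry
```

Requiring an all-digit name is what keeps "North Temple Street" out of this pattern, since "Temple" can't be mistaken for a grid number.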
What about using separators, like the standard semicolon? Any names
without separators (and not an "always works" pattern) would need to be
manually reviewed. So the above example would become
"South;3300;West;Street". This separates out the parts of the street
name into, in this case, directional prefix, street name, directional
suffix, type of street. They wouldn't need to be in any specific order
since there would only be a limited set of strings that could be in the
fields other than "street name".
Some of the other examples above would become "South Pointe;Street" and
"West;North Temple;Street". These work because "South Pointe" isn't one
of the known field values like "South". However, "North;Street" and
"South;Street" wouldn't work with this scheme, because the whole name
would be read as a directional prefix, so we'll need something beyond
this simple idea.
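To make the separator idea concrete, here is a minimal Python sketch of how the pieces might be sorted into fields. The field names, the short list of street types and the function split_fields are all hypothetical. It also assumes one thing the scheme above leaves open: a direction seen before the street name is a prefix, and one seen after it is a suffix.

```python
DIRECTIONS = {"North", "South", "East", "West"}
STREET_TYPES = {"Street", "Avenue", "Road", "Drive"}  # assumed, incomplete


def split_fields(tagged_name):
    """Sort the semicolon-separated pieces of a name into fields."""
    fields = {}
    for part in (p.strip() for p in tagged_name.split(";")):
        if part in DIRECTIONS:
            # Assumption: position relative to the name decides the role.
            key = "suffix" if "name" in fields else "prefix"
        elif part in STREET_TYPES:
            key = "type"
        else:
            key = "name"
        if key in fields:
            raise ValueError(f"two candidates for {key!r} in {tagged_name!r}")
        fields[key] = part
    return fields


print(split_fields("South;3300;West;Street"))
print(split_fields("West;North Temple;Street"))
print(split_fields("North;Street"))  # no "name" field comes out
```

The last call shows the problem case: "North" is taken as a prefix, so nothing is left over to be the street name.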
Also, the renderers seem to be VERY slow at catching up to changes like
this. (They're still arguing about how to handle route numbers
separated by semicolons.) Would there be a way for a 'bot to monitor
street name changes and parse something like the above idea, separate it
into appropriate key/value pairs, then fix up the regular "name" field
to a standard format? Then the 'bot could check other changes to a
"name" field and flag it for manual review.
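The core of such a 'bot might look something like the Python sketch below. It only handles the tags of a single edited way; watching the change feed and uploading the result are left out. The name_prefix, name_base, name_type and name_suffix keys and the function handle_name_change are hypothetical, not an agreed tagging scheme, and the "always works" patterns are omitted for brevity.

```python
DIRECTIONS = {"North", "South", "East", "West"}
STREET_TYPES = {"Street", "Avenue", "Road", "Drive"}  # assumed, incomplete


def handle_name_change(tags):
    """Return (updated_tags, needs_review) for one edited way.

    The name_* keys are hypothetical, not an agreed tagging scheme.
    """
    name = tags.get("name", "")
    if ";" not in name:
        # No separators: leave the tags alone and ask a human to look.
        return dict(tags), True

    updated = dict(tags)
    base_seen = False
    for part in (p.strip() for p in name.split(";")):
        if part in DIRECTIONS:
            key = "name_suffix" if base_seen else "name_prefix"
        elif part in STREET_TYPES:
            key = "name_type"
        else:
            key, base_seen = "name_base", True
        updated[key] = part

    # Rewrite "name" in the usual space-separated form for the renderers.
    updated["name"] = " ".join(p.strip() for p in name.split(";"))
    # A name with no base part (e.g. "North;Street") still needs a human.
    return updated, not base_seen


print(handle_name_change({"name": "West;North Temple;Street"}))
print(handle_name_change({"name": "North Street"}))
```

Rewriting "name" back to the plain form means the renderers never see the semicolons, which sidesteps waiting for them to catch up.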
Okay, shoot these ideas full of holes, just as long as we make progress.
- Val -