[Talk-us] Fwd: Fixing TIGER street name abbreviations

Dale Puch dale.puch at gmail.com
Sat May 12 12:08:14 BST 2012


But Serge IS testing.  He is building and testing a script that is
conservative and gives feedback on what it isn't sure about.  That info can
be used for future scripts, or manual edits.
He is only editing ways originally imported from TIGER, based on the TIGER
prefix, suffix, road type, and base name.  If all were rev 1 unedited
imports it would work darn well.  The testing is mostly for bad TIGER data
and subsequent edits that confuse things.
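
To make that concrete, here is a rough sketch of the idea (not Serge's
actual script): rebuild the name from the TIGER parts, and only expand when
the current name still matches it exactly.  The tag keys follow the usual
TIGER import tagging as I understand it, and the abbreviation tables and the
expand_name function are made up for illustration.

```python
# Hypothetical abbreviation tables; a real script would need many more entries.
ROAD_TYPES = {"St": "Street", "Ave": "Avenue", "Rd": "Road", "Dr": "Drive"}
DIRECTIONS = {"N": "North", "S": "South", "E": "East", "W": "West"}


def expand_name(tags):
    """Return the expanded name, or None to leave the way for manual review."""
    name = tags.get("name", "")
    # Tag keys assumed from the TIGER import; adjust if the data differs.
    prefix = tags.get("tiger:name_direction_prefix", "")
    base = tags.get("tiger:name_base", "")
    road_type = tags.get("tiger:name_type", "")
    suffix = tags.get("tiger:name_direction_suffix", "")

    # Rebuild the name the import would have produced from its parts.
    parts = [p for p in (prefix, base, road_type, suffix) if p]
    if not base or name != " ".join(parts):
        return None  # name was edited, or the TIGER tags are incomplete

    # Expand only the direction and road type slots; never touch the base name.
    expanded = [
        DIRECTIONS.get(prefix, prefix),
        base,
        ROAD_TYPES.get(road_type, road_type),
        DIRECTIONS.get(suffix, suffix),
    ]
    return " ".join(p for p in expanded if p)
```

With this, "St Marys St" (base "St Marys", type "St") can be expanded without
touching the "St" in the base name, while a hand-edited "St. Marys St" no
longer matches the rebuilt name and is skipped rather than guessed at.  Names
where the road type comes before the base name would also fail the match and
be skipped, which is the conservative outcome.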

You're just saying it can't/shouldn't be done, and he is figuring out a way to
make it work correctly.  Let him do his tests and provide results, and then
you can try to find the faults in those rather than telling him not to
try.  He isn't running this on the live DB, so why not encourage/help him
produce better results?  If there are things you know will cause
problems, provide real examples of them so he can edit and test the script to
handle them or gracefully skip them.


On Sat, May 12, 2012 at 6:51 AM, Anthony <osm at inbox.org> wrote:

> The examples are contrived because we're not testing.  We're pointing out
> why this is a bad idea.  Using real world examples would just encourage
> people to fix those examples and ignore the fact that the process is wrong.
>
> Anyway, you realize that the road type doesn't always appear after the
> base name, right?
>
> ---------- Forwarded message ----------
> From: *Serge Wroclawski*
> Date: Friday, May 11, 2012
> Subject: [Talk-us] Fixing TIGER street name abbreviations
> To: Dale Puch <dale.puch at gmail.com>
> Cc: talk-us at openstreetmap.org
>
>
> On Fri, May 11, 2012 at 4:17 PM, Dale Puch <dale.puch at gmail.com> wrote:
> > I understand the script checks for only one instance of the abbreviation.
>
> > My point was what if someone manually expanded ONE of the abbreviations,
> > leaving "st something street"?  Is that checked for?
>
> I have a number of thoughts here:
>
> 1.  Real world examples.
>
> Many of the examples I've seen are contrived. I'm glad we're testing,
> but testing needs to be based on actual data seen in the US dataset.
>
> That said:
>
> 2. There are a couple of ways to handle this:
>
> * One way (the most conservative way) would be to test for untouched
> TIGER ways, that is, ways that are still at version 1. This
> would be a real problem, though, since there are lots of examples where
> someone may have fixed the geometry without touching the tags.
>
> * The other way is a method I'm using in an experimental branch of the
> code on my machine, which is to try to be a bit more selective about
> the expansions of road types. If we assume that the road type always
> appears after the base name, we can handle examples like (real
> world example) "St Marys St". The same would hold true for direction
> tags, so we'd be able to expand "E E St" confidently as well.
>
> But there's a catch. If someone had edited the name of the
> above street from the original "St Marys St" to "St. Marys St" then
> that test would fail, and the expansion would never occur, whereas in
> the current version, it would.
>
> So:
>
> 3. Any method used is going to produce some number of
> false positives or false negatives. I contend that the number of
> errors in either case will be so tiny that it will be lost in the
> noise, but there's no way to promise it will always be 0. The best we
> can do is toss out uncertain expansions and have them handled manually
> (which is something I'm working to make better in the next version of
> the code as well).
>
> But:
>
> 4. I don't want us to rely on cleverness. I'd much rather rely on
> people testing the code with real world inputs and checking the
> outputs.
>
>
> I should have a new version of the code either tonight or tomorrow,
> with the new expansion rules.
>
> - Serge
>
> _______________________________________________
> Talk-us mailing list
> Talk-us at openstreetmap.org
> http://lists.openstreetmap.org/listinfo/talk-us
>
>


-- 
Dale Puch