[Talk-us] Fixing TIGER street name abbreviations

Wed May 2 17:01:43 BST 2012

On Wed, May 2, 2012 at 11:08 AM, Chris Lawrence <lordsutch at gmail.com> wrote:
> ISTM this might be a good "mechanical turk" application if there is
> genuine concern that there will be a substantial error rate (my
> point-of-view as a social scientist is that a hypothesized 1/1000
> error rate is pretty darn low, but I can appreciate that some might
> have more exacting standards), either implemented on the web or as a
> JOSM plugin.

I'm already working on a revised script, but:

1) We're not talking about a small number of ways; we're talking about
over a million ways. If we assume it takes 20 seconds per way to
correct (which I think is actually low when you add in factors like
upload times) then it will take over five and a half thousand man-hours.
This would be a very large undertaking.

2) My estimate of a 1/1000 human error rate seems entirely reasonable.
Think typos, or misreading. I'm sure we see error rates that high now
in OSM and we find them acceptable. A computer that's acting
conservatively will actually produce far lower error rates!

3) I'm seeing very little resistance to the idea of an expansion
script on this list. There's pretty much universal support for
expansions, especially since it's half done already. The concern seems
to be about the script and error rates. We can (and should) test that.
I suspect we'll find very low error rates, and we can correct the
errors, either in the script or, if they're one-offs, in a post-script
process.
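
For what it's worth, the arithmetic behind points 1 and 2 can be
sketched as below. This is only a back-of-envelope check: the one
million figure is taken as a round lower bound for "over a million
ways", and the function and variable names are made up for
illustration.

```python
# Back-of-envelope figures for a fully manual fix (assumptions, not measurements).
WAYS = 1_000_000        # "over a million ways", treated as a lower bound
SECONDS_PER_WAY = 20    # assumed time to correct one way by hand
ERROR_ONE_IN = 1000     # hypothesized human error rate of 1/1000


def manual_hours(ways, seconds_per_way):
    """Total man-hours to correct every way by hand."""
    return ways * seconds_per_way / 3600


def expected_errors(ways, one_in):
    """Expected number of ways a human would get wrong at a 1-in-N rate."""
    return ways / one_in


hours = manual_hours(WAYS, SECONDS_PER_WAY)
errors = expected_errors(WAYS, ERROR_ONE_IN)

print(f"{hours:,.0f} man-hours of manual editing")
print(f"{errors:,.0f} ways expected to contain a human error")
```

At those assumed figures the manual route comes out at roughly 5,556
man-hours and still leaves on the order of a thousand mistakes behind.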

- Serge