[Talk-us] Fixing TIGER street name abbreviations

Dale Puch dale.puch at gmail.com
Wed May 9 20:36:41 BST 2012


The error rate is directly related to how much testing and review is done.
1/1,000 is by no means a set error rate for either manual or bot edits.

A reasonably simple grep-style search and replace will correctly expand
the example.  The default can and should be to leave a name unexpanded
unless it meets specific requirements.
Dr is only expanded to Drive if it is the last word of the name, or the
second to last word and followed by a cardinal direction (S, E, W, N, etc.).
It is left alone (or set to Doctor) if it is the first word of the name.
Let the bot get the easy stuff, and then report the unknowns for manual
edits.
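
As a rough illustration, the rule above could be sketched in Python like
this.  The function name expand_dr, the status labels, and the CARDINALS
set are made up for the example and are not part of any existing bot.
Anything that does not clearly match is left untouched and flagged for
manual review.

```python
# Hypothetical sketch of the "Dr" rule; names and status labels are assumptions.
CARDINALS = {"N", "S", "E", "W", "NE", "NW", "SE", "SW"}


def expand_dr(name):
    """Return (new_name, status); status is 'expanded', 'unchanged' or 'review'."""
    tokens = name.split()
    # Positions of the token "Dr" (an optional trailing period is tolerated).
    hits = [i for i, tok in enumerate(tokens) if tok.rstrip(".") == "Dr"]
    if not hits:
        return name, "unchanged"
    if len(hits) > 1:
        return name, "review"  # more than one "Dr": too ambiguous for a bot
    i = hits[0]
    last = len(tokens) - 1
    if i == 0:
        return name, "review"  # nothing in front of it: probably "Doctor"
    at_end = i == last
    before_cardinal = i == last - 1 and tokens[last].rstrip(".") in CARDINALS
    if at_end or before_cardinal:
        tokens[i] = "Drive"
        return " ".join(tokens), "expanded"
    return name, "review"  # default: do not expand, report instead


if __name__ == "__main__":
    for example in ["Oak Dr", "Oak Dr S", "Dr Martin Luther King Blvd", "Main St"]:
        print(example, "->", expand_dr(example))
```

The sketch only ever changes the two easy cases; everything else comes back
as 'review' so a person can look at it.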

Run the replacement on a copy of the DB and generate reports on the
changes.  Review just the changed street names, before and after, for
quality control.  Let others review them as well.  Once it is ironed out,
make the changes in the live DB.  I would guess the error rate after that
would be well under 1 in 1,000,000.
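
A minimal sketch of such a dry run, assuming the street names have already
been pulled out of the DB copy into a plain list.  The function names and
the toy_rule stand-in are invented for the example; a real run would plug
in the actual expansion rule and read from the copied database.

```python
# Hypothetical dry-run report: nothing here touches a live database.
def dry_run_report(names, transform):
    """Apply transform to each name and collect changes and unknowns."""
    changed, unknown = [], []
    for name in names:
        new_name, status = transform(name)
        if status == "expanded" and new_name != name:
            changed.append((name, new_name))
        elif status == "review":
            unknown.append(name)
    return changed, unknown


def format_report(changed, unknown):
    """Render the before/after pairs and the names left for manual edits."""
    lines = ["CHANGED (before -> after):"]
    lines += [f"  {old} -> {new}" for old, new in changed]
    lines.append("NEEDS MANUAL REVIEW:")
    lines += [f"  {name}" for name in unknown]
    return "\n".join(lines)


def toy_rule(name):
    """Stand-in rule for the example only; deliberately simplistic."""
    if name.endswith(" Dr"):
        return name[:-2] + "Drive", "expanded"
    if name.startswith("Dr "):
        return name, "review"
    return name, "unchanged"


if __name__ == "__main__":
    sample = ["Oak Dr", "Dr Martin Luther King Blvd", "Main St"]
    print(format_report(*dry_run_report(sample, toy_rule)))
```

Reviewers then only need to read the CHANGED and NEEDS MANUAL REVIEW lists
rather than every street name.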

Either way you can get an idea about the edits without doing anything to
the live database.

On Tue, May 8, 2012 at 11:34 PM, Anthony <osm at inbox.org> wrote:

> On Tue, May 8, 2012 at 11:31 PM, Anthony <osm at inbox.org> wrote:
> > "Doctor Martin Luther King Boulevard" is one thing.  "Drive Martin
> > Luther King Boulevard" is another.
>
> And if we're going to make so many mistakes (1/1000 means thousands of
> mistakes), I'd rather it just be left as "Dr Martin Luther King Blvd".
>
> Yes, we can't stop people from making mistakes.  But we can refuse to
> allow thousands of mistakes to be added, for the sake of removing
> abbreviations which aren't hurting anyone in the first place.



-- 
Dale Puch