[Talk-us] Fixing TIGER street name abbreviations

stevea steveaOSM at softworkers.com
Thu May 10 19:09:32 BST 2012


I support this methodology in the sense of it being "Vet, then set." 
(Vet being a verb which my dictionary says means "make a careful and 
critical examination of something.")

Sure, saying "reasonably simple grep search and replace" is a bit 
vague, but I'm not talking about the specifics of this, or any one 
particular, search, just that doing it to an offline copy and then 
vetting the results (having our community "discuss, agree, disagree, 
improve and finalize") sounds like more of the sort of "community 
consensus workflow steps" that I know are going to produce both 
harmony and great results.

THEN upload (set).

Does this mean I suggest precluding individual edit contributions 
that have not been more-widely vetted?  Of course not:  we do this 
all the time.  But as individuals, we just do it on the small scale. 
It is when we do it on the large scale (as in massive TIGER 
search-and-replace runs) that I'm saying "Vet, then set" should be done.

This project, its data, and its interaction amongst us as individual 
contributors in achieving harmonious consensus can only get better. 
We do a fair-to-good job now; let's make that "largely a great job" 
in the future.

SteveA
California



>The error rate is directly related to how much testing and review is 
>done.  1/1,000 is by no means a set error rate for either manual or 
>bot edits.
>
>Reasonably simple grep search and replace will correctly expand the 
>example.  The default should and can be to not expand unless it 
>meets specific requirements.
>Dr is only expanded to drive if it is at the end of the name, or 
>second to end and followed by cardinal directions (S, E, W, N etc.) 
>but left alone (or set to doctor) if nothing is in front of it.  Let 
>the bot get the easy stuff, and then report on the unknowns for 
>manual edits.
>
>Run the grep on a copy of the DB, and do reports on the changes. 
>Review just the changed street names before and after for quality 
>control.  Let others review it as well.  Once it is ironed out, make 
>the changes in the live DB.  I would guess the error rate after that 
>would be well over 1/1,000,000.
>
>Either way you can get an idea about the edits without doing 
>anything to the live database.
>Dale Puch
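
The "Dr" rule and the before/after report described in the quoted 
message could be sketched roughly as follows.  This is only an 
illustration in Python, not the grep anyone actually ran.  The names 
expand_dr, report and DIRECTIONS are hypothetical, and the list of 
directional suffixes is an assumption.  It works on a plain list of 
names, so nothing here touches the live database.

```python
# Assumed set of directional suffixes that may follow a street type.
DIRECTIONS = {"N", "S", "E", "W", "NE", "NW", "SE", "SW"}


def expand_dr(name):
    """Return (new_name, status).

    status is "expanded", "unchanged" (no "Dr" token present),
    or "review" (a "Dr" token the rule does not cover).
    """
    tokens = name.split()
    # Positions of the token "Dr", with or without a trailing period.
    hits = [i for i, t in enumerate(tokens) if t.rstrip(".") == "Dr"]
    if not hits:
        return name, "unchanged"
    if len(hits) > 1:
        # More than one "Dr" is ambiguous: leave it for a human.
        return name, "review"
    i = hits[0]
    last = len(tokens) - 1
    at_end = i == last
    before_direction = (
        i == last - 1 and tokens[last].rstrip(".").upper() in DIRECTIONS
    )
    # Expand only when something precedes "Dr" and it sits at the end
    # of the name or just before a trailing direction.
    if i > 0 and (at_end or before_direction):
        tokens[i] = "Drive"
        return " ".join(tokens), "expanded"
    return name, "review"


def report(names):
    """Split names into (before, after) changes and unknowns to review."""
    changes, unknowns = [], []
    for name in names:
        new_name, status = expand_dr(name)
        if status == "expanded":
            changes.append((name, new_name))
        elif status == "review":
            unknowns.append(name)
    return changes, unknowns


if __name__ == "__main__":
    sample = [
        "Oak Dr",
        "Lake Shore Dr N",
        "Dr Martin Luther King Jr Blvd",
        "Main St",
    ]
    changes, unknowns = report(sample)
    for old, new in changes:
        print(f"{old!r} -> {new!r}")
    for name in unknowns:
        print(f"needs manual review: {name!r}")
```

The default is to leave a name alone: anything the rule does not 
clearly cover lands in the review list instead of being guessed at, 
and the list of changes is what the community would look over before 
any upload.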



