[Talk-us-massachusetts] A simple check for addresses before the import, iteration #2

Jason Remillard remillard.jason at gmail.com
Sat Aug 11 17:36:53 UTC 2018


Hi Yury,

I looked at Groton: 90 exception address points in total.

5 points didn't match because the address is on a road that doesn't exist,
yet I think they are correct. The town GIS database indicated that the
missing road was either a paper road that was never built, or a road that
used to exist and was abandoned. Apparently, correct addresses can be on
roads that don't exist!

One point, JONATHAN NUTTING ROAD, had a road name that didn't match
anything. The town's assessor database had a different address that matched
a road name. I think MAD is incorrect.

The other 84 points were errors in OSM: incorrect road names that originated
from the original MassGIS import (use ctrl-H in JOSM to view the history) or
new roads that are visible on aerial images.

Of the 90 exceptions, 84 were errors in OSM, 1 was an error in MAD, and 5
were exceptions that are correct.

I fixed the errors in OSM.

Jason





On Fri, Aug 10, 2018 at 12:05 PM Yury Yatsynovich <
yury.yatsynovich at gmail.com> wrote:

> Greetings!
> I've modified my code so that it now does fuzzy matching between OSM
> streets and MassGIS addresses and marks as problematic only those MassGIS
> points that do not pass this fuzzy match.
>
> Details on the steps implemented for fuzzy matches:
> 1) the code expands abbreviations in OSM streets' names like "Str", "Ln",
> etc. to "Street", "Lane", etc.
> 2) the street-type parts at the end of the streets' names (like "Street",
> "Road", "Lane") are dropped. So "Sunset Street" and "Sunset Drive" both
> turn into just "Sunset"
> 3) the code converts OSM and MassGIS street names to upper case.
> 4) the code removes symbols like ".", "'", "," and blanks
> 5) the code considers sufficiently similar strings (at least 90%
> similarity) as the same
>
> E.g., if OSM has "New Miller's Street" while MassGIS has nearby address
> points with "NEW MILLER ROAD", the above-mentioned steps will convert the
> streets' names into "NEWMILLERS" and "NEWMILLER" and consider them the
> same. For more details, please see
> https://github.com/yyatsyn/MassGIS-address-import/blob/master/import_addresses_fuzzy_match_names_work_in_progress.py
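
For readers following along, the five steps quoted above could be sketched
roughly as below. This is a minimal illustration, not the code from the
linked script: the abbreviation table, the street-type list, the function
names normalize() and same_street(), and the use of difflib's
SequenceMatcher for the similarity score are all assumptions made for the
example.

```python
import re
from difflib import SequenceMatcher

# Hypothetical lookup tables; the real script may use different lists.
ABBREVIATIONS = {"STR": "STREET", "LN": "LANE", "RD": "ROAD", "AVE": "AVENUE"}
STREET_TYPES = {"STREET", "ROAD", "LANE", "DRIVE", "AVENUE"}


def normalize(name):
    """Reduce a street name to a bare upper-case key for comparison."""
    # Step 3 (done first so the lookups are case-insensitive): upper case.
    words = name.upper().split()
    # Step 1: expand abbreviations such as "Str" or "Ln.".
    words = [ABBREVIATIONS.get(w.rstrip("."), w) for w in words]
    # Step 2: drop the street type at the end of the name.
    if len(words) > 1 and words[-1] in STREET_TYPES:
        words = words[:-1]
    # Step 4: remove ".", "'", "," and blanks.
    return re.sub(r"[.,'\s]", "", "".join(words))


def same_street(osm_name, massgis_name, threshold=0.9):
    """Step 5: treat names as the same if their keys are similar enough."""
    ratio = SequenceMatcher(
        None, normalize(osm_name), normalize(massgis_name)
    ).ratio()
    return ratio >= threshold


print(normalize("New Miller's Street"))  # NEWMILLERS
print(normalize("NEW MILLER ROAD"))      # NEWMILLER
print(same_street("New Miller's Street", "NEW MILLER ROAD"))  # True
```

With SequenceMatcher the two keys from the example score 18/19 (about
0.95), which clears a 0.9 threshold. A different similarity measure would
give different scores, so the threshold would need to be checked against it.
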
>
> The resulting files are in the folder:
> https://mega.nz/#F!79Ny3KKL!JemAt7yZKSUctrza8QU4Tg
>
> The fuzzy match shows that there are not that many severe problems: around
> 300 points and 400 buildings with addresses in OSM need some attention
> (compared to 1K and 2K when using exact matches for streets' names). In
> addition, maybe 5-10 streets per town are found to need corrections after
> being compared to MassGIS (mostly streets without names, or with some
> extra words like "Main Street Extension" or "East Main Street" vs
> "Main Street").
>
> I would suggest that we add/correct the names of the streets (350 towns,
> 5-10 streets in each town sounds doable for manual edits), re-run the
> fuzzy matching code, and then individually inspect whatever MassGIS points
> are still marked as problematic.
>
> Any feedback is more than welcome!
> --
> Yury Yatsynovich
> _______________________________________________
> Talk-us-massachusetts mailing list
> Talk-us-massachusetts at openstreetmap.org
> https://lists.openstreetmap.org/listinfo/talk-us-massachusetts
>