[Talk-us-massachusetts] A simple check for addresses before the import, iteration #2
Jason Remillard
remillard.jason at gmail.com
Sun Aug 12 01:01:33 UTC 2018
Hi,
155 points in Littleton.
The 3 SANDAS POINT points don't match the road name but seem correct.
The 8 CRORY LANE addresses are wrong. There is no Crory Lane; the points
are over conservation land.
The 1 LONGFELLOW DRIVE address is on conservation land, so it is wrong.
The WHITE HORSE ROAD address seems to be correct, yet doesn't match any
roads.
The 2 WESTVIEW ROAD addresses appear not to be developed yet; they are
paper addresses.
The COTTAGE WAY addresses seem to be correct, but the road wasn't
developed.
The BOATHOUSE WAY addresses seem to be correct, but the road wasn't
developed.
The rest of the points were errors in OSM, mostly missing roads and roads
that had the wrong name.
Except for VINT LANE (too new), the other points should be fixed in OSM.
Jason
On Fri, Aug 10, 2018 at 12:05 PM Yury Yatsynovich <
yury.yatsynovich at gmail.com> wrote:
> Greetings!
> I've modified my code so that it now does fuzzy matching between OSM
> streets and MassGIS addresses and marks as problematic only those MassGIS
> points that do not pass this fuzzy match.
>
> Details on the steps implemented for fuzzy matches:
> 1) the code expands abbreviations in OSM streets' names like "Str", "Ln",
> etc. to "Street", "Lane", etc.
> 2) the status parts at the end of the streets' names (like "Street",
> "Road", "Lane") are dropped. So "Sunset Street" and "Sunset Drive" turn
> into just "Sunset"
> 3) the code converts OSM and MassGIS street names to upper case.
> 4) the code removes symbols like ".", "'", "," and blanks
> 5) the code considers strings that are at least 90% similar as the same
>
> E.g., if OSM has "New Miller's Street", while MassGIS has nearby address
> points with "NEW MILLER ROAD", the above-mentioned steps will convert the
> street names into "NEWMILLERS" and "NEWMILLER" and consider them the
> same. For more details, please see
> https://github.com/yyatsyn/MassGIS-address-import/blob/master/import_addresses_fuzzy_match_names_work_in_progress.py
> .
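
A minimal sketch of the five steps above, for readers who want to try the
idea without opening the full script. This is not the code from the linked
repository: the abbreviation and suffix tables, the function names
normalize() and same_street(), and the use of difflib's ratio as the
similarity measure are all assumptions made for illustration. The steps are
also slightly reordered (upper-casing happens first so the table lookups are
case-insensitive), and only the last word of a name is treated as a possible
abbreviation.

```python
import re
from difflib import SequenceMatcher

# Hypothetical lookup tables; the real script may use different ones.
ABBREVIATIONS = {"STR": "STREET", "ST": "STREET", "LN": "LANE",
                 "RD": "ROAD", "DR": "DRIVE", "AVE": "AVENUE"}
SUFFIXES = {"STREET", "ROAD", "LANE", "DRIVE", "AVENUE", "WAY"}


def normalize(name):
    """Reduce a street name to a bare upper-case key, e.g. 'NEWMILLERS'."""
    # Steps 3 and 4: upper-case, then drop '.', "'" and ','.
    words = re.sub(r"[.',]", "", name.upper()).split()
    # Step 1: expand an abbreviated street type in the last word only,
    # so a leading "St" (Saint) is left alone.
    if words and words[-1] in ABBREVIATIONS:
        words[-1] = ABBREVIATIONS[words[-1]]
    # Step 2: drop the trailing street type, keeping at least one word.
    if len(words) > 1 and words[-1] in SUFFIXES:
        words = words[:-1]
    # Step 4 (continued): joining without a separator removes the blanks.
    return "".join(words)


def same_street(osm_name, massgis_name, threshold=0.9):
    """Step 5: treat names as equal if their keys are >= threshold similar."""
    a, b = normalize(osm_name), normalize(massgis_name)
    # Assumption: difflib's ratio stands in for the script's similarity score.
    return SequenceMatcher(None, a, b).ratio() >= threshold
```

With these tables, same_street("New Miller's Street", "NEW MILLER ROAD")
compares "NEWMILLERS" with "NEWMILLER" and accepts the pair, while
"East Main Street" vs "MAIN STREET" falls below the threshold and would
still be flagged for manual review.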
>
> The resulting files are in the folder:
> https://mega.nz/#F!79Ny3KKL!JemAt7yZKSUctrza8QU4Tg
>
> The fuzzy match shows that there are not that many severe problems: around
> 300 points and 400 buildings with addresses in OSM need some attention
> (compared to 1K and 2K when using exact matches for street names). In
> addition, maybe 5-10 streets per town need corrections after being
> compared to MassGIS (mostly streets without names or with extra words,
> like "Main Street Extension" or "East Main Street" vs "Main Street").
>
> I would suggest that we add/correct the names of the streets (350 towns,
> 5-10 streets in each town -- sounds doable for manual edits), re-run the
> fuzzy matching code, and then individually inspect whatever MassGIS points
> are still marked as problematic.
>
> Any feedback is more than welcome!
> --
> Yury Yatsynovich