[Talk-us-massachusetts] A simple check for addresses before the import, iteration #2
Greg Troxel
gdt at lexort.com
Fri Aug 10 20:38:07 UTC 2018
Yury Yatsynovich <yury.yatsynovich at gmail.com> writes:
> The purpose of this exercise (match MassGIS points to OSM streets) was to
> find MassGIS points that are obviously mis-placed.
> As it turned out, the MassGIS points might be "mis-placed" either because
> MassGIS data are wrong or (and this second reason so far looks more likely)
> because many streets in OSM do not have names (or have wrong names -- these
> cases need scrupulous checks).
That sounds great. You are making a lot of progress understanding this
dataset's properties.
> So, an easy take-away from this exercise is to add names to unnamed streets
> -- the resulting shp-files give us an idea of which streets in OSM are
> currently w/o names and what names they most likely should have.
If done by a local mapper with some clue, thinking street by street
and checking other sources (L3 parcels, maybe looking at signs), that
sounds fine.
> Fuzzy match is used to filter the most severe discrepancies. I wrote the
> code with the exact match first, but it gave us too many points to check
> manually and most of those points were with relatively small discrepancies
> (abbreviations, spelling errors, etc. -- hopefully, these can later be
> corrected automatically).
Great - as long as we are using fuzzy matching to sort issues by
priority, and not deciding that a fuzzy match is good enough, I'm with
you.
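
To illustrate the "sort by priority" idea, here is a minimal sketch. The
thread does not say which fuzzy metric was used, so difflib's
SequenceMatcher ratio stands in for it, and the function names and the 0.8
threshold are hypothetical. Pairs scoring below the threshold are listed
worst first; nothing is treated as correct merely because it scores high.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Return a similarity score in [0, 1], ignoring letter case."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def rank_mismatches(pairs, threshold=0.8):
    """Return (score, address_street, osm_name) for pairs below threshold.

    The result is sorted by ascending score, so the most severe
    discrepancies come first. The threshold is an assumed value.
    """
    scored = [(name_similarity(addr, osm), addr, osm) for addr, osm in pairs]
    flagged = [row for row in scored if row[0] < threshold]
    return sorted(flagged)


if __name__ == "__main__":
    sample = [
        ("Main Street", "Main St"),
        ("Main Street", "Oak Avenue"),
        ("Elm Street", "Elm Street"),
    ]
    for score, addr, osm in rank_mismatches(sample):
        print(f"{score:.2f}  {addr!r} vs {osm!r}")
```

Exact matches never appear in the output; small discrepancies such as
abbreviations land near the bottom of the list and can wait for a later
pass.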
> As for blanks and "'" symbols -- they are quite a frequent cause of
> mismatches: "Miller's" vs "Millers", "Mac Arthur" vs "MacArthur", "Hill
> Top" vs "Hilltop".
I suppose then it's a really interesting question which one is right.
For all of these, I lean toward actually figuring out some of them, so
that we can judge whether OSM data, MAD, L3 parcels, or MassGIS roads
are most likely to be correct.
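
One way to pull out exactly this class of mismatch for review is sketched
below. It is an assumption about how such a check could look, not the code
used in the exercise, and the function names are hypothetical. It only
flags pairs that differ by blanks or apostrophes; it does not decide which
spelling is right.

```python
import re


def normalize_street_name(name: str) -> str:
    """Case-fold and drop whitespace and apostrophes (straight or curly)."""
    return re.sub(r"[\s'\u2019]", "", name).casefold()


def differs_only_by_blanks_or_apostrophes(a: str, b: str) -> bool:
    """True if the names differ as written but match once normalized."""
    return a != b and normalize_street_name(a) == normalize_street_name(b)


if __name__ == "__main__":
    examples = [
        ("Miller's", "Millers"),
        ("Mac Arthur", "MacArthur"),
        ("Hill Top", "Hilltop"),
    ]
    for a, b in examples:
        print(a, "|", b, "|", differs_only_by_blanks_or_apostrophes(a, b))
```

Pairs flagged this way still need a check against signs or parcel data
before either source is changed.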
> The matches were also based on distance. So, if there are "First Street"
> and "First Avenue" in the same town, yet, they are not both within 10
> nearest streets to a given point, they will not be mixed.
Sounds good, but I meant if there is street/avenue confusion, that's a
real issue to be sorted out. But fine to defer it for another pass. I
really would not expect much of this.
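
For reference, the "10 nearest streets" restriction described above could
look roughly like the sketch below. This is a simplification under stated
assumptions: each street is reduced to one representative coordinate in a
projected (planar) system, whereas real street geometry is a line, and the
function names are hypothetical.

```python
import math


def k_nearest_streets(point, streets, k=10):
    """Return the k (name, (x, y)) entries closest to point.

    Assumes planar coordinates and one representative point per street.
    """
    return sorted(streets, key=lambda s: math.dist(point, s[1]))[:k]


def candidate_names(point, streets, k=10):
    """Names an address point may be compared against."""
    return [name for name, _ in k_nearest_streets(point, streets, k)]


if __name__ == "__main__":
    streets = [
        ("First Street", (10.0, 0.0)),
        ("Oak Avenue", (20.0, 5.0)),
        ("First Avenue", (5000.0, 5000.0)),
    ]
    # With k=2, the distant "First Avenue" is never considered.
    print(candidate_names((0.0, 0.0), streets, k=2))
```

Where both "First Street" and "First Avenue" do fall within the nearest k,
the distance filter no longer separates them, and that is the case worth a
later pass.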