[Talk-us-massachusetts] A simple check for addresses before the import

Greg Troxel gdt at lexort.com
Tue Aug 7 02:37:22 UTC 2018


On August 7, 2018 1:40:07 AM UTC, Yury Yatsynovich <yury.yatsynovich at gmail.com> wrote:
>Greg,
>I've added csv-files with OSMID for buildings
>(https://mega.nz/#!S1UVUQYQ!gcnqpley11s7T1ltty1gvw61oJwEjAusBtD9KVZW8EI)
>and points
>(https://mega.nz/#!flMH0S4R!tkUMCQiUPkm527-SAfA92UiQ_Hb_N64CiXNcvc0ijaQ)
>for which addr:street and actual nearby streets do not match. Again, the
>problems might be with either the street names or the addr:street values
>of the points/buildings.
>These points/buildings can be loaded directly into JOSM using their OSMID
>(File - Download object - Object ID), or even in openstreetmap.org
>(openstreetmap.org/node/*OSMID*).
>As for MassGIS points, they are in a shp-file, so opening them would
>require some GIS software anyway (JOSM, QGIS or anything else).
>
>Alan,
>I've also split the results for MassGIS addresses into counties-towns
>(folder with the resulting shp-files:
>https://mega.nz/#F!ToFnGI6C!jcjnjc3753w4DcfaSLLAJg), so that anyone
>interested can go over their own town; it would also be easier to keep
>track of corrections.
>To avoid duplicating effort, maybe we can create a shared spreadsheet
>with a list of towns (350), where anyone who is working on or has
>corrected a county-town can enter a note in the cell next to it? For
>instance:
>https://docs.google.com/spreadsheets/d/1BRMv2iwsg7ZMUiVwtP9JUD5xO8s98ucfVY_1F3DJDfc/edit?usp=sharing
>
>On Mon, Aug 6, 2018, 1:15 PM Alan & Ruth Bragg <alan.ruth.bragg at gmail.com> wrote:
>
>> Yury,
>> The 435 lines of your "simple code" certainly produced some great
>> information.
>>
>> I zipped and downloaded Middlesex, buildings and points
>> <https://photos.app.goo.gl/zYVLBoZCftwaDe9j9>
>> It's interesting to me that all the files for each set must be downloaded
>> in order for JOSM to open the shape file.
>>
>> I opened all 3 shape files and am reviewing the data, stepping through the
>> layers using the carto overlay to orientate myself.
>>
>> Bedford is pretty clean and I recognize that the OSM database you used is
>> from a few days ago. I can see errors where I have recently corrected OSM.
>> Simple things like the spelling of a road name.
>>
>> Do you have a suggestion how I can flag the data that are not really a
>> problem so we won't have to review it again when another bump is created?
>>
>> We're also going to need a way to not step on each other's work.
>> I'll take care of all the Bedford data.
>>
>> Alan
>>
>>
>> On Mon, Aug 6, 2018 at 9:29 AM Yury Yatsynovich <yury.yatsynovich at gmail.com> wrote:
>>
>>> Greetings!
>>>
>>> I've recently written some simple code (see lines 107-202 in
>>> https://github.com/yyatsyn/MassGIS-address-import/blob/master/import_addresses_work_in_progress.py)
>>> that looks for the nearest 7 streets for each address point (or each
>>> building with address information) and marks this point/building as
>>> problematic if none of the names of the 7 streets matches the
>>> addr:street tag value for the point/building.
>>> I've done this check for points/buildings that are already in OSM as well
>>> as those that are in the MassGIS database of addresses.
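The check described above could be sketched roughly as follows. This is not the code from the linked script, only a minimal illustration under stated assumptions: coordinates are planar (already projected), each street is a (name, polyline) pair with at least two vertices, "nearest 7 streets" means the 7 nearest distinct street names, and names are compared after upper-casing. The function names are hypothetical.

```python
import heapq
import math


def point_segment_distance(p, a, b):
    """Planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_street(p, polyline):
    """Distance from p to a polyline (assumed to have >= 2 vertices)."""
    return min(point_segment_distance(p, a, b)
               for a, b in zip(polyline, polyline[1:]))


def nearest_street_names(p, streets, k=7):
    """Return up to k distinct street names, nearest first.

    streets is a list of (name, polyline) pairs; several ways may
    share one name, so the smallest distance per name is kept.
    """
    best = {}
    for name, polyline in streets:
        key = name.strip().upper()
        d = distance_to_street(p, polyline)
        if key not in best or d < best[key]:
            best[key] = d
    return heapq.nsmallest(k, best, key=best.get)


def is_problematic(addr_street, p, streets, k=7):
    """True if addr_street matches none of the k nearest street names."""
    return addr_street.strip().upper() not in nearest_street_names(p, streets, k)
```

A brute-force scan like this is quadratic in practice; for a statewide dataset a spatial index would be needed, which is left out here to keep the idea visible.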
>>>
>>> The resulting shape files are stored in
>>> https://mega.nz/#F!75M1CAAJ!8r63YpTy3HIACDcAUO4c2g (make sure you
>>> download all files with the same names to be able to open the
>>> corresponding .shp-file):
>>> -- problem_pnt_addr.shp and problem_bld_addr.shp -- have points/buildings
>>> that are already in OSM
>>> -- *COUNTY*_problem_mgis.shp -- have points from MassGIS (split by
>>> counties).
>>>
>>> Most of the problems with MassGIS are from relatively small mismatches in
>>> street names (e.g. MassGIS has addresses with "MEDOUIE CREEK ROAD", while
>>> in OSM it is just "MEDOUIE CREEK", or "HELLER WAY" vs "HELLERS WAY", or
>>> "TENNESSEE AVENUE" vs "TENNESSE AVENUE").
>>>
>>> I guess I may also add some fuzzy matching mechanism to the code (so
>>> that "TENNESSEE AVENUE" and "TENNESSE AVENUE" would be considered the same)
>>> in order to separate those MassGIS addresses that are definitely located in
>>> the wrong places (those MassGIS points for which addr:street is not even
>>> somewhat similar to the names of nearby OSM streets) from points that are
>>> next to a street with a misspelled name.
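One possible way to do the fuzzy matching mentioned above is the standard library's difflib. The sketch below is only an illustration, not the mechanism actually added to the script: the function names and the three labels are hypothetical, and the 0.85 threshold is an arbitrary assumption that would need tuning against real data.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity ratio in [0, 1] between two street names."""
    return SequenceMatcher(None, a.strip().upper(), b.strip().upper()).ratio()


def is_similar(a, b, threshold=0.85):
    """True if the names are close enough to be a likely misspelling.

    The threshold is an assumed value, not one taken from the import code.
    """
    return similarity(a, b) >= threshold


def classify(addr_street, nearby_names, threshold=0.85):
    """Label an address by how its addr:street relates to nearby streets.

    Returns "match" for an exact (case-insensitive) hit, "probable
    misspelling" if some nearby name is similar, and "mismatch" otherwise.
    """
    wanted = addr_street.strip().upper()
    names = [n.strip().upper() for n in nearby_names]
    if wanted in names:
        return "match"
    if any(is_similar(wanted, n, threshold) for n in names):
        return "probable misspelling"
    return "mismatch"
```

With this threshold, "TENNESSEE AVENUE" vs "TENNESSE AVENUE" and "HELLER WAY" vs "HELLERS WAY" count as similar. A dropped suffix such as "MEDOUIE CREEK ROAD" vs "MEDOUIE CREEK" scores lower, so cases like that may need a lower threshold or separate handling of street-type suffixes.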
>>>
>>> If there are mismatches in names of streets in OSM and MassGIS, how do we
>>> figure out which source is right?
>>>
>>> As far as I know, some OSM apps (MAPS.ME, 7 ways) need addr:street and
>>> the name of the highway to match exactly in order to convert and properly
>>> search over the address data. So, before we continue with importing, shall
>>> we correct all mismatches in the existing points/buildings with addr:street
>>> and misspelled streets?
>>>
>>> Best,
>>> --
>>> Yury Yatsynovich
>>>
>>

It seems like the MAD data comes from the towns, and if we find errors there is an address authority per town. For my town it's the Town Clerk, somebody I know. And streets in my town with discrepancies I can easily visit.