[Talk-us-massachusetts] A simple check for addresses before the import

Yury Yatsynovich yury.yatsynovich at gmail.com
Tue Aug 7 01:40:07 UTC 2018


Greg,
I've added CSV files with the OSM IDs of the buildings (
https://mega.nz/#!S1UVUQYQ!gcnqpley11s7T1ltty1gvw61oJwEjAusBtD9KVZW8EI) and
points (
https://mega.nz/#!flMH0S4R!tkUMCQiUPkm527-SAfA92UiQ_Hb_N64CiXNcvc0ijaQ) for
which addr:street does not match any of the nearby street names. Again, the
problem might be with either the street names or the addr:street values
of the points/buildings.
These points/buildings can be loaded directly into JOSM by their OSM ID
(File - Download object - Object ID), or viewed on openstreetmap.org (
openstreetmap.org/node/*OSMID* for points; buildings are usually mapped as
ways, so openstreetmap.org/way/*OSMID*).
As for the MassGIS points, they are in shapefiles, so opening them requires
some GIS software anyway (JOSM, QGIS or anything else).
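
For going through the CSV files in a browser, here is a small sketch that
turns a list of IDs into openstreetmap.org links. The column name "osm_id"
and the function name are assumptions for illustration; adjust them to the
actual header of the CSV files.

```python
import csv
import io

VALID_TYPES = {"node", "way", "relation"}

def osm_urls(csv_text, element_type, id_column="osm_id"):
    """Build openstreetmap.org URLs from CSV text holding one OSM ID per row.

    element_type is "node" for points and usually "way" for buildings.
    id_column is a hypothetical header name, not taken from the real files.
    """
    if element_type not in VALID_TYPES:
        raise ValueError(f"unknown OSM element type: {element_type}")
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        f"https://www.openstreetmap.org/{element_type}/{row[id_column].strip()}"
        for row in reader
    ]
```

Calling osm_urls(open("points.csv").read(), "node") would give one link per
row (the file name here is only a placeholder).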

Alan,
I've also split the results for the MassGIS addresses by county and town
(folder with the resulting shapefiles:
https://mega.nz/#F!ToFnGI6C!jcjnjc3753w4DcfaSLLAJg), so that anyone
interested can go over their own town; it should also make it easier to
keep track of corrections.
To avoid duplicating effort, maybe we can create a shared spreadsheet
with a list of the towns (350), where anyone who is working on or has
corrected a town can enter a note in the cell next to it? For
instance:
https://docs.google.com/spreadsheets/d/1BRMv2iwsg7ZMUiVwtP9JUD5xO8s98ucfVY_1F3DJDfc/edit?usp=sharing

On Mon, Aug 6, 2018, 1:15 PM Alan & Ruth Bragg <alan.ruth.bragg at gmail.com>
wrote:

> Yury,
> The 435 lines of your "simple code" certainly produced some great
> information.
>
> I zipped and downloaded Middlesex, buildings and points
> <https://photos.app.goo.gl/zYVLBoZCftwaDe9j9>
> It's interesting to me that all the files for each set must be downloaded
> in order for JOSM to open the shape file.
>
> I opened all 3 shape files and am reviewing the data, stepping through the
> layers using the carto overlay to orientate myself.
>
> Bedford is pretty clean and I recognize that the OSM database you used is
> from a few days ago. I can see errors where I have recently corrected OSM.
> Simple things like the spelling of a road name.
>
> Do you have a suggestion for how I can flag the data that are not really a
> problem, so we won't have to review them again when another dump is created?
>
> We're also going to need a way to not step on each other's work.
> I'll take care of all the Bedford data.
>
> Alan
>
>
> On Mon, Aug 6, 2018 at 9:29 AM Yury Yatsynovich <
> yury.yatsynovich at gmail.com> wrote:
>
>> Greetings!
>>
>> I've recently written some simple code (see lines 107-202 in
>> https://github.com/yyatsyn/MassGIS-address-import/blob/master/import_addresses_work_in_progress.py)
>> that finds the 7 nearest streets
>> for each address point (or each building with address information) and
>> marks the point/building as problematic if none of the names of those 7
>> streets matches its addr:street tag value.
>> I've run this check both for points/buildings that are already in OSM
>> and for those in the MassGIS address database.
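
A minimal sketch of the check described above, not the code from the linked
repository. It assumes planar (projected) coordinates, streets given as
(name, list of vertices) pairs, and a case-insensitive comparison of names;
the function names and the data layout are hypothetical.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b, all in planar coordinates."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_street(p, vertices):
    """Shortest distance from p to a street given as a list of vertices."""
    if len(vertices) == 1:
        return math.hypot(p[0] - vertices[0][0], p[1] - vertices[0][1])
    return min(
        point_segment_distance(p, a, b)
        for a, b in zip(vertices, vertices[1:])
    )

def nearest_street_names(p, streets, k=7):
    """Names of the k nearest streets; ways sharing a name count once."""
    best = {}
    for name, vertices in streets:
        d = distance_to_street(p, vertices)
        if name not in best or d < best[name]:
            best[name] = d
    return sorted(best, key=best.get)[:k]

def is_problematic(addr_street, p, streets, k=7):
    """True if addr_street matches none of the k nearest street names."""
    nearby = {n.strip().casefold() for n in nearest_street_names(p, streets, k)}
    return addr_street.strip().casefold() not in nearby
```

Whether ways that share a name should count once or separately towards the
7 is a design choice; this sketch counts each distinct name once. For real
data a spatial index would be needed instead of the full scan over streets.
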
>>
>> The resulting shapefiles are stored in
>> https://mega.nz/#F!75M1CAAJ!8r63YpTy3HIACDcAUO4c2g (make sure you download
>> all files with the same name to be able to open the corresponding .shp file):
>> -- problem_pnt_addr.shp and problem_bld_addr.shp contain points/buildings
>> that are already in OSM
>> -- *COUNTY*_problem_mgis.shp contain points from MassGIS (split by
>> county).
>>
>> Most of the problems with the MassGIS data come from relatively small
>> mismatches in street names (e.g. MassGIS has addresses with "MEDOUIE CREEK
>> ROAD", while in OSM it is just "MEDOUIE CREEK", or "HELLER WAY" vs "HELLERS
>> WAY", or "TENNESSEE AVENUE" vs "TENNESSE AVENUE").
>>
>> I could also add some fuzzy matching to the code (so
>> that "TENNESSEE AVENUE" and "TENNESSE AVENUE" would be considered the same)
>> in order to separate the MassGIS addresses that are definitely located in
>> the wrong places (those for which addr:street is not even
>> somewhat similar to the names of nearby OSM streets) from points that are
>> next to a street with a misspelled name.
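
One possible way to do the fuzzy matching mentioned above, sketched with
difflib from the Python standard library. The 0.8 threshold, the helper
names and the three result labels are assumptions for illustration; the
threshold would need tuning against the actual mismatches.

```python
from difflib import SequenceMatcher

def normalize(name):
    """Upper-case and collapse whitespace so names compare consistently."""
    return " ".join(name.upper().split())

def similar(a, b, threshold=0.8):
    """True if the two street names are close according to difflib's ratio."""
    ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return ratio >= threshold

def classify(addr_street, nearby_names, threshold=0.8):
    """Sort an address into one of three buckets relative to nearby streets."""
    target = normalize(addr_street)
    names = [normalize(n) for n in nearby_names]
    if target in names:
        return "match"
    if any(similar(target, n, threshold) for n in names):
        return "probable misspelling"
    return "no similar street nearby"
```

With this threshold, classify("TENNESSEE AVENUE", ["Tennesse Avenue"]) falls
into "probable misspelling", while an address whose nearby streets have
unrelated names ends up in "no similar street nearby" and is a candidate for
being located in the wrong place.
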
>>
>> If there are mismatches in names of streets in OSM and MassGIS, how do we
>> figure out which source is right?
>>
>> As far as I know, some OSM apps (MAPS.ME, 7 ways) need addr:street and
>> the name of the highway to match exactly in order to convert the address
>> data and search over it properly. So, before we continue with the import,
>> shall we correct all mismatches between addr:street on the existing
>> points/buildings and the (possibly misspelled) street names?
>>
>> Best,
>> --
>> Yury Yatsynovich
>>
>