[Imports] Update: New York State GIS SAM Address Points Import

Skyler Hawthorne osm at dead10ck.com
Sat Jan 23 03:55:59 UTC 2021


Hello everyone,

This is regarding this import: https://wiki.openstreetmap.org/wiki/New_York/NYS_GIS_SAM_Address_Points_Import

My importer program is now in a state I consider satisfactory, so I've started importing addresses in Rensselaer County. So far I've done 4 of the tasks in the Tasking Manager:

https://tasks.openstreetmap.us/projects/231/

As of this writing, I've imported or updated a total of 2962 addresses. If you care to see some of the data, you can look at some here:

https://overpass-turbo.eu/s/12H1

Doing the import manually like this has already turned up a bug in the matching logic. Previously, if the importer could not find an existing address matching all three of addr:housenumber, addr:street, and addr:unit/flats, it would fall back to looking for something with *only* a matching addr:unit/flats value, and skip the address if one was found. In practice, this led to false positives where lots of buildings near each other just so happen to share the exact same addr:unit/flats values. It was flagging things for review because of data I had just imported! So for now, I've removed the fallback: the importer no longer matches on addr:unit/flats alone. After doing some searching, I was only able to find a handful of examples in New York where people had mapped flats/units like this, so I don't think removing this check will cause significant problems.
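To make the difference concrete, here is a rough Python sketch of the check with and without the unit-only fallback. It is not the importer's actual code: the function names, the `unit_only_fallback` flag, and the plain tag dictionaries are all made up for illustration.

```python
from typing import Optional

# Either key may carry the unit information on an existing OSM element.
UNIT_KEYS = ("addr:unit", "addr:flats")


def unit_value(tags: dict) -> Optional[str]:
    """Return the addr:unit or addr:flats value, whichever is present."""
    for key in UNIT_KEYS:
        if key in tags:
            return tags[key]
    return None


def find_existing(candidate: dict, nearby: list,
                  unit_only_fallback: bool = False) -> Optional[dict]:
    """Look for a nearby element that already carries the candidate address.

    `nearby` is a list of tag dictionaries for elements close to the
    candidate. With unit_only_fallback=True the old, buggy behavior is
    reproduced: a nearby element with the same unit/flats value counts
    as a match even when housenumber and street differ.
    """
    # Full match: housenumber, street, and unit/flats must all agree.
    for tags in nearby:
        if (tags.get("addr:housenumber") == candidate.get("addr:housenumber")
                and tags.get("addr:street") == candidate.get("addr:street")
                and unit_value(tags) == unit_value(candidate)):
            return tags

    # Old fallback (now removed): match on unit/flats alone.
    if unit_only_fallback and unit_value(candidate) is not None:
        for tags in nearby:
            if unit_value(tags) == unit_value(candidate):
                return tags

    return None
```

With two neighboring buildings that both have a unit "A", the fallback treats the freshly imported one as an existing copy of the other, while the full match alone does not:

```python
imported = {"addr:housenumber": "10", "addr:street": "Main Street", "addr:unit": "A"}
candidate = {"addr:housenumber": "12", "addr:street": "Main Street", "addr:unit": "A"}

find_existing(candidate, [imported], unit_only_fallback=True)  # false positive
find_existing(candidate, [imported])                           # no match
```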

The other thing I'm seeing many occurrences of is discrepancies between the street names in the New York State data and those from TIGER. There were also more than a few instances of the street name from TIGER being abbreviated, like "St Josephs Ct". I've been cross-referencing the existing OSM street names from TIGER with the online map viewer for NYS Streets, and whenever I see a discrepancy, I've been favoring the state data and correcting the street name.
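I've been doing this comparison by eye, but the kind of check involved could be sketched roughly as below. This is only an illustration under my own assumptions: the abbreviation list is a small made-up sample, and the helper names are hypothetical, not part of the importer.

```python
# A small, deliberately incomplete sample of street-type abbreviations.
SUFFIX_ABBREVIATIONS = {"st", "ct", "rd", "ave", "dr", "ln", "blvd"}


def looks_abbreviated(name: str) -> bool:
    """True if the last word of the street name is a known abbreviation."""
    tokens = name.split()
    if not tokens:
        return False
    return tokens[-1].rstrip(".").lower() in SUFFIX_ABBREVIATIONS


def names_differ(osm_name: str, state_name: str) -> bool:
    """Compare two street names, ignoring case and extra whitespace."""
    def normalize(s: str) -> str:
        return " ".join(s.split()).casefold()
    return normalize(osm_name) != normalize(state_name)
```

For example, `looks_abbreviated("St Josephs Ct")` is true, so that way would be worth checking against the state data by hand. A leading "St" is left alone on purpose, since it could mean either "Saint" or "Street".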

Other issues with the data I'm seeing include things like duplicate POIs inside a building that both have the full address information.

As I've said in earlier posts, at this early stage, I'm mostly looking for bugs in my importer, and for larger issues with the data that would be good to consider when weighing whether to import the whole data set en masse automatically. With the exception of the addr:unit/flats matching issue, which has been fixed, the issues I've seen so far don't seem like huge problems. I'll keep everyone updated as I do more of these manually.

--
Skyler