[Imports] [Imports-us] Draft proposal for import of New York State GIS SAM Address Points
Skyler Hawthorne
osm at dead10ck.com
Thu Jan 21 16:40:24 UTC 2021
Also, just another thought to add: I was considering the case where multiple address points fall inside a single large building, such as a retail building, where each point represents a unit.
In these cases, the present behavior is to add the full address with all tags to each node, so we end up with several nodes that all carry the same address, with only the unit differentiating them.
I was wondering whether it would be desirable to detect when all the address points inside a building share the same primary address and, in that case, put the primary address tags on the building and leave only addr:unit on the points inside.
What do folks think of this?
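
To make the idea concrete, here is a rough Python sketch (not the importer's actual code). The function name conflate_units and the plain-dict representation of tags are assumptions made for illustration only. It moves the shared addr:* tags onto the building only when every interior point has an addr:unit, all points agree on every other addr:* tag, and the building has no conflicting value of its own; otherwise it leaves everything untouched.

```python
def conflate_units(building_tags, points):
    """Hypothetical helper: move a shared primary address onto the building.

    building_tags: dict of the building's existing tags.
    points: list of tag dicts, one per address point inside the building.
    Returns (building_tags, points), either conflated or unchanged copies.
    """
    unchanged = (dict(building_tags), [dict(p) for p in points])
    if not points:
        return unchanged

    def primary(tags):
        # Every addr:* tag except the unit counts as the primary address.
        return {k: v for k, v in tags.items()
                if k.startswith("addr:") and k != "addr:unit"}

    shared = primary(points[0])
    all_same = all("addr:unit" in p and primary(p) == shared for p in points)
    # Never overwrite an address value the building already carries.
    conflicts = any(building_tags.get(k, v) != v for k, v in shared.items())
    if not shared or not all_same or conflicts:
        return unchanged

    merged = {**building_tags, **shared}
    # Points keep addr:unit and any non-address tags (e.g. an ID tag).
    stripped = [{k: v for k, v in p.items() if k not in shared} for p in points]
    return merged, stripped
```

For example, two points at 10 Main Street with units A and B would leave the building tagged with addr:housenumber=10 and addr:street=Main Street, and each point with only its addr:unit.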
Jan 21, 2021 00:31:58 Skyler Hawthorne <osm at dead10ck.com>:
> Hey guys, just as an update, I haven't yet imported any data into OSM, as I've been working on a few things as a result of our conversation. I have made the following changes:
>
> * It will check all addr:* fields of existing elements. Any that are missing are added, and any that conflict are marked for review in a new node.
>
> * addr:flats is now checked for equivalence when determining whether to skip an address point.
>
> * It now checks whether a way with the same street name exists nearby, and marks the address for review if it does not. I figure I'll deal with any noise this creates if it happens.
>
> * The program now takes an arbitrary bounding polygon, instead of a box. This was to facilitate more direct usage of the Tasking Manager tasks.
>
> * I've added a script to fetch the coordinates of a particular task's bounding polygon. It can be plugged right into an invocation, e.g. to fetch the bounding polygon of project 231 task 66:
>
> nys-gis-sam-import-rs SAM_Master_Statewide_Database.gdb -o https://overpass.kumi.systems/api/interpreter -t 50 -b $(./fetch_tasker_polygon 231 66) > .local/66.osc
>
> * I've made a HOT Tasking Manager project for Rensselaer County: https://tasks.openstreetmap.us/projects/231. I plan on making the rest of the counties after my initial import period. Unfortunately it seems the Tasking Manager has a max size of 5000 km², so it is not possible to make one project for the whole state.
>
> Regarding my issue with what to do with conflicting address information, I've given it some thought. When I move forward with importing data, unless it's really obvious who is right, I am going to leave conflicting address points in place, along with the nysgissam:review tag. The existing data could be right, the state data could be right, or both could be right, but there's no way to know without actually visiting the place on foot. I think I'd like to err on the side of adding potentially useful data.
>
> Regarding the tag variations of "matched" vs "imported" address point IDs, after some consideration, I think the variation may not be adding a whole lot of information. If an nysgissam:nysaddresspointid exists on an element, it's there either because it's a new address point node, or because the element previously existed with the same addr:housenumber and addr:street, but was missing one of the other addr:* fields.
>
> However, one thought I've been wrestling with is what to do about the IDs for large apartment buildings. Because of the 255 character limit on tag values, I've been falling back to leaving the points as individual overlapping points when there are too many to fit all the IDs into the nysgissam:nysaddresspointid tag. In practice, the ID list is turning out to be too long even for modestly sized apartment buildings, like 25 units or more. I'd really like to be able to collapse these points to a single addr:flats when possible, especially when there are hundreds of units. However, I'd also really like to keep all the IDs, as I think this will make updates a lot easier in the future.
>
> So I came up with an idea: when there are too many IDs to fit into a single nysgissam:nysaddresspointid tag, I could "spill them over" into multiple tags, like nysgissam:nysaddresspointid:2, nysgissam:nysaddresspointid:3, etc., until all the IDs fit. It's kind of hacky, and mappers may be confused about what in the world all these tags are, but it gets the job done, and still leaves in place a straightforward way to update the data in the future (by querying for an address point ID using a pattern for the key name).
>
> Does anyone have any feelings against this?
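
As a rough illustration of the spill-over idea from the quoted message, here is a Python sketch (not the importer's actual code). The ";" separator, the helper names spill_ids and collect_ids, and the exact ":2", ":3" suffix numbering are assumptions for the example; only the base key and the 255 character limit come from the discussion above.

```python
import re

BASE_KEY = "nysgissam:nysaddresspointid"
MAX_LEN = 255  # OSM limit on the length of a tag value


def spill_ids(ids, base_key=BASE_KEY, max_len=MAX_LEN, sep=";"):
    """Pack IDs into as few tags as needed, each value at most max_len long."""
    chunks = [[]]
    length = 0  # length of the current chunk once joined with sep
    for raw in ids:
        value = str(raw)
        if len(value) > max_len:
            raise ValueError(f"ID too long for a single tag value: {value!r}")
        extra = len(value) + (len(sep) if chunks[-1] else 0)
        if chunks[-1] and length + extra > max_len:
            # Current tag is full, so start the next spill-over tag.
            chunks.append([value])
            length = len(value)
        else:
            chunks[-1].append(value)
            length += extra
    tags = {}
    for i, chunk in enumerate(chunks, start=1):
        if chunk:
            key = base_key if i == 1 else f"{base_key}:{i}"
            tags[key] = sep.join(chunk)
    return tags


def collect_ids(tags, base_key=BASE_KEY, sep=";"):
    """Read the IDs back by matching the key name against a pattern."""
    pattern = re.compile(rf"^{re.escape(base_key)}(?::(\d+))?$")
    found = []
    for key, value in tags.items():
        match = pattern.match(key)
        if match:
            found.append((int(match.group(1) or 1), value))
    return [id_ for _, value in sorted(found) for id_ in value.split(sep)]
```

Reading the IDs back with a key pattern, as collect_ids does, is what keeps future updates straightforward: an updater does not need to know in advance how many spill-over tags an element has.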