[Imports-us] Fwd: Vermont, U.S. address import
Jared
osm at wuntu.org
Fri Oct 7 14:30:13 UTC 2022
Elliott or Greg,
Can you walk me through a real example so I can understand how you would
identify existing addresses?
Let's take Addison, Vermont for example.
The VCGI e911 dataset has 987 address points in Addison. Here's the data
file:
https://github.com/JaredOSM/vermont-address-import/blob/main/town_e911_address_points/e911_address_points_addison.geojson
When I run an overpass query for all elements in Addison that have a
housenumber or street: https://overpass-turbo.eu/s/1mxX
I find that there are already a total of 142 nodes and ways with address
information in OSM.
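For anyone who can't open the overpass-turbo link, here is a rough sketch of
the kind of query I mean, written as a small Python helper that builds the
Overpass QL text. This is not the exact saved query. The function name, the
area selector (name plus admin_level=8) and the timeout are my assumptions,
and a name match alone may pick up other places called Addison, so the area
filter would likely need narrowing.

```python
def build_address_query(town_name: str, admin_level: str = "8") -> str:
    """Return Overpass QL for nodes and ways carrying address tags in a town.

    Assumption: the town is mapped as an area with this name and admin_level.
    """
    return f"""[out:json][timeout:60];
area["name"="{town_name}"]["admin_level"="{admin_level}"]->.town;
(
  node["addr:housenumber"](area.town);
  way["addr:housenumber"](area.town);
  node["addr:street"](area.town);
  way["addr:street"](area.town);
);
out tags center;"""


if __name__ == "__main__":
    # Paste the printed text into overpass-turbo or send it to an Overpass API.
    print(build_address_query("Addison"))
```

The union means an element with both tags is only returned once, so the
element count is the number of distinct nodes and ways with address data.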
By looking at the overpass results, I can immediately see that 55 of the
existing OSM elements have a "ref:vcgi:esiteid" Key/Value pair. Without
any further queries, I have a high level of confidence that I can remove
all 55 address points from my import file, as they are not even
worth considering for an automated import. This seems like a safe and
efficient way of eliminating the chance of importing duplicate data.
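To make that step concrete, here is a minimal sketch of how I picture it,
assuming the Overpass result is exported as JSON and the import file is the
GeoJSON above. The function names are mine, and "ESITEID" as the property name
in the VCGI file is an assumption that should be checked against the data.

```python
def existing_esiteids(overpass_json: dict) -> set:
    """Collect every ref:vcgi:esiteid value present in an Overpass JSON result."""
    ids = set()
    for element in overpass_json.get("elements", []):
        value = element.get("tags", {}).get("ref:vcgi:esiteid")
        if value:
            ids.add(str(value).strip())
    return ids


def drop_already_imported(features: list, known_ids: set, id_field: str = "ESITEID"):
    """Split GeoJSON features into (kept, skipped) by whether OSM has the ID.

    id_field is an assumed property name; adjust it to match the source file.
    """
    kept, skipped = [], []
    for feature in features:
        raw = feature.get("properties", {}).get(id_field)
        site_id = str(raw).strip() if raw is not None else ""
        if site_id and site_id in known_ids:
            skipped.append(feature)
        else:
            kept.append(feature)
    return kept, skipped
```

Features without an ID stay in the "kept" list, so nothing is dropped unless
its esiteid is actually present in OSM.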
Obviously the other data points need to be evaluated, but why not remove
the 55 for which I have high confidence?
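For the points that do not carry the esiteid tag, this is roughly the kind of
comparison I imagine, as a sketch only. The 50 m distance, the 0.9 similarity
threshold, the suffix table and the dict field names are all assumptions I
picked for illustration, not values anyone has agreed on.

```python
import math
from difflib import SequenceMatcher

# Assumed abbreviation table; "st" can also mean "saint", but both sides are
# normalized the same way, so the comparison still lines up.
SUFFIXES = {"st": "street", "rd": "road", "ave": "avenue", "ln": "lane", "dr": "drive"}


def normalize_street(name: str) -> str:
    """Lowercase, drop periods, and expand common street abbreviations."""
    tokens = name.lower().replace(".", "").split()
    return " ".join(SUFFIXES.get(token, token) for token in tokens)


def distance_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle (haversine) distance in meters."""
    radius = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def is_probable_duplicate(src: dict, osm: dict, max_m: float = 50.0, min_ratio: float = 0.9) -> bool:
    """True when housenumber matches, street names are similar, and points are close."""
    if src["housenumber"].strip().lower() != osm["housenumber"].strip().lower():
        return False
    ratio = SequenceMatcher(
        None, normalize_street(src["street"]), normalize_street(osm["street"])
    ).ratio()
    if ratio < min_ratio:
        return False
    return distance_m(src["lat"], src["lon"], osm["lat"], osm["lon"]) <= max_m
```

Anything this flags would still want a human look before being dropped from
the import file.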
Thanks for helping walk me through how you would approach it, or explain
why my technique could be flawed.
Jared
On Fri, Oct 7, 2022 at 9:50 AM Elliott Plack <elliott.plack at gmail.com>
wrote:
> Jared,
>
> This looks great! I want to thank you for the due diligence. The process
> looks sound.
>
> I do agree about the source tags on the nodes, they may not be as
> reliable. In my experience I check the editor/history of a node for
> authority and if I saw it was made via an import account, I might hold it
> to a different standard--not a bad thing. If you are concerned about
> downstream querying of previously imported addresses, you can query out
> things from the import using the changeset (keep a record), user, or
> version with overpass. I'd recommend looking at that option.
>
> Otherwise I applaud the effort.
>
> Thanks,
>
> Elliott Plack
>
>
> On Fri, Oct 7, 2022 at 8:33 AM Greg Troxel <gdt at lexort.com> wrote:
>
>>
>> Jared <osm at wuntu.org> writes:
>>
>> > On Thu, Oct 6, 2022 at 8:13 AM Greg Troxel <gdt at lexort.com> wrote:
>> >
>> >> You are going to have to deal with matching addresses between import
>> >> source and OSM programmatically like in #1 above, once you move beyond
>> >> non-addressed towns. Once you do that, the ref won't help, as it won't
>> >> be 100% reliable. Therefore it's noise.
>> >
>> > I was thinking of using the foreign key for a different use case. I agree
>> > that relying on this key for *overwriting* OSM data does not seem safe.
>> > The scenario I'm thinking about is for NEW addresses that are added to the
>> > VCGI dataset. To determine if a NEW VCGI e911 address exists in OSM, the
>> > "ref:vcgi:esiteid" tag would seem to be very helpful. If an address in OSM
>> > already has that unique esiteid key, then we can be confident that it
>> > should be skipped. If the esiteid does not exist in OSM, then other
>> > signals should be evaluated (housenumber, streetname, lat/long, etc.), but
>> > those can be less precise due to misspellings or slightly different
>> > coordinates.
>>
>> I see where you're going but I think you need to get the fuzzy match
>> right anyway and it's not going to help that much to have a key.
>>
>> > I'd like to hear the negative impact a foreign key causes. There are other
>> > similar foreign keys (e.g. wikidata, wikipedia) and I've never found them to
>> > be detrimental to my work, but don't want to cause issues for others. The
>> > 55,000 VT addresses that have been added using the Esri layer in the RapiD
>> > editor include this "ref:vcgi:esiteid" key, and I've found it to be useful.
>>
>> A fair question, and it may be that the RapiD stuff is out of line.
>>
>> I don't think the foreign keys really hurt. I just think that the
>> history is that they are less useful than everybody thinks they are going
>> to be.
>>
>> >> Wow. Are you saying that apartment buildings have coordinates of entry
>> >> doors within the building, or that they are artificially skewed to make
>> >> rendering non-overlapping, or ? Surely Vermont has at least some
>> >> multi-floor apartment buildings that have the same floor design and thus
>> >> multiple units that actually do have the same horizontal coordinates.
>> >
>> > I've asked my contact at VCGI for clarification on how multi-tenant
>> > buildings are addressed. From what I've seen, some multi-tenant buildings
>> > just have one e911 address associated with them. I have seen other
>> > buildings that have multiple addresses, but I've never seen them overlap.
>> > I'll keep a close eye out for this and will see what VCGI has to say. I do
>> > have the VT data in a PostGIS database, but don't have experience using the
>> > GIS functions, so I'll try it out.
>>
>> Sounds good. There are hard questions about datasets and as you can see
>> my bias is to dig in and address them.