[Imports-us] Address and Building Data Conflation

Elliott Plack elliott.plack at gmail.com
Mon Jul 15 19:48:34 UTC 2013

Hey all,

I cannot allot the time to read through all of this at work. But, I'm going
to want to do a similar project in my county, which has about 288395
address points. Perhaps I can join the import discussion later.

On Mon, Jul 15, 2013 at 3:39 PM, Serge Wroclawski <emacsen at gmail.com> wrote:

> On Mon, Jul 15, 2013 at 3:14 PM, Brian H Wilson <brian at wildsong.biz> wrote:
> > Pulling this off the NYC topic and onto my rural project -- ignore if you
> > are only interested in NYC!
> >
> >
> > On 07/15/2013 08:08 AM, Serge Wroclawski wrote:
> >>
> >> 1. How to conflate the NYC Data for building and address data.
> >>
> >> My understanding of the address data is that it should be in the form
> >> of point data, where the building data is polygons.
> >
> >
> > When I was working on this a few weeks ago the message I got was that
> > addresses should always be attached to buildings as tags and that there
> > should not be separate address points. I am now confused. I spent a lot of
> > time trying to come up with a good way to attach addresses to individual
> > buildings. Maybe that was a waste of time?
>
> You're missing the context of that quote. The "should be in the form
> of point data" refers to the form in which NYC would release their data
> (should that actually happen), not a process for OSM.
> That's why I then discussed how to process this into OSM-consumable data.
>
> > Rural areas frequently have many buildings per tax lot. They are all at the
> > same address, yet having each barn and shed tagged with a street number still
> > seems wrong to me. So does trying to find an algorithm to pick out which
> > building polygon is the house (123) and which is the granny unit out back
> > (123 1/2) and which is the barn.
>
> We discussed this on the hangout several weeks ago, and barns and
> other secondary buildings probably shouldn't have addresses on them at
> all.
> > I am beginning to think that trying to pre-process ALL the data into perfect
> > shape before an import is a lofty goal but perhaps means no import will ever
> > happen because there are not many people here and vanishingly few capable
> > volunteers. They are not up to being trained to use JOSM; their eyes glaze
> > over in 30 seconds.
>
> I am not sure what you are saying, but if you are saying that importing
> into OSM in a sustainable way is hard, you're right. What's far worse,
> though, is a bad import that has to be cleaned up later.
>
> > In the past (not OSM) I have dealt with addresses by using the tax lot
> > centroid to create a point layer and then (as time permits) adjusting the
> > point to the appropriate location (sometimes it's the center of the primary
> > building, sometimes it's the driveway entrance to the property, depending on
> > the local fire department since that's who I am working for).
>
> If you have a small number of objects, this may be doable.
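For readers following along, the tax-lot-centroid step described above can be sketched roughly as follows. This is only an illustration: the function name and the plain list-of-(x, y) input are assumptions, not anything used by the people in this thread, and it treats coordinates as planar. Note that the centroid of a concave lot can fall outside the lot, which is one reason the points would still need manual adjustment.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...].

    Uses the shoelace formula. The ring may be open or closed.
    `ring` is a hypothetical input format chosen for this sketch.
    """
    ring = list(ring)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # close the ring so every edge is visited
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon with zero area")
    # Centroid = sum / (6 * A), and A = area2 / 2, hence the factor 3.
    return cx / (3 * area2), cy / (3 * area2)


# Example: a 4 x 2 rectangular lot
lot = [(0, 0), (4, 0), (4, 2), (0, 2)]
centroid = polygon_centroid(lot)
```

Each resulting point would be a first guess at the address location, to be moved later to the primary building or the driveway entrance.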
> The solution that we thought made more sense was, for residences, to
> use the larger structure as the object with the address. We'll assume
> that most people choose to live in the larger structure, and those
> cases where they don't (for example, a greenhouse that's larger than
> the home) we can fix later.
> Obviously for commercial property, this is not going to work as well.
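A minimal sketch of the "larger structure gets the address" heuristic, assuming each building footprint has already been assigned to a lot and has a computed area. The tuple layout and identifiers are hypothetical and only meant to illustrate the rule for residential lots.

```python
def pick_address_buildings(buildings):
    """Return {lot_id: building_id}, choosing the largest footprint per lot.

    `buildings` is an iterable of (building_id, lot_id, area) tuples.
    These field names are assumptions made for this sketch.
    """
    best = {}  # lot_id -> (building_id, area) of the largest seen so far
    for building_id, lot_id, area in buildings:
        if lot_id not in best or area > best[lot_id][1]:
            best[lot_id] = (building_id, area)
    return {lot_id: bid for lot_id, (bid, _) in best.items()}


# Example: one rural lot with a house, barn and shed, plus a second lot
footprints = [
    ("house", "lot1", 180.0),
    ("barn", "lot1", 95.0),
    ("shed", "lot1", 12.0),
    ("home", "lot2", 140.0),
]
chosen = pick_address_buildings(footprints)
```

Lots where the guess is wrong (the large greenhouse case) would still need a manual fix afterwards, as noted above.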
> > It still seems better to me to keep the address numbers separated from the
> > buildings, as per your example when you want to separate two entrances to
> > the same building or in my case when you want to have a large building with
> > 10 addresses. If you put them in as points, and they are not perfect on this
> > first pass, then searching on address will still get you close enough to
> > find the front door and later on anyone can edit them to push them into a
> > better position.
>
> Naked address nodes have their own problems, and those problems are
> worse than the conflation issues, IMHO.
> >
> >> I know our normal process is to check each building polygon and see if
> >> it contains one or more address points. If it's one address point, the
> >> tags will apply to the building. If it's two points, we'll treat each
> >> point as an entrance on the building and tag them separately.
> >
> > What if there are 10 or more addresses for one building? This often happens
> > in my area when there are townhouses or apartments and each one has a
> > separate address. (Not just a unit number)
>
> We can find that out and handle it:
> http://wiki.openstreetmap.org/wiki/Addresses#Buildings_with_multiple_house_numbers
> In fact the examples given cover the exact situation you're envisioning.
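The per-building check quoted in this exchange (one address point inside a polygon tags the building, two become entrances) can be sketched as below. This is a rough illustration, not the import's actual tooling: the function names, action labels, and input shapes are assumptions, and treating three or more points the same way as two is also an assumption, since the thread only spells out the one-point and two-point cases.

```python
def point_in_polygon(pt, ring):
    """Ray-casting test. `ring` is [(x, y), ...], open or closed."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        # Does a horizontal ray going right from pt cross this edge?
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def conflate(buildings, addresses):
    """Classify each building by how many address points fall inside it.

    buildings: {building_id: ring}
    addresses: [(tags_dict, (x, y)), ...]
    Returns {building_id: (action, [tags_dict, ...])}.
    """
    result = {}
    for bid, ring in buildings.items():
        hits = [tags for tags, pt in addresses if point_in_polygon(pt, ring)]
        if not hits:
            action = "no_address"      # leave the building untagged
        elif len(hits) == 1:
            action = "tag_building"    # copy the addr:* tags onto the polygon
        else:
            action = "entrance_nodes"  # keep each point as a tagged entrance
        result[bid] = (action, hits)
    return result
```

A building flagged "entrance_nodes" with many hits (townhouses, apartments) is the case the wiki page linked above covers, and would likely deserve a manual look rather than automatic tagging.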
> > Doesn't having a mix of points and polygons tagged in the same area make
> > things confusing? Maybe this is just because I am new at OSM, so I am not
> > used to having a single heterogeneous data set.
> I think this is a three-part question:
> 1. Is it confusing to me?
> 2. Is it confusing to our tools?
> 3. Is it confusing to other mappers?
> It's certainly not confusing to me. I'd rather all the buildings had
> addr:street and addr:housenumber on them, and then we'd be all set. In
> my mind, an address is an attribute of a building.
> It's not confusing to our tools. Potlatch and iD have preset fields
> for addresses.
> And for other mappers, I think naked nodes are confusing because they
> don't correspond to a physical, observable object. Maybe that's a
> difference between the way a GIS person thinks, and an OSM person.
> > In my region I have never seen data with buildings that already have
> > addresses attached.
> Address data is scarce for the whole world in OSM, and that's why
> we're so focused on it.
> > Where I work (OR|CA|WA), buildings are on tax lots and
> > tax lots have addresses.
> We expect anyone, with no external information, to be able to survey
> an area. Tax lots aren't really surveyable, which is why we're
> generally not in favor of including them on OSM.
> > Are there 5 houses or 50
> > apartments there? It is often ambiguous. It's good enough to get a firetruck
> > to the front door of 1060 but would probably take someone who lives in the
> > building(s) to accurately place individual points.
> >
> > Another good one that crops up is mobile home parks and vacation parks,
> > where there can be 50 slots in one tax lot, each with a separate phony
> > address. The tax assessor's official address is "123 Main St" but the PO
> > still delivers mail to "5 Sunset Village Homes". This can't be dealt with
> > algorithmically because it requires local knowledge.
> I agree that local knowledge and manual survey are the best form of
> data and what we should ideally be collecting.
> But then I think "A million buildings sure is a lot of buildings" and
> would prefer to start with *something*.
> > I probably won't attend the hangout this afternoon because it seemed like I
> > bumped someone else out last time who actually needed to be there, and I
> > have not actually had any time to work on the Benton county import lately.
>
> I don't think this week will be as busy. If we continually bump up
> against the limit, I will look into alternatives.
> - Serge

Elliott Plack