[Imports] NYC building + address import - to merge or not to merge?

Paul Norman penorman at mac.com
Thu Oct 17 00:57:07 UTC 2013

I've been sitting on this message while running some stats, but here it is.

> From: Alex Barth [mailto:alex at mapbox.com] 
> Sent: Monday, October 14, 2013 9:35 AM 
> To: Imports OpenStreetMap.org 
> Subject: [Imports] NYC building + address import - to merge or not to merge? 
> Now there's reason to revisit this decision: the data steward (Colin 
> Reilly from NYC GIS) told me that NYC GIS took great care to place 
> addresses at about where the entrance of the building sits. 

A review of the data shows that this may be true for some buildings and 
addresses but is not true for others. As an example, see 
Some points are near addresses, some are near centroids, and some of 
them are strangely at the back. 

> Here is a comparison of the two options. I'd like to discuss and 
> decide at tonight's imports hangout. 

Note: Alex, Serge and I discussed the import at length tonight.

> ## Option 1: Merge addresses into buildings where possible 
> ## Option 2: Always keep address points separate

There is some repetition in the two sections, so I'm just going to extract and rearrange points. See http://lists.osm.org/pipermail/imports/2013-October/002275.html for the original text.

> a) [points] is the NYC GIS way, making it nicer for GIS folks to use OSM 

GIS folks will have to deal with both so this doesn't really give either 
method an advantage. There *will* be addresses on ways that they will 
have to deal with. Additionally NYC is only part of OSM, so they have to
deal with practices elsewhere anyways.

> a) we lose data [when merging points to buildings]

I'd say the information lost is not significant, given that in many 
buildings the point is just the centroid or a random point within the 
building. It's not consistent enough to rely on for anything, as you've noted:

> Note: it has been suggested to use the address location information to 
> tag an entrance. Unfortunately the data is not consistent enough to do 
> this. 

> b) Not regarding standing practice, merging addresses into buildings 
> is an exception from the generally applicable method of doing separate 
> address points. 
> b) [merging] makes it harder for NYC GIS to leverage OSM 

How so? Keep in mind that NYC GIS will have to deal with

- Addresses collected manually without any import tags
- Addresses on building ways
- Addresses on building ways where neither the address nor the building
  way comes from an import
- Addresses on building ways where OSM has split up a structure
  differently than they have

Consumers other than NYC GIS will be in the same position: they will 
have to handle multiple styles of mapping.
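
As a rough illustration of what "dealing with both" means for a data consumer, here is a minimal Python sketch. The Element class and the address_point function are hypothetical names invented for this example, not part of any OSM library; only the addr:housenumber and addr:street tag keys are real. For ways it uses a plain average of the vertices, which is a simplification and not a true polygon centroid.

```python
from dataclasses import dataclass

@dataclass
class Element:
    kind: str      # "node" or "way" (hypothetical, simplified model)
    tags: dict     # OSM tags, e.g. {"addr:housenumber": "10"}
    coords: list   # [(lon, lat)] for a node; ring vertices for a way

def address_point(el):
    """Return (housenumber, street, lon, lat) for an addressed element, else None."""
    if "addr:housenumber" not in el.tags:
        return None
    pts = el.coords
    # A closed way repeats its first vertex at the end; drop the repeat
    # so it is not counted twice in the average.
    if el.kind == "way" and len(pts) > 1 and pts[0] == pts[-1]:
        pts = pts[:-1]
    lon = sum(p[0] for p in pts) / len(pts)
    lat = sum(p[1] for p in pts) / len(pts)
    return (el.tags["addr:housenumber"], el.tags.get("addr:street"), lon, lat)
```

The same function covers an address node and an address on a building way, so a consumer needs this kind of handling regardless of which style the import picks.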

> a) [Addresses on nodes inside buildings] diverges (but does not 
> violate) common OSM practice 
> a) [Addresses on ways] is how a lot of buildings are done in OSM 

Unfortunately statistics are distorted by imports, but I had a look at similar practices with merging and POIs with name=McDonald's.

Of the 5523 locations which could be merged to building polygons, 3315, or 60%, were. There were another 5205 locations which could not be merged onto a building, either because there were multiple POIs within the building or because there was no building mapped. The actual percentage may be higher, as I can imagine scenarios where the mapper knew there was another POI in the building but it wasn't mapped.

Results were 80% for name=Walmart and 60% for name=Safeway.
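
For anyone wanting to reproduce this kind of figure, the classification can be sketched roughly as below. This is only an assumed reading of the categories described above, not the query actually used for these numbers; the dictionary keys (on_building_way, building_mapped, pois_in_building) are hypothetical and would have to be derived from the OSM data first.

```python
from collections import Counter

def classify(poi):
    """Sort one POI into a category, using hypothetical precomputed flags."""
    if poi["on_building_way"]:
        return "merged"                # tags sit on the building polygon
    if poi["building_mapped"] and poi["pois_in_building"] == 1:
        return "mergeable_not_merged"  # lone node inside a mapped building
    return "not_mergeable"             # no building, or several POIs in it

def merge_share(pois):
    """Fraction of mergeable POIs that were actually merged, or None if none qualify."""
    counts = Counter(classify(p) for p in pois)
    mergeable = counts["merged"] + counts["mergeable_not_merged"]
    return counts["merged"] / mergeable if mergeable else None
```

With the McDonald's counts above, the share is 3315 / 5523, which rounds to 60%.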

> Note right now we've imported data in both formats :p I'm not worried 
> about this and I'll commit to make sure in the end we're consistent. 

Well, as noted above, we won't be consistent, with manual mapping being 
done both ways.
