[Imports] [Imports-us] New Orleans: importing buildings and addresses

Matt Toups mtoups at cs.uno.edu
Wed Oct 22 21:23:01 UTC 2014


Thanks Jason, these are useful points.

On 10/22/2014 06:59 AM, Jason Remillard wrote:
> Hi Matt,
>
> - It is not clear if including the ids is useful. You should consider
> dropping them.
> - Why only put the id on the buildings, but not the address nodes.

I realize these ID numbers are controversial, and I appreciate the
thoughts from you and the others who spoke up about this question. I am
quite aware of how imports can bloat the OSM database with useless tags,
so I agree with the general consensus that we should keep tagging
minimal in these imports.

I do want a way to keep the data up-to-date. The addresses shapefile is 
updated weekly (most recently on Monday of this week) on data.nola.gov. 
In fact, I only just now noticed that their latest update changes a 
filename which breaks my Makefile in the import scripts. Oops, I'll fix 
that now.

So the address data is definitely being updated often. The city is
constantly processing building permits, property tax assessments, and a
number of other routine things which then get pushed to the GIS system.
Unfortunately the building outlines are not updated quite so often,
although there have been updates: I see building outlines in there for
buildings I know were constructed within the past year.

You're right that only putting IDs on buildings and not on address nodes 
was an oversight. Unfortunately there does not seem to be any consistent 
mapping between the building IDs and the corresponding address points. 
So when I merge buildings and addresses, which do I use?

Address points have a "BUILDID" which does not correspond with the
"OBJECTID" on the building outline. The only thing they have in common
is a "GEOPIN" which apparently is derived from the coordinates. I'm not
familiar with this GEOPIN. Has anyone here seen it? It doesn't look
useful to me.

I agree it is possible the building IDs might not be useful in the
future, but I was hoping they might be. I will admit that part of the
reason I included the building ID tag is that I was using the
nycbuildings and dcbuildings source code as a starting point, and both
of those imports included the ID tags. (I also noticed that those IDs
are still in OSM today, and only on buildings, not address nodes.)

I am not wedded to this idea. I am willing to drop the building IDs from 
the import.

> - Consider figuring out how to capture multiple addresses at the same
> node location, rather than dropping them in favor of the primary
> address.

I'm not sure how to go about this. In the case where there is one
building and one address, I think the behavior is correct (add tags to
the building area). In the case where there are two (or more) addresses
within a building outline but in different locations, I think the
behavior is also correct (add separate nodes in distinct locations).
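
The two cases above can be sketched roughly as follows in Python. This
is only an illustration, not the actual import script: the function name
plan_merge and the dict layout (a "tags" key on each address) are
hypothetical, and it assumes the addresses passed in already sit at
distinct locations.

```python
def plan_merge(building_tags, addresses):
    """Return (tags for the building way, list of separate address nodes).

    Hypothetical helper: `addresses` is a list of dicts, each with a
    "tags" dict, all located inside one building outline.
    """
    if len(addresses) == 1:
        # One building, one address: copy the address tags onto the building.
        merged = dict(building_tags)
        merged.update(addresses[0]["tags"])
        return merged, []
    # Zero or several addresses: leave the building tags alone and
    # keep every address as its own node.
    return dict(building_tags), [dict(a) for a in addresses]


single = plan_merge(
    {"building": "yes"},
    [{"tags": {"addr:housenumber": "700", "addr:street": "Bourbon Street"}}],
)
several = plan_merge(
    {"building": "yes"},
    [
        {"tags": {"addr:housenumber": "700"}},
        {"tags": {"addr:housenumber": "702"}},
    ],
)
```

Returning the separate nodes as a list keeps the "tag the building" and
"emit nodes" outcomes visible in one place, which may make the behavior
easier to review.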

But the case with multiple addresses in the exact same location? What is 
my alternative? I think generating multiple nodes in the same location 
is a poor choice (and of course will not pass the JOSM validator). Then 
what? Add more tags to the same node? I've seen lots of name_1, name_2 
etc from the TIGER import and I'm not a big fan. Would I create 
"addr:housenumber_2" ?

Fortunately I never have to flip a coin: only one address is considered
"primary" in the upstream data. So in this case, I go with that one. I
think it's the best alternative given the situation.
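
For illustration, here is a minimal sketch of that rule, assuming each
address point has been read into a dict with "lat", "lon", and a boolean
"primary" key. Those key names, the rounding precision, and the fallback
to the first record when nothing is flagged are all assumptions made for
the example, not properties of the upstream data.

```python
from collections import defaultdict


def collapse_colocated(points, precision=7):
    """Keep one address per coordinate, preferring the one flagged primary."""
    groups = defaultdict(list)
    for p in points:
        # Points count as co-located when their rounded coordinates match.
        key = (round(p["lat"], precision), round(p["lon"], precision))
        groups[key].append(p)

    kept = []
    for same_spot in groups.values():
        # Prefer the primary record; fall back to the first one seen
        # (the fallback is an assumption for the sketch).
        primary = next((p for p in same_spot if p.get("primary")), same_spot[0])
        kept.append(primary)
    return kept


points = [
    {"lat": 29.9584, "lon": -90.0644, "housenumber": "700", "primary": True},
    {"lat": 29.9584, "lon": -90.0644, "housenumber": "702", "primary": False},
    {"lat": 29.9590, "lon": -90.0650, "housenumber": "710", "primary": True},
]
result = collapse_colocated(points)
```

In this example the two records sharing a coordinate collapse to the
primary one, so two nodes remain.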

Another good thing is that the uploading is being done by locals who can 
correct errors. I already plan to do this in my own neighborhood, and 
other parts of the city I know well.

> - You might want to double check that you have captured all of the
> abbreviations. It seems like this is always a problem.

Eric Ladner brought this up in more detail, so I'll address it in a
response to his message.

> - Have you done a JOSM verify in places where there are many touching
> buildings? Again, this seems like it is always a problem/tricky issue.

I have done a JOSM verify on several different sections of the city. To
my surprise, not many problems were caught. In the oldest, densest part
of our city (the French Quarter) I found a few places where overlapping
buildings were caught by the validator. But in most parts of the city,
buildings are detached and don't come close to intersecting.

There are a small number of self-intersecting ways which are easily 
corrected.

> - Do you have any addresses data to conflate with?
>

Using an XAPI query I have extracted all existing ways and nodes in the 
area with addr:housenumber tags. The situation is similar to what I 
outlined with existing building=yes tags.

There are 119 ways and 990 nodes with addresses today. The ways are all
buildings which I already analyzed and put on the wiki (clustered in a
few small areas and easily merged). The nodes are mostly POIs that will
be kept. I'll make sure the workflow docs indicate that uploaders need
to check for duplicate addresses and remove the imported ones when an
existing one is already there. (Does the JOSM validator check for two
nodes with the same addr:housenumber value? It isn't necessarily wrong
for two different nodes to share an address, but I think it would be
nice to get a warning.)
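
Independent of what the validator does, a check like this could be run
before upload. A rough sketch, assuming the existing and imported
elements have been parsed into dicts with "id" and "tags" keys (a
hypothetical layout, not the format the import scripts use). It matches
on addr:street as well as addr:housenumber, which is my assumption;
house numbers alone repeat on every street.

```python
from collections import defaultdict


def find_duplicate_addresses(elements):
    """Map (housenumber, street) to element ids when more than one shares it."""
    seen = defaultdict(list)
    for el in elements:
        tags = el["tags"]
        number = tags.get("addr:housenumber")
        if number is None:
            continue  # not an addressed element
        street = tags.get("addr:street", "")
        # Normalize case and whitespace so trivial differences still match.
        key = (number.strip().lower(), street.strip().lower())
        seen[key].append(el["id"])
    return {key: ids for key, ids in seen.items() if len(ids) > 1}


elements = [
    # Existing POI already in OSM (positive id).
    {"id": 101, "tags": {"amenity": "bar", "addr:housenumber": "700",
                         "addr:street": "Bourbon Street"}},
    # Imported nodes; negative ids follow the JOSM convention for new objects.
    {"id": -1, "tags": {"addr:housenumber": "700",
                        "addr:street": "Bourbon Street"}},
    {"id": -2, "tags": {"addr:housenumber": "702",
                        "addr:street": "Bourbon Street"}},
]
dupes = find_duplicate_addresses(elements)
```

The result only flags candidates. Since two nodes sharing an address is
not necessarily wrong, the uploader would still decide which one to
keep.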

I'll also note that of those 119 ways and 990 nodes, 45% were created by 
wegavision (confined to a small part of the French Quarter, which we 
already plan to keep). 21% were created by me. 5% were created by Eric 
Ladner. All other users are under 5%. I will continue to reach out to 
users who have mapped here previously to see if they're willing to help 
with uploading and merging their own data with the import. This method
has already yielded several volunteers. More eyeballs on the data to be
imported make this conflation trivial.

Matt


