[Imports] Proposed import of San Francisco addresses

Belov, Charles Charles.Belov at sfmta.com
Fri Feb 10 05:24:10 UTC 2023


Thank you to everyone for your input. Alas, based on your comments, it appears the database I seek to use may not be a suitable choice for the intended purpose of an import to OpenStreetMap.

Chief issues are inconsistency as to where the lat/longs point to; difficulty automatically detecting buildings with more than one address per building, particularly if two street names are involved; and at least one case of a wildly inaccurate lat/long out of six lat/longs I looked at.

Details follow, in response to people's various comments:

On Thu, Feb 9, 2023 at 6:48 PM Matthew Whilden <matthew.whilden at gmail.com> wrote:

> Are there any buildings that have more than one addr:street in addresses for that building? Or addr:unit 
> numbers for different addr:housenumbers? This happened for the Indianapolis data set. Delimiting works 
> great until you have to delimit more than one field. Maybe this doesn't happen in practice in your area. 

> However, it's also nice to have addresses on nodes for buildings with sizable footprints because you can then 
> get the node closer to where the address goes (ex: strip mall). I don't know how your dataset handles that. 
> The Indy dataset had addresses more or less where each tenant would be located. 

The building on the southeast corner of Market & Van Ness has three addresses: 1525 Market Street (the bank branch on the ground floor), 11 South Van Ness Avenue (the SFMTA Customer Service Center), and 1 South Van Ness Avenue (the office lobby on the first floor, as well as the entire basement and the 2nd and higher floors, which are also above or below the bank branch and the SFMTA Customer Service Center).

In the City data from https://data.sfgov.org/Geographic-Locations-and-Boundaries/Addresses-Enterprise-Addressing-System/3mea-di5p:

1525 Market Street has lat/long of 37.77514082, -122.4189106. This points to the entrance.
1 South Van Ness Avenue has lat/long of 37.75385644, -122.4161123. Bad data. See below.
11 South Van Ness Avenue has lat/long of 37.77450185, -122.4188237. This points to the entrance.

Since the building is in the same place for each of these, it's clear that the lat/long does not refer to the building, but to the respective entrances. 

It appears that the lat/long given for 1 South Van Ness Avenue is wrong; Google Maps is giving me a location at South Van Ness and 23rd Street, over a mile away from the actual South Van Ness and Market. If we can't trust the data, then bulk importing is not going to work. (I've reported the 1 South Van Ness issue to the DataSF dataset owner.)

The building on the northwest corner of Geary and Presidio has at least four addresses: 949 Presidio Avenue, and 2620, 2630 and 2640 Geary. 949 Presidio is both behind and above the other addresses.

949 Presidio: 37.78312314, -122.4460813 points to the entrance.
2620 Geary is not in the data. I'm not necessarily concerned with missing addresses as they will at least do no harm.
2630 Geary: 37.78285206, -122.4463834 points to within the building.
2640 Geary: 37.78283298, -122.4465738 points to within the building.

I'm thinking that there might be a way to run a sanity check on lat/longs within a particular hundred block, for example checking that 1 South Van Ness is reasonably close to 11 South Van Ness. (In this case the data puts them far apart, even though the addresses are close in real life.) San Francisco generally has 100 numbers per block, counting from the beginning of the street. We might need to separate odd and even, as on the 700 block of Duboce.
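
One way such a sanity check might be sketched in Python, assuming each record is a (house number, street, lat, long) tuple. The function names and the 300-meter threshold are illustrative assumptions and not part of the City data; the coordinates are the ones quoted above.

```python
import math
from collections import defaultdict
from itertools import combinations

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/long points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def suspicious_blocks(records, max_spread_m=300.0):
    """Return {block key: spread in meters} for blocks whose points are too far apart.

    A block key is (street, hundred block, "odd" or "even"). The threshold is a
    guess and would need tuning against real block lengths.
    """
    blocks = defaultdict(list)
    for number, street, lat, lon in records:
        parity = "odd" if number % 2 else "even"
        blocks[(street, number // 100, parity)].append((lat, lon))
    flagged = {}
    for key, points in blocks.items():
        # Largest distance between any two addresses on this side of the block.
        spread = max(
            (haversine_m(*a, *b) for a, b in combinations(points, 2)),
            default=0.0,
        )
        if spread > max_spread_m:
            flagged[key] = spread
    return flagged

records = [
    (1, "South Van Ness Avenue", 37.75385644, -122.4161123),
    (11, "South Van Ness Avenue", 37.77450185, -122.4188237),
    (2630, "Geary", 37.78285206, -122.4463834),
    (2640, "Geary", 37.78283298, -122.4465738),
]
flagged = suspicious_blocks(records)
```

With these four records, only the odd side of the 0 block of South Van Ness Avenue should be flagged. A flagged block still needs a human to decide which of its addresses is the wrong one.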

That said, the inconsistency as to whether an entrance or a building is used sounds like it would be problematic with regard to OpenStreetMap's practices.

On Thu, Feb 9, 2023 at 6:22 PM Tyler Brown Cifu Shuster <t at fust.us> wrote:

> Buildings with multiple addresses use semicolons to separate their addresses in the field, like any field with 
> multiple values. 

This would handle the simple case, if we have a way to detect it. Many buildings in San Francisco have more than one address. Two-unit houses will often have one address for each unit, for example 191 Example Street and 193 Example Street. 
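
A minimal Python sketch of how the simple case might be detected and tagged, assuming the addresses have already been matched to one building and are given as (house number, street) pairs. The function name building_tags is hypothetical; addr:housenumber and addr:street are the usual OSM keys. A building whose addresses span more than one street is returned as None, to be handled some other way.

```python
def building_tags(addresses):
    """Return semicolon-delimited address tags for a building, or None if not simple.

    `addresses` is a list of (house number, street) pairs for one building.
    """
    streets = {street for _, street in addresses}
    if len(streets) != 1:
        # No addresses, or more than one street: delimiting two fields at once
        # would lose the pairing between each number and its street.
        return None
    numbers = sorted({number for number, _ in addresses})
    return {
        "addr:housenumber": ";".join(str(n) for n in numbers),
        "addr:street": streets.pop(),
    }

two_unit_house = building_tags([(191, "Example Street"), (193, "Example Street")])
corner_building = building_tags([
    (1525, "Market Street"),
    (11, "South Van Ness Avenue"),
    (1, "South Van Ness Avenue"),
])
```

Here the two-unit house gets "191;193" on Example Street, while the Market & Van Ness building comes back as None because two street names are involved.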

Tyler Brown Cifu Shuster wrote:

> The most important thing to note is that addresses should be applied to buildings and not as points. 

> The "Materials (inputs, code, output)" link on the import page doesn’t work so I can’t see how they’re being applied.

I didn't post that file. It was apparently posted to that page three years ago. I don't know why it might be protected. I've requested access and will update the list if I gain it.

> Please let me know if you’d like any help with the application of house numbers to buildings in bulk as I’ve done 
> this elsewhere.

Thank you! I'd welcome further discussion on this.

On Thu, Feb 9, 2023 at 5:34 PM Jack Pearson <jack at pearson.onl> wrote:

> Be careful when you do this automatically. Address nodes are often beside the building they describe 
> rather than inside the building. But when you add any sort of tolerance for that error you can introduce a lot 
> of false positives. If there are 2 physical buildings A and B, and only B has an address, and only A has been 
> traced into OpenStreetMap, and the B address node has just a little error, then the address node will bind 
> to the wrong building.

> I've made this mistake before and affected many buildings, which is why I'm commenting.

Noted. This is beyond my knowledge to handle.

On Thu, Feb 9, 2023 at 5:27 PM Eliot <ewblen at e.email> wrote:

> An import that creates duplicate data must not happen.

Agreed, though I'm not sure how to accomplish this.

> The most obvious solution to this is to remove all addresses that are already in OSM from your import 
> dataset. I.e. import only the missing addresses.

Again agreed. Tedious and necessary.
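
The filtering step could look roughly like the following Python sketch. It assumes the existing OSM addresses and the import candidates are both available as (house number, street) pairs, and it uses a deliberately naive normalization (case and whitespace only); real data would also need abbreviations such as "St" versus "Street" reconciled. All names and sample values are illustrative.

```python
def normalize(number, street):
    """Build a comparison key that ignores case and extra whitespace."""
    return (str(number).strip().upper(), " ".join(street.upper().split()))

def missing_addresses(candidates, existing):
    """Return the candidates whose normalized key is not already in `existing`."""
    seen = {normalize(n, s) for n, s in existing}
    result = []
    for n, s in candidates:
        key = normalize(n, s)
        if key not in seen:
            seen.add(key)  # also drops duplicates within the import itself
            result.append((n, s))
    return result

# Made-up sample data: one address already mapped, three candidate rows.
existing = [(191, "Example Street")]
candidates = [
    (191, "example  street"),
    (193, "Example Street"),
    (193, "Example Street"),
]
to_import = missing_addresses(candidates, existing)
```

Only 193 Example Street would remain to import here, and only once.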

> The process of finding existing addresses that are wrong (location, name, number, etc.) and fixing them 
> could be considered as a separate project.

Again agreed.

> Does this imply that every building has at most one address? An OSM object can't have more than 
> one address.


> How are units/apartments within multi-unit buildings addressed?

Units that are distinguished only by a unit number, without a street address of their own, are outside the scope I have in mind.

Hope this helps,

Charles Belov
he/him
Acting Webmaster 
Communications & Marketing

San Francisco Municipal Transportation Agency
1 South Van Ness Avenue, 3rd floor
San Francisco, CA 94103




