[Talk-us] Address Node Import for San Francisco

Gregory Arenius gregory at arenius.com
Mon Dec 13 09:44:40 GMT 2010


As I've been out of touch, here is a sort of omnibus reply to the last couple
of days' worth of discussion on SF Addresses.  Thanks for all the ideas and
help.


> I would say this is one of the easy imports, there is not too much harm it
> can create. only problem is to merge it with existing data and make a
> decision which one is better. Since this data is probably authoritative it
> might be ok to replace most of the less accurate data already in OSM.
> For this reason I would drop any of the nodes in case of a conflict but
> rename the tags to something else like sf_addrimport_addr:*
> a survey on the road can check them later and compare with the existing
> addr nodes and decide which one to keep and rename the import tags to the
> real tags
>

I don't think that the data currently in OSM is less accurate than that of
the import.  The address data currently in OSM in SF is either on a node for
something else, like a restaurant or shop, or it's one of the very few
standalone address nodes that have been entered.

I think that having the data attached to a business is much more valuable
than having it stand alone, and I don't intend to overwrite any of it.

There are only a couple of little areas, comprising maybe half a dozen
blocks, that have standalone address nodes.  The data that's in there looks
like it has been carefully entered, and I don't doubt its accuracy,
especially because I've met one of the mappers who did some of it and she
knows what she's doing.

As such, I don't really think it's worth having a fallback
sf_addrimport_addr: tag for conflicting nodes and would rather just drop the
conflicting ones from the dataset I'm importing.  I'll definitely add the
fallback, though, if anything I come across in the data makes me think it
would be worthwhile.

> Do spot-check different neighborhoods. In reviewing the San Bernardino
> County assessor's shapefile, I found that housenumbers, ZIP codes, and even
> street names were missing/wrong in some areas I spot-checked. The county's
> response was that this data was of secondary importance to the assessor,
> understandably - as long as they have all the parcels, and the billing
> address for them, the actual postal address of the parcel is not critical
> info.
>

I will spot-check different neighborhoods to make sure that they're of equal
quality to the blocks I've already checked, which have mostly been ones local
to where I live and work.

I've found no reason to think that any of the data gives a parcel's billing
address instead of its actual mailing address.  I'll keep an eye out for it,
though.
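
For the spot checks themselves I'll probably just pull a random handful of
records per area and compare them against what's on the ground, something
along these lines (the neighborhood field and the sample rows are made up;
the real shapefile attributes will differ):

# Sketch: sample a few records from each area for manual review.
import random
from collections import defaultdict

import_records = [
    {"neighborhood": "Richmond", "housenumber": "601", "street": "Clement Street"},
    {"neighborhood": "Mission", "housenumber": "3200", "street": "24th Street"},
    # ... the full import, read from the shapefile, goes here
]

by_area = defaultdict(list)
for rec in import_records:
    by_area[rec["neighborhood"]].append(rec)

for area, recs in sorted(by_area.items()):
    print(area)
    for r in random.sample(recs, min(5, len(recs))):
        print("  %s %s" % (r["housenumber"], r["street"]))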

As to ZIP codes, I don't plan on putting any in because I haven't found a
source for them that I feel would work.

> As for a demo of the data, yeah, an OSM file would be perfect. Also,
> though, I'd keep the previous dataset ID, in case you need to do a
> comparison later.
>

I will definitely post an OSM file once I have something a bit closer to
being import ready.

As to the previous dataset ID, in this case the ObjectID, I'm not
particularly opposed to keeping it; I'm just not sure what we'd gain in this
instance, and I know there are people who object to having lots of
third-party IDs in our database.

In this particular instance I think comparisons between OSM and any future
SFAddress files can be done just as well using the
addr:housenumber/addr:street combo, which should, ideally, be unique.  As
we'll have a fair number of nodes that aren't imported because the address is
already "taken" by a business or is otherwise already in OSM, we'd have to
resort to that sort of matching anyhow.
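
Concretely, the comparison I'm picturing against a future SFAddress release
is little more than a set difference on that housenumber/street key, roughly
like this (the two inputs are stand-ins for an OSM extract and a newer city
file):

# Sketch: diff OSM's addresses against a future SFAddress release using
# the housenumber/street combination as the key instead of ObjectID.
def address_key(housenumber, street):
    return (housenumber.strip().upper(), street.strip().upper())

osm_address_tags = [
    {"addr:housenumber": "555", "addr:street": "Market Street"},
]
new_sfaddress_records = [
    {"housenumber": "555", "street": "Market Street"},
    {"housenumber": "559", "street": "Market Street"},
]

osm_keys = set(address_key(t["addr:housenumber"], t["addr:street"])
               for t in osm_address_tags)
new_keys = set(address_key(r["housenumber"], r["street"])
               for r in new_sfaddress_records)

print(new_keys - osm_keys)   # addresses the city added since the import
print(osm_keys - new_keys)   # addresses that disappeared from the city data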

> I don't agree that the other info can be easily, or accurately, derived.
> Addresses near the borders of those polygons are often subject to
> seemingly-arbitrary decisions. The physical location of the centroid of a
> parcel may not be within the same ZIP, city, and/or county polygons as their
> address. I would include the city and ZIP code.
>

Makes sense.  I will include the city, but as stated above I don't have the
ZIPs.

> Just wanna say that addressing in SF would be awesome :-)


The goal is to make SF fully routable.  As we have good (not perfect, but
really good) street geometry, junctions, classifications (e.g., primary,
residential, etc.), oneways, and names, the main things we're lacking are
addresses and turn restrictions.  Hopefully we'll get addresses from this
import.  As there doesn't seem to be any source for the turn restriction
data, I've put up a page on the wiki
<http://wiki.openstreetmap.org/wiki/San_Francisco_Turn_Restrictions> to help
coordinate efforts to map them, and I've put a little dent in it myself to
start.  Hopefully some more people join in.  I think getting these two things
done will put OSM at a pretty competitive level with any of the commercial
data providers with respect to SF.


> Hopefully this is helpful, as you'll want to import street names that
> actually match those in OSM's view of San Francisco.
>

It is helpful, thank you, especially being able to see where many of the
differences are.


>
> I found some other weird burrs in the data as well, in terms of how it
> arranges addresses stacked on top of one another in tall buildings. Nothing
> that can't be dealt with in an import.
>
>
If those stacks are actual addresses in use at that location, I was planning
on leaving them that way.  Do you have other thoughts on how to handle them?
I know that in some instances they're probably not actually stacked, e.g.,
they're for different businesses with locations along the front of the
building, but I'm not sure how to deal with that.  In some cases, especially
multi-family residences, the "stack" is correct in that it's at the entrance
for all of the addresses in the stack.
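
For what it's worth, finding the stacks in the first place should be easy
enough; grouping the records by rounded coordinates would do it (field names
and the sample rows below are placeholders):

# Sketch: group records by rounded location and flag any spot carrying
# more than one address (a "stack").
from collections import defaultdict

import_records = [
    {"lat": 37.79241, "lon": -122.40533, "housenumber": "101", "street": "Example Street"},
    {"lat": 37.79241, "lon": -122.40533, "housenumber": "103", "street": "Example Street"},
    # ... the rest of the import goes here
]

stacks = defaultdict(list)
for rec in import_records:
    # five decimal places is roughly a metre, enough to keep a stack together
    key = (round(rec["lat"], 5), round(rec["lon"], 5))
    stacks[key].append("%s %s" % (rec["housenumber"], rec["street"]))

for loc, addresses in stacks.items():
    if len(addresses) > 1:
        print("%s, %s: %s" % (loc[0], loc[1], ", ".join(addresses)))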

Thanks,
Gregory Arenius