[Talk-us] Address Node Import for San Francisco

SteveC steve at asklater.com
Sun Dec 12 13:15:01 GMT 2010


Just wanna say that addressing in SF would be awesome :-)

Steve

stevecoast.com

On Dec 10, 2010, at 1:29 AM, Katie Filbert <filbertk at gmail.com> wrote:

> On Thu, Dec 9, 2010 at 6:20 PM, Serge Wroclawski <emacsen at gmail.com> wrote:
> On Thu, Dec 9, 2010 at 6:00 PM, Gregory Arenius <gregory at arenius.com> wrote:
> > I've been working on an import of San Francisco address node data.  I have
> > several thoughts and questions and would appreciate any feedback.
> 
> The wiki page doesn't mention the original dataset URL. I have a few concerns:
> 
> 1) Without seeing the dataset URL, it's hard to know anything about
> the dataset (its age, accuracy, etc.)
> 
> This is a real problem with imports: knowing the original quality of
> the dataset before it's imported.
> 
> The project has had to remove or correct so many bad datasets that it's
> incredibly annoying.
> 
> > About the data.  It's in shapefile format, containing about 230,000
> > individual nodes.  The data is really high quality, and all of the addresses
> > I have checked are correct.  It has pretty complete coverage of the entire
> > city.
> 
> MHO is that individual node addresses are pretty awful. If you can
> import the building outlines, and then attach the addresses to them,
> great (and you'll need to consider what's to be done with any existing
> data), but otherwise, IMHO, this dataset just appears as noise.
> 
> > Also, there are a large number of places where there are multiple nodes in
> > one location if there is more than one address at that location.  One
> > example would be a house broken into five apartments.  Sometimes they keep
> > one address and use apartment numbers and sometimes each apartment gets its
> > own house number.  In the latter cases there will be five nodes with
> > different addr:housenumber fields but identical addr:street and lat/long
> > coordinates.
> 
> > Should I keep the individual nodes or should I combine them?
> 
> Honestly, I think this is very much cart-before-the-horse. Please consider
> making a test of your dataset available somewhere people can check it out,
> and then solicit feedback on the process.
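
The merge question above (keep five separate nodes, or combine them?) can be sketched in a few lines. This is only an illustration: the field names and the semicolon-joined `addr:housenumber` value (a common OSM convention for multiple values on one object) are assumptions, not taken from the actual SF shapefile.

```python
from collections import defaultdict

def merge_address_nodes(nodes):
    """Group nodes by (lat, lon, addr:street); if a group holds several
    house numbers, collapse it into one node whose addr:housenumber is a
    semicolon-separated list."""
    groups = defaultdict(list)
    for node in nodes:
        key = (node["lat"], node["lon"], node["addr:street"])
        groups[key].append(node["addr:housenumber"])
    merged = []
    for (lat, lon, street), numbers in groups.items():
        merged.append({
            "lat": lat,
            "lon": lon,
            "addr:street": street,
            "addr:housenumber": ";".join(sorted(set(numbers))),
        })
    return merged

# Illustrative sample: two apartments sharing one location, plus one
# stand-alone address.
nodes = [
    {"lat": 37.77, "lon": -122.41, "addr:street": "Oak St", "addr:housenumber": "12"},
    {"lat": 37.77, "lon": -122.41, "addr:street": "Oak St", "addr:housenumber": "12a"},
    {"lat": 37.78, "lon": -122.42, "addr:street": "Elm St", "addr:housenumber": "5"},
]
merged = merge_address_nodes(nodes)
```

Whether collapsing is actually desirable is exactly the open question in the thread; the sketch just shows that either choice is mechanically easy once the grouping key is settled.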
> 
> 
> > I haven't yet looked into how I plan to do the actual uploading, but I'll
> > take care to make sure it's easily reversible if anything goes wrong and
> > doesn't hammer any servers.
> 
> There are people who've spent years with the project and still not gotten
> imports right; I think this is a less trivial problem than you might
> expect.
> 
> 
> > I've also made a wiki page for the import.
> >
> > Feedback welcome here or on the wiki page.
> 
> This really belongs on the imports list as well, but my feedback would be:
> 
> 1) Where's the shapefile? (if for nothing else than the license, but
> also for feedback)
> 2) Can you attach the addresses to real objects (rather than standalone nodes)?
> 3) What metadata will you keep from the other dataset?
> 4) How will you handle internally conflicting data?
> 5) How will you handle conflicts with existing OSM data?
> 
> - Serge
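
Question 5 above (conflicts with existing OSM data) could be prototyped with a simple key match before any upload. A minimal sketch, with deliberately crude normalization and assumed field names; real matching would also need abbreviation expansion (St → Street) and geometry checks:

```python
def normalize(street, housenumber):
    # Crude normalization for comparison only; not a full address matcher.
    return (street.strip().lower(), housenumber.strip().lower())

def find_conflicts(import_nodes, existing_nodes):
    """Return imported addresses that already appear in OSM, so they can
    be skipped or flagged for review instead of uploaded as duplicates."""
    existing = {normalize(n["addr:street"], n["addr:housenumber"])
                for n in existing_nodes}
    return [n for n in import_nodes
            if normalize(n["addr:street"], n["addr:housenumber"]) in existing]

# Illustrative sample data.
import_nodes = [
    {"addr:street": "Oak St", "addr:housenumber": "12"},
    {"addr:street": "Elm St", "addr:housenumber": "5"},
]
existing_nodes = [
    {"addr:street": "oak st ", "addr:housenumber": "12"},
]
conflicts = find_conflicts(import_nodes, existing_nodes)
```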
> 
> 
> A few comments...
> 
> 1) San Francisco explicitly says they do not have building outline data. :(  So, I suppose we get to add buildings ourselves.  I do see that SF does have parcels.  
> 
> For DC, we are attaching addresses to buildings when there is a one-to-one relation between them.  When there are multiple address nodes for a single building, we keep them as nodes. In the vast majority of cases, we do not have apartment numbers, but in some cases we have things like 1120a, 1120b, 1120c that can be imported.  Obviously, without a buildings dataset, our approach won't quite apply to SF.
> 
> 2) I don't consider the addresses as noise.  The data is very helpful for geocoding.  If the renderer does a sloppy job making noise out of addresses, the renderings should be improved. 
> 
> 3) Having looked at the data catalogue page, I do have concerns about the terms of use and think it's best to get SF to explicitly agree to allow OSM to use the data.
> 
> http://gispub02.sfgov.org/website/sfshare/index2.asp
> 
> 4) If you can get explicit permission, then I suggest breaking the address nodes up into smaller chunks (e.g. by census block group), converting them to .osm format with Ian's shp-to-osm tool, and checking them for quality and against existing OSM data (e.g. existing POIs with addresses) in JOSM before importing.  QGIS and/or PostGIS can be useful for chopping the data up into geographic chunks.  This approach gives you the opportunity to apply due diligence, check things, and keep chunks small enough that it's reasonably possible to deal with any mistakes or glitches.
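
As a rough stand-in for the census-block-group chunking Katie describes, here is a minimal grid-binning sketch; the 0.01-degree cell size (roughly 1 km) and field names are assumptions, and real work would use QGIS/PostGIS as suggested:

```python
import math
from collections import defaultdict

def chunk_by_grid(nodes, cell_deg=0.01):
    """Bin nodes into fixed-size lat/lon grid cells so each chunk can be
    reviewed, uploaded, and if necessary reverted independently."""
    chunks = defaultdict(list)
    for node in nodes:
        key = (math.floor(node["lat"] / cell_deg),
               math.floor(node["lon"] / cell_deg))
        chunks[key].append(node)
    return chunks

# Illustrative sample: two nearby points and one a cell away.
nodes = [
    {"lat": 37.7702, "lon": -122.415},
    {"lat": 37.7708, "lon": -122.415},
    {"lat": 37.781,  "lon": -122.415},
]
chunks = chunk_by_grid(nodes)
```

Each resulting chunk would then be converted and inspected in JOSM on its own, keeping any revert small.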
> 
> -Katie
> 
>  
> _______________________________________________
> Talk-us mailing list
> Talk-us at openstreetmap.org
> http://lists.openstreetmap.org/listinfo/talk-us
> 
> 
> 
> -- 
> Katie Filbert
> filbertk at gmail.com
> @filbertkm