[OSM-talk] TIGER 101

Ben Gimpert ben at somethingmodern.com
Thu Nov 30 16:36:08 GMT 2006


Hi Andy & Steve,

Right, nodes are not being reused.  This is just because of how they are
represented in the TIGER data itself.  I've posted about this before --
how "5th Avenue" in Manhattan is actually N separate record chains in
TIGER, though obviously it's a physically contiguous street.  And this
also occurs at the node level, where there is no level of abstraction in
the TIGER dataset for shared nodes.

Take a look at:

	http://svn.openstreetmap.org/utils/tiger_import/tiger/tiger.rb

...and note how the TIGER files present (only) raw lat/long, at every
scale (node, street, point-of-interest).

We have to remember that the TIGER data is just a *very* rough first
step to a usable streetmap of America.  I'm sure companies like MapQuest
have had to spend enormous money and/or effort to clean things up for
their driving directions.

I wrote some code to intelligently try to merge roads with the same
names that share (roughly) an end node-or-two.  This code didn't scale
well outside of Manhattan, since FIPS counties can be very strangely
shaped and sized.  (See Steve's 'blog post on Gerrymandering for a
similar topic...  Heh.)
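
For anyone picking this up, the merging idea can be sketched roughly as
below.  This is not the code from tiger_import, just a minimal
illustration in Python.  The names merge_chains and close, the
(name, points) layout, and the 0.00005-degree tolerance are all
assumptions made for the example, and it only joins chains end-to-start.

```python
def close(p, q, tol=0.00005):
    """True if two (lat, lon) points agree within tol degrees on both axes."""
    return abs(p[0] - q[0]) <= tol and abs(p[1] - q[1]) <= tol


def merge_chains(chains, tol=0.00005):
    """Greedily join same-named chains where one ends (roughly) where another starts.

    chains is a list of (name, [(lat, lon), ...]) tuples (hypothetical layout).
    Only end-to-start joins are handled; reversed chains are left alone.
    """
    merged = [(name, list(points)) for name, points in chains]
    changed = True
    while changed:
        changed = False
        for i, (name_a, pts_a) in enumerate(merged):
            for j, (name_b, pts_b) in enumerate(merged):
                if i != j and name_a == name_b and close(pts_a[-1], pts_b[0], tol):
                    # Drop the duplicated join point from the second chain.
                    merged[i] = (name_a, pts_a + pts_b[1:])
                    del merged[j]
                    changed = True
                    break
            if changed:
                break
    return merged
```

The pairwise scan is quadratic in the number of chains, which is fine
for a toy input but would need an index keyed on street name and
endpoint for anything county-sized.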

As for re-using nodes, we face a similar problem of scaling:  At what
lat/long precision do we consider two points the "same"?  (Say 0.00005
of a degree, or what?)  Again, answering this question is hard across
the entire (HUGE) country.
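
To make the question concrete, here is one simple way a fixed tolerance
could be applied: snap every point to a grid of that cell size and give
points in the same cell the same node id.  This is only a sketch; the
function names are made up for the example and the 0.00005 figure is the
guess from above, not a tested value.

```python
TOLERANCE_DEG = 0.00005  # assumed tolerance in degrees, not a tuned value


def snap_key(lat, lon, tol=TOLERANCE_DEG):
    """Return the grid cell (as integer indices) that a point falls into."""
    return (round(lat / tol), round(lon / tol))


def merge_nodes(points, tol=TOLERANCE_DEG):
    """Assign a shared node id to all points that land in the same grid cell.

    points is a list of (lat, lon) tuples; the result is a parallel list
    of integer node ids.
    """
    node_ids = {}
    assignment = []
    for lat, lon in points:
        key = snap_key(lat, lon, tol)
        if key not in node_ids:
            node_ids[key] = len(node_ids)
        assignment.append(node_ids[key])
    return assignment
```

Grid snapping is cheap, but two points that are very close together can
still fall on opposite sides of a cell boundary and stay separate, so a
real merge would probably also need to check neighbouring cells.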

We might define a formula based upon the smallest rectangle that can
cover a county.  Say, (maxCountyLongitude - minCountyLongitude) / 10^5,
but this, umm, doesn't work.  (I tried already.)
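
For reference, the formula amounts to the following.  The function name
and the longitudes in the usage note are invented for illustration; they
are not real county extents.

```python
def county_tolerance(longitudes, divisor=10**5):
    """Tolerance in degrees: the county's longitude span divided by divisor."""
    return (max(longitudes) - min(longitudes)) / divisor
```

With made-up spans, a narrow county such as [-74.05, -73.90] gets a
tolerance near 0.0000015 of a degree, while a wide one such as
[-117.8, -114.1] gets roughly 0.000037.  So the matching distance scales
with the county's width rather than with anything about its streets,
which may be part of why this didn't work.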

Let me know via email if someone else wants to take a crack at writing
code to "merge nodes" (and streets) in the TIGER data.  I myself won't
be able to write any more code for OSM since I'm bogged down in other
responsibilities.  Especially given OSM's wiki nature, I feel like a
routing system will have to have some intelligence about assuming two
"nearby" nodes are really the same intersection / bend / whatever.

Hope this makes sense!

		Cheers,
		Ben


On Thu, 30 Nov 06 @03:20pm, Andy Robinson wrote:
> Ben,
> 
> Looking at the San Francisco data newly imported it still appears that ways
> are being added without connection to adjacent ways, ie there is no sharing
> of common nodes. See JOSM screen dump where I have selected and dragged a
> way. It should have tugged the adjacent ways too.
> 
> http://ajr.hopto.org/osm/tiger-nodes.png
> 
> Cheers
> 
> Andy
> 
> Andy Robinson
> Andy_J_Robinson at blueyonder.co.uk 
> 
> >-----Original Message-----
> >From: talk-bounces at openstreetmap.org [mailto:talk-
> >bounces at openstreetmap.org] On Behalf Of Ben Gimpert
> >Sent: 30 November 2006 2:53 PM
> >To: talk at openstreetmap.org
> >Subject: [OSM-talk] TIGER 101
> >
> >Hi OSM,
> >
> >The TIGER -> OSM import is (again) kicked off and going.
> >
> >Since there is now disaster recovery logic atop a MySQL tracking
> >database, a proper status report is possible.  Unless there are any
> >objections, I intend to commit the status report to the OSM SVN
> >repository every night at 3am:
> >
> >		http://svn.openstreetmap.org/utils/tiger_import/status
> >
> >Dig the tiny-but-increasing numbers for California, around our first
> >prioritized counties (for Mikel).
> >
> >		Cheers,
> >		Ben
> >
> >
> >_______________________________________________
> >talk mailing list
> >talk at openstreetmap.org
> >http://lists.openstreetmap.org/cgi-bin/mailman/listinfo/talk
> 



