[OSM-talk] TIGER 101

Robert (Jamie) Munro rjmunro at arjam.net
Fri Dec 1 11:31:00 GMT 2006


Ben Gimpert wrote:
> On Thu, 30 Nov 06 @02:49pm, Schuyler Erle wrote:
>> * On 30-Nov-2006 at  2:18PM PST, Ben Gimpert said:
>>> Every time you encounter a TIGER lat/long point, you'll need to do a
>>> search across the lat/long of already-imported nodes within, say, 1000
>>> miles.  (Beware of very long, straight rural roads.)  If you find a node
>>> with the same lat/long -- where "same" is some function incorporating the
>>> rural-ness of the area -- then you can reuse that node.
>> That was my point. Since TIGER/Line *is* topological by design, just
>> like OSM, it is definitely possible to reuse nodes out of the box.
>> More to the point, you don't need to keep track of every node in the
>> universe, just every node in a simple TIGER/Line file. Not impossible
>> at all.
> 
> You're assuming that -- for example -- there are no roads that cross
> county borders, no roads that might span more than one FIPS .RT1/2 file.
> I doubt this is the case.

Why don't we just write something that, once the TIGER import has
finished, downloads chunks of the USA through the API, looks for nodes
that are closer than, say, 10m to each other, and merges them, replacing
them with a node at their average point? If 10m isn't wide enough for
some areas of the country, fine, it doesn't matter; we're still better
off than if we hadn't run the process.
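
A rough sketch of what such a merge pass might look like, in Python. It
assumes the nodes for one chunk have already been fetched into a dict of
id -> (lat, lon); the function names and the 10m default are made up for
illustration and are not part of any existing OSM tool or API. Merging
is transitive here, so a chain of nodes each under 10m apart collapses
into one node, which may or may not be what we want.

```python
import math
from collections import defaultdict

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, good enough for a sketch


def distance_m(a, b):
    """Haversine distance in metres between two (lat, lon) pairs in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def merge_close_nodes(nodes, threshold_m=10.0):
    """Merge nodes closer than threshold_m (hypothetical helper).

    nodes: dict of node id -> (lat, lon).
    Returns (merged, mapping): merged maps each surviving id to the
    average position of its group; mapping maps every old id to the
    surviving id, so ways can be rewritten afterwards.
    """
    parent = {nid: nid for nid in nodes}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Two nodes within threshold_m cannot differ in latitude by more
    # than this many degrees, so a sweep over latitude-sorted ids only
    # has to look a short way ahead.
    max_dlat = math.degrees(threshold_m / EARTH_RADIUS_M)
    ordered = sorted(nodes, key=lambda nid: nodes[nid][0])
    for i in range(len(ordered)):
        a = ordered[i]
        for j in range(i + 1, len(ordered)):
            b = ordered[j]
            if nodes[b][0] - nodes[a][0] > max_dlat:
                break
            if distance_m(nodes[a], nodes[b]) <= threshold_m:
                parent[find(a)] = find(b)

    groups = defaultdict(list)
    for nid in nodes:
        groups[find(nid)].append(nid)

    merged, mapping = {}, {}
    for members in groups.values():
        keep = min(members)  # arbitrary choice: keep the lowest id
        # A plain arithmetic mean is fine for points a few metres apart
        # (it would misbehave across the 180 degree meridian).
        lat = sum(nodes[m][0] for m in members) / len(members)
        lon = sum(nodes[m][1] for m in members) / len(members)
        merged[keep] = (lat, lon)
        for m in members:
            mapping[m] = keep
    return merged, mapping
```

The mapping is the part that matters for uploading: every segment that
referenced a dropped node id would need to be pointed at the surviving
one before the duplicates are deleted.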

Later we could refine the script to use some sort of heuristic to
determine the best threshold for an area, based on the amount of data
in it, for example. So if there are two nodes within half a mile of each
other and no other nodes for 100 miles around, merge the two nodes. If we
run this new script, as long as it doesn't produce false positives, it
doesn't matter if the end result isn't perfect. People can fix it later.
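
To make the heuristic idea concrete, here is one possible shape for it,
again only a sketch. The formula (start at about half a mile and shrink
the threshold as more nodes turn up within 100 miles) and all the names
and constants are my own guesses, and nothing here guarantees there will
be no false positives; it would need tuning against real data.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, good enough for a sketch
HALF_MILE_M = 805.0         # about half a mile
HUNDRED_MILES_M = 160934.0  # about 100 miles


def distance_m(a, b):
    """Haversine distance in metres between two (lat, lon) pairs in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def pair_threshold_m(a, b, other_points, survey_radius_m=HUNDRED_MILES_M,
                     min_m=10.0, max_m=HALF_MILE_M):
    """Guess a merge threshold for the pair (a, b) from local node density.

    other_points: (lat, lon) pairs for every node except a and b.
    With no other nodes within survey_radius_m of the pair's midpoint
    the threshold is max_m; each extra nearby node shrinks it, down to
    a floor of min_m. The 1 / (1 + count) falloff is an assumption.
    """
    midpoint = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
    nearby = sum(1 for p in other_points
                 if distance_m(midpoint, p) <= survey_radius_m)
    return max(min_m, max_m / (1 + nearby))


def should_merge(a, b, other_points):
    """True if a and b are closer than the density-based threshold."""
    return distance_m(a, b) <= pair_threshold_m(a, b, other_points)
```

With two nodes about 500m apart and nothing else around, should_merge
returns True; put a hundred other nodes in the same area and the
threshold falls back to 10m, so the same pair is left alone.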

I think what is important is that we get all of TIGER into the system in
rough form, so that people can then work on improving it. Otherwise we
are going to run into problems where someone has manually added a bunch
of roads, and TIGER wants to add the same roads. Unless we write
something very clever, we are going to get all the roads twice, and we
won't even be sure which copy is the more accurate.

Robert (Jamie) Munro
