[Talk-us] Using TIGER to find missing road segments in OSM after license change

Ian Dees ian.dees at gmail.com
Thu Mar 29 14:45:55 BST 2012


On Thu, Mar 29, 2012 at 8:11 AM, Josh Doe <josh at joshdoe.com> wrote:

> On Thu, Mar 29, 2012 at 8:40 AM, Richard Weait <richard at weait.com> wrote:
> > On Wed, Mar 28, 2012 at 8:48 AM, Ian Dees <ian.dees at gmail.com> wrote:
> >> On Wed, Mar 28, 2012 at 5:47 AM, Josh Doe <josh at joshdoe.com> wrote:
> >
> > [ ... ]
> >>> I was thinking more about using TIGER 2011 to find roads that seem to
> >>> be missing in the OSM database. My PostGIS skills are nil, but it seems
> >>> like it should be a fairly trivial query to buffer the OSM ways and find
> >>> TIGER segments which don't intersect the buffered ways.
> > [ ... ]
> >> I started playing with this last night and ended up with the Chicago area
> >> metro extract from Mike and the Cook County TIGER roads data as layers in
> >> QGIS. Next up is to play with various queries to find missing roads in OSM.
> >> I like the idea of buffer and joining as a start and will probably move
> >> over to PostGIS to do that.
> >
> > We used OpenJUMP and the RoadMatcher plugin in the early days of
> > Canadian imports to generate lists of matching or missing roads.
>
> I've actually just converted the conflation JOSM-plugin to use the
> Java Conflation Suite (JCS), which RoadMatcher is based on. I don't
> expect to have RoadMatcher-like capabilities in there for quite a
> while, but I should soon at least be able to find which segments don't
> have a match in OSM based on string similarity (e.g. Levenshtein) and
> curve similarity (e.g. Hausdorff, Frechet).


After loading Cook County TIGER road features and OSM linear features into
PostGIS, I ran a simple query to find how well the roads matched:

SELECT a.name, b.fullname, ST_HausdorffDistance(a.geom, b.geom) AS dist
     FROM cook_tiger a, cook_osm b
     WHERE (a.geom && b.geom)
       AND ST_HausdorffDistance(a.geom, b.geom) < 0.0005
     LIMIT 50;

This returned results that made sense (the names matched in all 50 results).
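
For anyone who hasn't run into the Hausdorff distance before, here is a rough pure-Python sketch of the idea behind the comparison in the query. It is not the PostGIS implementation, only my understanding of the discrete version: take every vertex of one line, find the nearest point on the other line, keep the worst case, then repeat in the other direction. The function names and the sample coordinates are made up for illustration.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b; all are (x, y) tuples."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: a and b are the same point.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def directed_distance(line_a, line_b):
    """Largest distance from any vertex of line_a to the nearest point on line_b."""
    segments = list(zip(line_b, line_b[1:]))
    return max(
        min(point_segment_distance(p, s, e) for s, e in segments)
        for p in line_a
    )


def hausdorff(line_a, line_b):
    """Symmetric distance: the worse of the two directed distances."""
    return max(directed_distance(line_a, line_b),
               directed_distance(line_b, line_a))


# Two parallel lines one unit apart.
parallel = hausdorff([(0, 0), (1, 0)], [(0, 1), (1, 1)])

# A long line against a short piece of the same road: the far end of the
# long line is 9 units from the short one, so the pair scores badly even
# though the short piece lies exactly on top of the long line.
long_vs_short = hausdorff([(0, 0), (10, 0)], [(0, 0), (1, 0)])
```

The second case is worth keeping in mind: if TIGER and OSM split the same road at different places, the pieces can fail a tight threshold even when the geometry agrees.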

I removed the LIMIT clause and let it run before going to work to see how
many of the TIGER records match existing OSM features.

Next up is building a table of TIGER -> OSM matches and using that to find
TIGER rows that don't have a corresponding OSM feature.
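
The logic of that step is just an anti-join. A minimal sketch in Python, assuming the matches come back as (TIGER id, OSM id) pairs; the ids below are hypothetical and not taken from the real tables:

```python
def unmatched_tiger_ids(tiger_ids, matches):
    """Return the TIGER ids that never appear on the TIGER side of a match.

    tiger_ids: iterable of every TIGER row id that was considered.
    matches:   iterable of (tiger_id, osm_id) pairs that passed the threshold.
    """
    matched = {tiger_id for tiger_id, _osm_id in matches}
    return [tiger_id for tiger_id in tiger_ids if tiger_id not in matched]


# Hypothetical ids: TIGER row 1 matched two OSM ways, row 3 matched one,
# and row 2 matched nothing, so it is the candidate missing road.
missing = unmatched_tiger_ids([1, 2, 3], [(1, 100), (1, 101), (3, 102)])
```

In PostGIS the same thing would presumably be a LEFT JOIN from the TIGER table to the match table, keeping the rows where the match side is NULL.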

If anyone has ideas for speeding this up, I'd love to hear them. It took
well over a couple of hours to run one county, and there are more than
3,000 counties in the US.