[Talk-us] Using TIGER to find missing road segments in OSM after license change
Martijn van Exel
m at rtijn.org
Wed Mar 28 16:00:15 BST 2012
On Wed, Mar 28, 2012 at 6:48 AM, Ian Dees <ian.dees at gmail.com> wrote:
> On Wed, Mar 28, 2012 at 5:47 AM, Josh Doe <josh at joshdoe.com> wrote:
>> I initially just sent this to Ian Dees, but maybe there are others on
>> this list
>> that are thinking of doing this or could help.
>> Considering the upcoming license change and its impact (many roads
>> that may become missing), I was thinking more about using TIGER 2011
>> to find roads that seem to be missing in the OSM database. My PostGIS
>> skills are nil, but it seems like it should be a fairly trivial query
>> to buffer the
>> OSM ways and find TIGER segments which don't intersect the buffered ways.
>> Hardest part will then be scaling this up to all 3140 counties. Later we
>> could keep using the resource by making the analysis progressively more
>> intelligent: splitting ways into two-node segments to get more
>> results, and maybe doing string matching to highlight name problems. Oh and
>> flagging of erroneous data in TIGER. And maybe stats per county, and ...
>> This requires hardware resources and people with skills to manage the
>> database and ideally move towards weekly/daily/minutely updates, and
>> to generate tiles showing the missing segments. Anyone interested in
>> helping with this?
> I started playing with this last night and ended up with the Chicago area
> metro extract from Mike and the Cook County TIGER roads data as layers in
> QGIS. Next up is to play with various queries to find missing roads in OSM.
> I like the idea of buffer and joining as a start and will probably move
> over to PostGIS to do that.
> I'd love to hear from anyone else that has ideas.
As for scaling, it may be preferable to process counties on request. It is
a pretty expensive operation, especially when you get into the details and
start realizing all the subqueries you'll need to get it right. The added
advantage is that it is easier to keep track of the counties that have
already been looked at, at the expense of some overhead coding.
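To make the on-request idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the class name CountyTracker, the process_county callback, and the use of county FIPS codes as keys are hypothetical and not part of any existing tool.

```python
class CountyTracker:
    """Run the expensive per-county comparison only when a county is requested.

    Hypothetical sketch: `process_county` stands in for whatever query
    compares TIGER and OSM roads for a single county.
    """

    def __init__(self, process_county):
        self._process_county = process_county
        self._results = {}  # county FIPS code -> result of the comparison

    def request(self, fips):
        """Return the result for a county, computing it on first request only."""
        if fips not in self._results:
            self._results[fips] = self._process_county(fips)
        return self._results[fips]

    def processed(self):
        """List the counties that have already been looked at."""
        return sorted(self._results)


if __name__ == "__main__":
    # Placeholder callback; a real one would run the spatial queries.
    tracker = CountyTracker(lambda fips: "candidates for " + fips)
    tracker.request("17031")  # Cook County, IL
    tracker.request("17031")  # second request reuses the stored result
    print(tracker.processed())
```

The bookkeeping is the "overhead coding" mentioned above: a lookup of finished counties in exchange for never processing a county nobody asked for.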
Queries that find missing roads based entirely on intersection are not likely
to be very successful for two reasons: 1) TIGER spatial accuracy is bad
enough to generate a lot of false positives and 2) a buffered OSM road will
likely intersect more than one TIGER road, even if the actual road does not
exist in OSM.
What you could do is buffer all OSM roads and filter those TIGER roads that
are more than x % outside of the resulting polygon. Those may be candidates
for missing roads. Another interesting case for a microtasking platform by
the way, to have people who are not necessarily experienced OSM editors
identify the valid missing roads from the resulting dataset.
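The buffer-and-filter idea above can be sketched in plain Python. This is only an approximation under stated assumptions, not a PostGIS implementation: coordinates are assumed to be projected (planar, in meters), a TIGER road is sampled at regular intervals, and a sample counts as "inside the buffer" when it lies within buffer_dist of any OSM segment. The function names, the 20 m buffer and the 50 % threshold are all hypothetical choices for illustration.

```python
import math


def point_segment_distance(p, a, b):
    """Planar distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Project p onto the segment and clamp to its endpoints.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def sample_line(line, step):
    """Return points spaced at most `step` apart along a polyline."""
    points = [line[0]]
    for a, b in zip(line, line[1:]):
        length = math.hypot(b[0] - a[0], b[1] - a[1])
        n = max(1, math.ceil(length / step))
        for i in range(1, n + 1):
            f = i / n
            points.append((a[0] + f * (b[0] - a[0]), a[1] + f * (b[1] - a[1])))
    return points


def fraction_outside(tiger_line, osm_lines, buffer_dist, step=5.0):
    """Share of sampled TIGER points farther than buffer_dist from every OSM way."""
    samples = sample_line(tiger_line, step)
    outside = 0
    for p in samples:
        inside = any(
            point_segment_distance(p, a, b) <= buffer_dist
            for osm in osm_lines
            for a, b in zip(osm, osm[1:])
        )
        if not inside:
            outside += 1
    return outside / len(samples)


def missing_candidates(tiger_roads, osm_lines, buffer_dist=20.0, threshold=0.5):
    """Names of TIGER roads lying more than `threshold` outside the OSM buffer."""
    return [
        name
        for name, line in tiger_roads.items()
        if fraction_outside(line, osm_lines, buffer_dist) > threshold
    ]


if __name__ == "__main__":
    # Made-up test geometry: one OSM road, two TIGER roads.
    osm = [[(0, 0), (100, 0)]]
    tiger = {
        "Main St": [(0, 8), (100, 8)],    # same road, offset by 8 m
        "Elm St": [(50, 0), (50, 200)],   # crosses the OSM road, otherwise unmapped
    }
    print(missing_candidates(tiger, osm))
```

In this made-up example Elm St touches the buffered OSM road, so a pure intersection test would treat it as present, while the fraction-outside test flags it as a candidate. Main St, offset by a few meters, is not flagged.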
martijn van exel
1109 1st ave #2
salt lake city, ut 84103