[Talk-us] Using TIGER to find missing road segments in OSM after license change

Josh Doe josh at joshdoe.com
Wed Mar 28 16:34:12 BST 2012


On Wed, Mar 28, 2012 at 11:00 AM, Martijn van Exel <m at rtijn.org> wrote:
> Hi all,
>
> On Wed, Mar 28, 2012 at 6:48 AM, Ian Dees <ian.dees at gmail.com> wrote:
>>
>> On Wed, Mar 28, 2012 at 5:47 AM, Josh Doe <josh at joshdoe.com> wrote:
>>>
>>> Hardest part will then be scaling this up to all 3140 counties.
>>
>> I'd love to hear from anyone else that has ideas.
>>
>
> As for scaling, it may be preferable to process counties on request, since
> it is a pretty expensive operation, especially once you get into the details
> and start realizing all the subqueries you'll need to get it right. The added
> advantage is that it is easier to keep track of the counties that have
> already been looked at, at the expense of some coding overhead.

We'd certainly start there, and see how it goes.
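
As a rough illustration of the on-request idea, here is a minimal Python
sketch. Everything in it is an assumption for illustration only: the
CountyProcessor class, the compare callback (standing in for whatever
expensive TIGER-vs-OSM query ends up being used), and the use of county
FIPS codes as keys. It only shows that the expensive step runs once per
requested county and that the processed counties stay easy to list.

```python
from typing import Callable, Dict, List


class CountyProcessor:
    """Run an expensive per-county comparison lazily and remember the results.

    `compare` is a hypothetical callback that takes a county identifier
    (a FIPS code string here) and returns candidate missing-road IDs.
    """

    def __init__(self, compare: Callable[[str], List[str]]) -> None:
        self._compare = compare
        self._done: Dict[str, List[str]] = {}

    def request(self, county_fips: str) -> List[str]:
        # Only run the expensive comparison the first time a county is asked for.
        if county_fips not in self._done:
            self._done[county_fips] = self._compare(county_fips)
        return self._done[county_fips]

    def processed(self) -> List[str]:
        # Counties that have already been looked at, in sorted order.
        return sorted(self._done)
```

A real version would persist the results (for example in the same
database as the road data) instead of keeping them in memory.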

> Queries to find missing roads entirely based on intersection are not likely
> to be very successful for two reasons: 1) TIGER spatial accuracy is bad
> enough to generate a lot of false positives and 2) a buffered OSM road will
> likely intersect more than one TIGER road, even if the actual road does not
> exist in OSM.

True, but this will vary by region. My area (Fairfax County, VA) has
very high spatial accuracy, since its TIGER data comes from the
high-quality local county database.

> What you could do is buffer all OSM roads and filter those TIGER roads that
> are more than x % outside of the resulting polygon. Those may be candidates
> for missing roads. Another interesting case for a microtasking platform by
> the way, to have people who are not necessarily experienced OSM editors
> identify the valid missing roads from the resulting dataset.

Yes, we'd definitely need to use a percentage threshold, or at least
shorten each way before buffering so that we don't pick up every road
that merely crosses at an intersection.
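
To make the percentage idea concrete, here is a small pure-Python sketch.
It is not the query we would actually run (that would presumably be done
in the database); it approximates "share of a TIGER road outside the
buffered OSM roads" by sampling points along the TIGER line and checking
whether each is within the buffer distance of any OSM segment. The
function names, the 15 m buffer, and the 0.8 threshold are all
placeholders, and coordinates are assumed to be in a projected CRS with
metre units.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]
Line = Sequence[Point]


def _dist_point_segment(p: Point, a: Point, b: Point) -> float:
    """Planar distance from point p to segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def _point_at(line: Line, dist: float) -> Point:
    """Point at distance `dist` along the polyline."""
    for a, b in zip(line, line[1:]):
        seg = math.hypot(b[0] - a[0], b[1] - a[1])
        if seg > 0.0 and dist <= seg:
            t = dist / seg
            return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        dist -= seg
    return line[-1]


def fraction_outside(tiger: Line, osm_roads: Sequence[Line],
                     buffer_m: float, samples: int = 50) -> float:
    """Approximate share of `tiger` lying farther than buffer_m from all OSM roads."""
    total = sum(math.hypot(b[0] - a[0], b[1] - a[1])
                for a, b in zip(tiger, tiger[1:]))
    if total == 0.0:
        return 0.0
    outside = 0
    for i in range(samples):
        p = _point_at(tiger, total * (i + 0.5) / samples)
        near = any(_dist_point_segment(p, a, b) <= buffer_m
                   for road in osm_roads for a, b in zip(road, road[1:]))
        if not near:
            outside += 1
    return outside / samples


def missing_candidates(tiger_roads: Dict[str, Line], osm_roads: Sequence[Line],
                       buffer_m: float = 15.0, min_outside: float = 0.8) -> List[str]:
    """IDs of TIGER roads mostly outside the buffered OSM network (placeholder defaults)."""
    return [rid for rid, geom in tiger_roads.items()
            if fraction_outside(geom, osm_roads, buffer_m) >= min_outside]
```

With a threshold like this, a TIGER road that only crosses an existing
OSM road is still reported as a candidate (most of it lies outside the
buffer), while one that runs alongside an OSM road within the buffer
distance is not.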

-Josh


