[Talk-us] Comparing Tiger 2017 dataset with OSM in an automated way.

Tod Fitch tod at fitchdesign.com
Thu Oct 26 17:50:46 UTC 2017

In the part of California where I now live, my first impression from looking at this is that the data is garbage. It looks to me as though blindly importing it would re-introduce TIGER errors that have already been successfully removed. Looking at a tiny area in Arizona where my family still has a house, it is not much better.

My opinion is that a direct import of this data should not be done at all.

That said, when helping clean up the chdr reversion mess in Arizona, I noticed a number of new subdivisions in the Phoenix metro area where this data set could be useful. But it would need to be used very selectively. For example, it shows new roads that are not evident in the aerial imagery available to us for OSM; I would not add those unless a ground survey indicated they actually exist. There are also many ways that I would characterize as tracks or service roads but that carry the traditional TIGER residential classification.

Is a zero-length value even valid? Looking at the raw OSM files, I see ‘k="name" v=""’ in a number of places.
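
Whether or not an empty value is technically valid, dropping those tags before anyone reviews the data looks straightforward. Here is a minimal sketch, assuming the translated files are ordinary OSM XML. The function name strip_empty_tags and the sample data are made up for illustration; they are not taken from the actual translation scripts.

```python
import xml.etree.ElementTree as ET


def strip_empty_tags(root):
    """Remove <tag> children whose v attribute is empty or only whitespace.

    Returns the number of tags removed.
    """
    removed = 0
    # Top-level children of <osm> are the node/way/relation elements.
    for elem in root:
        if elem.tag not in ("node", "way", "relation"):
            continue
        # Copy the list of tags so we can remove from elem while looping.
        for tag in list(elem.findall("tag")):
            if tag.get("v", "").strip() == "":
                elem.remove(tag)
                removed += 1
    return removed


# Hypothetical sample resembling the translated output, not real TIGER data.
SAMPLE = """<osm version="0.6">
  <way id="-1">
    <nd ref="-10"/>
    <nd ref="-11"/>
    <tag k="highway" v="residential"/>
    <tag k="name" v=""/>
  </way>
  <way id="-2">
    <nd ref="-12"/>
    <nd ref="-13"/>
    <tag k="highway" v="service"/>
    <tag k="name" v="Example Alley"/>
  </way>
</osm>"""

root = ET.fromstring(SAMPLE)
removed = strip_empty_tags(root)
remaining_names = [t.get("v") for t in root.iter("tag") if t.get("k") == "name"]
print(removed, remaining_names)
```

This only removes the empty tags. It does nothing about the bigger problem of deciding whether a given way belongs in OSM at all, which still needs imagery or a ground survey.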


> On Oct 26, 2017, at 10:00 AM, OSM Volunteer stevea <steveaOSM at softworkers.com> wrote:
> I don't know where all of this is going, and wanted to see for myself, so I downloaded the California file (the largest one of all) and zoomed in on the area where I live and with which I am most familiar, Santa Cruz County.  Thank you for providing the ten states' worth of translated data for us to take a look.
> What I found was, um, "interesting."  In urban areas, there were indeed a few highway=service, service=alley ways which Bing confirms are either there, mostly there, or "almost there," as in "slightly offset by a meter or three."  However, many of these were also clearly service=driveway instead of alley, a subtle distinction, but a crucial one, in my opinion (driveway implies access=private).  In more rural areas (and by no means is this a hard-and-fast delineation), there were many similar entries, but tree cover (2/3 of my county) made these impossible to distinguish via Bing.  Also, many had a name= tag with an empty value.  I'd rather that simply be no name tag at all, so that should be an easy improvement to make in any future/additional translations.
> There are literally thousands of these in my little county (2nd smallest geographically in the state) and it would take many hours (days) to go through them one by one and Bing compare, which certainly would improve OSM's data here.  (I've done similar tedious visual comparisons for thousands of polygons and TIGER review before, it is a labor of love!)  However, much or even most of these data would need an on-the-ground verification, simply because aerial/satellite data, whether fresh or not, have too much tree cover to allow such armchair mapping.  And, most of these additional data are very likely in highly rural areas which are not only difficult to get to, but are obviously on private property and (as is very typical around here on those) behind gates or "No Trespassing" signs (which I respect).
> So, while I find these a potentially rich source of new and/or better additional data, it is with great tedium and difficulty that they might be vetted/verified in a proper OSM way (cursory, via Bing, and/or fully and correctly, "on the ground").  I'm delighted the exercise to translate them into an easily-usable-by-OSM way has taken place, but it is with a great deal of caution and indeed trepidation that I approach and/or allow any new TIGER dataset "easy entry" into our map.
> In short:  eyes very wide open, slow going (if any going at all) ahead.  If your state is included in the list, and you can zoom into your county or city, I'd be curious to hear what others might say after they take the half-hour or so I did to look and offer similar impressions of these data.
> SteveA
> California
