[Talk-us] Tiger 2007 data

Dave Hansen dave at sr71.net
Mon Oct 27 17:02:38 GMT 2008


On Sat, 2008-10-25 at 10:37 +1100, Nick Hocking wrote:
> I'm firmly convinced that automatic uploads should only go into areas
> where there are NO user edited nodes or ways. Other updates need to be
> done manually to avoid data corruption.

You have absolutely shown a number of cases where there was no merging
and the TIGER data was simply splatted over existing data.  This is
certainly one of the downsides of the approach that was tried before.

I uploaded Oregon first because I had already talked to all the mappers
in my own state.  I then proceeded to upload all the states that were
completely empty.  After that, I used this map:

	http://ted.mielczarek.org/code/osm/counties/

and uploaded only counties that were virtually empty.  Interestingly,
people started contacting me pretty quickly saying "you missed my
county!"  That's because the areas with the most data were also the
places with the most active mappers!  Almost all the prolific mappers
knew that they could never compete with the sheer amount of TIGER data
and went to the heroic effort of merging their existing work with it.

In the end, I think there was only a single county (out of ~3000) in the
US that didn't get TIGER data in one way, shape, or form, and I gave
people plenty of time to decline.

So, I completely disagree that the merging can and should be done
manually.  There's simply too much data.  It's also not feasible to
blacklist every county that has *ever* had a single edit.

But, I don't want to "corrupt" anything.  Just as before, I'm going to
let all the decisions be made by the local mappers.  If you're
concerned, please stay on this list because I'll always announce things
here (at least).

-- Dave




