[Imports] How good can an import be?
Tyler Ritchie
tyler.ritchie at gmail.com
Tue Apr 5 19:44:15 BST 2011
On Tue, Apr 5, 2011 at 1:58 AM, Andy Allan <gravitystorm at gmail.com> wrote:
> I nearly fell off my chair laughing when you implied that TIGER and
> NHD were good datasets.
>
I think Mike already addressed this reasonably, but I said "good" not
fantastic. Alignment, accuracy and coverage in the areas I watch of TIGER
and NHD are very good. Elsewhere they are not fantastic.
> > It would take decades to get any sort of a meaningful map in the US
> > without the TIGER import.
>
> That's a defeatist attitude and not borne out by experiences elsewhere
> in the world.
It _would_ take decades where I am--even if I were actively recruiting more
mappers--were we to rely on on-the-ground mapping. Using high-quality
aerials and other datasets, it wouldn't take that long. Using only data
collected by people "on the ground holding GPS units", the map would be
very slow to complete and of low accuracy. I should have made that clearer in the
original message.
> > Sure, the high population density areas will get mapped
> > well and quickly, but not the low density areas.
>
> As opposed to what happened after the TIGER import, where high
> population density areas haven't seen the same level of mapping
> activity/quality that we'd hoped for?
>
I don't think you have the data to say there is any causation there.
> I'm not arguing against using external datasets to help make the map,
> but the imports that are a direct substitute of things that are
> otherwise easily mappable seem to cause the most problems. Lots of
> people expect that they will somehow "kickstart" the next stages, but
> they seem so far to be a displacement activity more than any genuine
> help.
I think there is more going on here than imports displacing editors. You've
got Americans with sedentary lifestyles, a community with an increasing
barrier to entry, maps that are "good enough" elsewhere (and often even in
OSM), low population density over much of the US and Canada, etc.
On Tue, Apr 5, 2011 at 7:04 AM, Richard Weait <richard at weait.com> wrote:
> As another, single point of comparison of communities, consider local
> OSM groups.
> http://usergroups.openstreetmap.de/
>
> There appears to be a local OSM user group for each
> 100 Million USA-ians.
> And one for each 10 Million Canadians.
> And one for each 2 Million Germans.
>
> The USA got imports early.
> Canada got imports a bit later.
> Imports seem much less popular in Germany.
>
> Correlation? Causation? I don't know.
>
So the US and Canada have the same number of groups. This is interesting; I
would be curious whether it has more to do with population density. Much of
Canada's population is in a band in the lower latitudes and on both coasts,
whereas the US has its higher densities along each border (so it's more like a
box) with big breaks, so it might be less likely that the flow of people
spreads the idea of "groups" around. Germany is just a big ol' population
ball. For instance, Canada is ~28x larger than Germany in area while having
about 2/5 of the population. So just doing groups/population isn't enough,
because you actually need to look at the rate of data exchange between
people. Higher population density areas just transmit ideas better.
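
A rough check of the area and population comparison, as a minimal sketch. The figures below are approximate round values assumed for illustration (circa 2011); they do not come from this thread or from the user group list.

```python
# Approximate, rounded figures (assumed for illustration, circa 2011).
COUNTRIES = {
    "Canada": {"area_km2": 9_985_000, "population": 34_000_000},
    "Germany": {"area_km2": 357_000, "population": 82_000_000},
}


def density(name):
    """Return people per square kilometre for one country."""
    country = COUNTRIES[name]
    return country["population"] / country["area_km2"]


# How many times larger Canada is than Germany by area.
area_ratio = COUNTRIES["Canada"]["area_km2"] / COUNTRIES["Germany"]["area_km2"]

# Canada's population as a fraction of Germany's.
pop_ratio = COUNTRIES["Canada"]["population"] / COUNTRIES["Germany"]["population"]

# How many times denser Germany is than Canada on average.
density_ratio = density("Germany") / density("Canada")

print(f"area ratio (Canada/Germany): {area_ratio:.0f}x")
print(f"population ratio (Canada/Germany): {pop_ratio:.2f}")
print(f"density ratio (Germany/Canada): {density_ratio:.0f}x")
```

Under these assumed figures the area ratio comes out near 28 and the population ratio near 0.4, in line with the estimate above. A national average still hides the banding and clustering described here, so it is only a first approximation of how close mappers are to one another.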
> There was no objection to TIGER import when Dave did it. As I recall,
> mappers asked him to import their counties even if it meant removing
> their work first. We did not suspect then, what we suspect now.
>
I'm assuming the suspicion is that imports kill user activity? And who is
the group that suspects that?
> Even if an import is perfectly executed, it is stale data. The import
> data was captured by the source yesterday. Or perhaps even longer ago
> than that. Import it today, and tomorrow it starts going stale.
> Updating it next week, or whenever the source renews the dataset is a
> challenge that we haven't necessarily mastered yet and in the interim
> the data gets staler and staler.
>
Hopefully there would come a time when you wouldn't want to update the data
when the source updated. Updating TIGER is being discussed because the new
TIGER data is better than much of the existing un-edited TIGER data. So that
makes sense to do. If we didn't have TIGER already imported, you would have a few
hypothetical options for what would exist in its place: 1) many more users
all taking personal ownership of their little territories with fully,
accurately mapped roads (that _may_ be more accurate than the existing TIGER
data); 2) no data at all in those areas where TIGER hasn't been modified; 3)
partially mapped, stale user data; or 4) some combination of the previous. I
think it's likely that you'd have stale user data and no data.
For the most part, roads (with the exclusion of driveways and subdivisions)
in ruralish areas (the vast majority of area in North America) don't change
that much; when they do change, a mapper with a range of 100 miles or so can
usually catch it.
> Now by comparison, imagine that your town has a mapper in every
> neighbourhood. How long would a new store on your main street go
> unmapped? A new subdivision in the next neighbourhood? I think that
> model updates faster and better than waiting for somebody else to
> collect and publish the data, then an import / merge / update to OSM.
>
But your entire premise is that imports exclude mappers. I think if I don't
have to map the hundreds of miles of side streets and alleys in my town I'm
more likely to catch the new 50 unit sub-division and add it to the map.
> I think that local mapper, keeping their area of interest up to date,
> is the great strength of OSM. We see it in Germany. It's a much more
> localized strength in North America.
>
I agree 100%
> And I think imports have limited that strength in North America.
I completely disagree.
But that's how this conversation always plays out. You have those who think
imports are useful, and those who don't. At some point there will be a
moratorium on imports, I'm sure. That point will probably be whenever the
existing data for any given area and type of data is deemed close enough
to reality that further importing would only harm it, at which point that
spot goes into maintenance mode.
But we could still benefit from national park, forest, sanctuary boundaries;
state park, public lands, various trails/access roads that are difficult to
efficiently survey (tree covered, in ravines); NHD or more accurate river
datasets; addresses; building polygons from LIDAR...
That's my wall-o-text for the day,
-Tyler