[OSM-talk] Scientific paper on "Information Seeding"

Mon Jul 9 00:21:46 UTC 2018

On Mon, Oct 9, 2017 at 2:10 PM, Frederik Ramm <frederik at remote.org> wrote:

> Hi,
> today I was pointed to a recent, open-access scientific paper called
> "Information Seeding and Knowledge Production in Online Communities:
> Evidence from OpenStreetMap". This open-access paper is available here
> https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3044581
> In the context of armchair mapping, but especially of data imports (and
> recently, machine-generated OSM data) there's always been the discussion
> between those who say "careful, too much importing will hurt the growth
> of a local community", and others who say "this import is going to
> kick-start a local community, let's do it!"

Honestly, Frederik, you point to a study and say that it is all scientific.
Furthermore, you act as if you just came across the study when in fact you
have already pushed it to the mailing lists on two other occasions.[1][2] This
also shows that you failed to properly check the research before
pushing the link once again.  The study sounds scientific, yet the
very heart of the study runs into the Modifiable Areal Unit Problem.[3]
Quoting the author: "Note that TIGER information was incorporated for 3,093
counties within the US; the state of Massachusetts was excluded because
better quality information was available from the state government. I will
restrict my analysis to these 3,093 counties."  The author took the county
data as is, ignoring the size of the population or other social and economic
factors, to make his point.
Thankfully, the US Census Bureau uses block groups: "Block Groups (BGs) are
statistical divisions of census tracts, are generally defined to contain
between 600 and 3,000 people, and are used to present data and control block
numbering."[4]   The whole foundation of the "Information Seeding"
study is a geographic division that the
_Census_department_would_not_use_.  The Census Bureau uses its block and
tract configuration for a number of federal programs.
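
To see why the choice of areal unit matters, here is a minimal sketch of the
zoning effect behind the Modifiable Areal Unit Problem.  The four cells and
their population and mapper counts are invented purely for illustration; they
are not taken from the paper or from the Census Bureau, and the names
(cells, rate_per_1000, zoning_1, zoning_2) are hypothetical.

```python
# Toy illustration of the Modifiable Areal Unit Problem (zoning effect).
# All numbers below are invented for illustration; they are NOT from the paper.
cells = {
    "A": {"pop": 1_000, "mappers": 10},
    "B": {"pop": 9_000, "mappers": 10},
    "C": {"pop": 1_000, "mappers": 0},
    "D": {"pop": 9_000, "mappers": 20},
}


def rate_per_1000(zone):
    """Mappers per 1,000 residents for a zone made up of several cells."""
    pop = sum(cells[c]["pop"] for c in zone)
    mappers = sum(cells[c]["mappers"] for c in zone)
    return 1000 * mappers / pop


# Two ways of drawing boundaries around the same four cells.
zoning_1 = [("A", "B"), ("C", "D")]
zoning_2 = [("A", "C"), ("B", "D")]

rates_1 = [rate_per_1000(z) for z in zoning_1]
rates_2 = [rate_per_1000(z) for z in zoning_2]

print("zoning 1:", rates_1)
print("zoning 2:", rates_2)
```

Under the first zoning both zones come out at 2 mappers per 1,000 people.
Under the second, one zone shows 5 and the other about 1.7, even though
nothing on the ground has changed.  Only the boundaries moved.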

Based on local knowledge, you shouldn't even be pushing this study.  This
appears to be what happened.  A smart person takes a plane trip to the east
coast of the US.  That person obtains a degree at MIT and is a
Post-Doctoral Fellow--egg-head kind of smart.  The same person who
authored this study then takes a plane trip to UC Berkeley in California for
a job he landed. These two plane trips led to his fatally flawed
assumption that all counties in the rest of the US must look like MIT and UC
Berkeley and have the same high density as those two cities.

Let's take the Arizona county I map out of, Maricopa County[5], the Arizona
county just north of me, Yavapai County[6], and Switzerland[7] to see the
basic flaws in this research.  My county alone is about 58% of the size of
Switzerland.  Adding Yavapai County's area to Maricopa County's, we now have
all of Switzerland covered, yet Yavapai County adds only 228,168 people to
Maricopa County's population.  The study says that both Maricopa and Yavapai
County should perform the same.  Along the same lines, I have always heard
from Europeans that the Americans should map the same way Europe did.  The
fact that two Arizona counties swallow up all of Switzerland begins to show
why the US has a lower mapper density than Europe does.
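
For anyone who wants to check the comparison, here is a small sketch using
only the area and population figures listed in footnotes [5], [6] and [7].
The variable and function names are my own and are just for illustration.

```python
# Area (square miles) and population, copied from footnotes [5], [6], [7].
AREA_SQ_MI = {"Maricopa": 9_224, "Yavapai": 8_128, "Switzerland": 15_940}
POPULATION = {"Maricopa": 4_307_033, "Yavapai": 228_168, "Switzerland": 8_401_120}


def density(name):
    """People per square mile for one of the places above."""
    return POPULATION[name] / AREA_SQ_MI[name]


# How much of Switzerland's area Maricopa County covers on its own.
maricopa_share = AREA_SQ_MI["Maricopa"] / AREA_SQ_MI["Switzerland"]

# The two Arizona counties taken together.
combined_area = AREA_SQ_MI["Maricopa"] + AREA_SQ_MI["Yavapai"]
combined_pop = POPULATION["Maricopa"] + POPULATION["Yavapai"]

print(f"Maricopa / Switzerland area: {maricopa_share:.1%}")
print(f"Combined area: {combined_area:,} sq mi, population: {combined_pop:,}")
for place in AREA_SQ_MI:
    print(f"{place}: {density(place):.1f} people per sq mi")
```

With these figures the two counties together cover 17,352 square miles and
hold 4,535,201 people, and Maricopa County is more than ten times as dense as
Yavapai County.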

Let's compare Germany[8], the state of Montana[9] and the United
States[10].  We see that the size of Montana roughly matches the size of
Germany.  Yet the populations are roughly 82 million people in Germany
to 1 million people in Montana.  You see, there is nothing special about the
vaunted German pub meetup as a way to map.  You have the natural density
to make it happen. Moreover, you now feel the German experience should be
the same for the rest of the world and that Montana can have the same
mapping success as Germany.  In Germany mappers are a dime a dozen.   Oh
but wait!  Let's take Germany's population density and see what the US
population would have to be to have the same mapping success.
US square miles 3,796,742 / Germany square miles 137,903 = it takes
27.53197537399476 Germanys to fit into the US.
This means that to match the German population density the US would need
27.53197537399476 * 82,800,000 = 2,279,647,560.966766 people, where the
current population is 325,719,178.
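
The same arithmetic as a short sketch, using only the figures from footnotes
[8], [9] and [10].  The variable names are mine.  Rerunning it gives a US
population of roughly 2.28 billion at German density, which is about seven
times the current US population.

```python
# Figures copied from footnotes [8], [9] and [10].
US_AREA_SQ_MI = 3_796_742
US_POP = 325_719_178
GERMANY_AREA_SQ_MI = 137_903
GERMANY_POP = 82_800_000
MONTANA_AREA_SQ_MI = 147_040
MONTANA_POP = 1_050_493

# How many Germanys fit into the US by area.
germanys_in_us = US_AREA_SQ_MI / GERMANY_AREA_SQ_MI

# US population if every one of those "Germanys" were as populated as Germany.
us_pop_at_german_density = germanys_in_us * GERMANY_POP

# How far the current US population is from that figure.
shortfall_factor = us_pop_at_german_density / US_POP

# People per square mile, Germany versus Montana.
germany_density = GERMANY_POP / GERMANY_AREA_SQ_MI
montana_density = MONTANA_POP / MONTANA_AREA_SQ_MI

print(f"Germanys that fit into the US: {germanys_in_us:.2f}")
print(f"US population at German density: {us_pop_at_german_density:,.0f}")
print(f"That is {shortfall_factor:.1f}x the current US population")
print(f"Germany: {germany_density:.0f} per sq mi, Montana: {montana_density:.1f} per sq mi")
```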

The value of travel and education is that a person's understanding of the
world is expanded.  That person's view is expanded to understand that other
human beings live and work in diverse places. The failure here is that,
instead of understanding that the world is different; instead of
understanding that not everyone has high speed internet; instead of
understanding that not everyone has the same leisure time available to map;
the same tired rhetoric is repeated over and over again: that everyone should
be able to craft map their local space.  That rhetoric has overshadowed the
obvious need to change our outreach to draw more people into mapping.  It
also fails to address how OSM keeps mappers once an area is imported or craft
mapped.  That is the real problem, not imports.

Finally, I provide two more items to think about. Ben Discoe[11] keeps an
interesting metric.  Thinking about Ben's data, if the US were to try to
survey every node without an import, then it would take 31.7 years to
generate the same number of nodes created by the TIGER import. A map that
takes 31.7 years to build is not a very useful map.  In addition, even though
I am out surveying with Mapillary[12] most days, I have not covered the
entire state of Arizona.  I haven't even covered every major road. That
would still not be a very useful map.  However, I can eat my own tasty dog
food and use maps.me for all my map needs.  It is not perfect.  The map does
not have to be perfect in order to be useful.

Another approach is needed to generate more interest in OpenStreetMap.  It
is not the imports, dude.


[1] https://lists.openstreetmap.org/pipermail/talk/2017-October/079116.html

[2] https://lists.openstreetmap.org/pipermail/talk-us/2017-Octob

[3] https://en.wikipedia.org/wiki/Modifiable_areal_unit_problem

[4] https://www.census.gov/geo/reference/gtc/gtc_bg.html

[5] https://en.wikipedia.org/wiki/Maricopa_County,_Arizona
9,224 square miles
4,307,033 population

[6] https://en.wikipedia.org/wiki/Yavapai_County,_Arizona
8,128 square miles
228,168 population

[7] https://en.wikipedia.org/wiki/Switzerland
15,940 square miles
8,401,120 population

[8] https://en.wikipedia.org/wiki/Germany
137,903 square miles
82,800,000 population

[9] https://en.wikipedia.org/wiki/Montana
147,040 square miles
1,050,493 population

[10] https://en.wikipedia.org/wiki/United_States
3,796,742 square miles
325,719,178 population

[11] https://www.openstreetmap.org/user/bdiscoe/diary/44192


"What's remarkable to me, as you can see from the trendlines, is how steady
the rates are. At this rate, all of TIGER won't be cleaned up (or at least
touched) for another 31.7 years (for nodes) or 9.9 years (for ways)."

[12] https://www.mapillary.com/app/?lat=33.63299223685526&lng=-11
