<html>

  <head>

    <meta http-equiv="content-type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <p><font face="Helvetica, Arial, sans-serif">Niels et al.,</font></p>

    <p><font face="Helvetica, Arial, sans-serif">The Central Limit

        Theorem does not predict that answers to a questionnaire will

        follow a normal distribution.  Rather,<br>

      </font></p>

    <p><font face="Helvetica, Arial, sans-serif">> when independent

        random variables are added, their properly normalized sum tends

        toward a normal distribution (informally a bell curve) even if

        the original variables themselves are not normally distributed.

        The theorem is a key concept in probability theory because it

        implies that probabilistic and statistical methods that work for

        normal distributions can be applicable to many problems

        involving other types of distributions.</font></p>

    <p><font face="Helvetica, Arial, sans-serif">> For example,

        suppose that a sample is obtained containing many observations,

        each observation being randomly generated in a way that does not

        depend on the values of the other observations, and that the

        arithmetic mean of the observed values is computed. If this

        procedure is performed many times, the central limit theorem

        says that the probability distribution of the average will

        closely approximate a normal distribution.[1]</font></p>
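    <p><font face="Helvetica, Arial, sans-serif">To make the quoted
        statement concrete, here is a minimal simulation sketch.  It is
        only an illustration, not part of the survey analysis: the
        exponential distribution, the sample size of 200, the 2,000
        repetitions and all function names are assumptions chosen for
        the example.  It compares the skewness of single draws from a
        lopsided distribution with the skewness of sample averages,
        which should be much closer to the zero skewness of a bell
        curve.</font></p>

<pre>
```python
import random
import statistics


def sample_means(draw, sample_size, repetitions, seed=0):
    """Repeat the 'draw a sample, compute its mean' procedure many times."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [
        statistics.fmean(draw(rng) for _ in range(sample_size))
        for _ in range(repetitions)
    ]


def skewness(values):
    """Population skewness: 0 for a symmetric shape such as a bell curve."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean(((v - mean) / sd) ** 3 for v in values)


def draw_exponential(rng):
    """One observation from a strongly skewed (non-normal) distribution."""
    return rng.expovariate(1.0)


# Single observations keep the lopsided shape of the source distribution.
raw = sample_means(draw_exponential, 1, 2000)
# Averages of 200 independent observations are far more symmetric.
means = sample_means(draw_exponential, 200, 2000)

print("skewness of single draws:", round(skewness(raw), 2))
print("skewness of sample means:", round(skewness(means), 2))
```
</pre>

    <p><font face="Helvetica, Arial, sans-serif">Note that the sketch
        says something about the distribution of the averages, not
        about the distribution of the individual answers, which stay
        as skewed as they were.</font></p>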

    <p><font face="Helvetica, Arial, sans-serif">What this means in

        practice is that larger samples will typically yield more

        accurate estimates of the parameters one would obtain by

        conducting a census of the population.  A sample of about 1,067

        observations is the size commonly cited for a 3% margin of error

        at the 95% confidence level, and statisticians consider anything

        beyond that a "large sample".  As of 16:00 hours on 24 January,

        the survey had collected 1,575 full responses (i.e., including

        demographic data) and 2,127 total responses (i.e., counting those

        lacking some or all demographic data).  Statisticians will call

        that a "large sample", and it continues to grow.<br>

      </font></p>
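    <p><font face="Helvetica, Arial, sans-serif">For readers who want
        to check the arithmetic behind sample-size thresholds like the
        one above, here is a rough sketch using the textbook formula
        for estimating a proportion, n = z^2 * p * (1 - p) / e^2.  It
        assumes simple random sampling from a very large population and
        the worst-case proportion p = 0.5; the function names are
        hypothetical, and this is not necessarily the calculation used
        for the survey itself.</font></p>

<pre>
```python
import math
from statistics import NormalDist


def critical_value(confidence):
    """Two-sided critical value z of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def required_sample_size(margin, confidence, proportion=0.5):
    """Responses needed for a given margin of error (very large population)."""
    z = critical_value(confidence)
    return math.ceil(z ** 2 * proportion * (1 - proportion) / margin ** 2)


def margin_of_error(n, confidence, proportion=0.5):
    """Margin of error achieved by n responses under the same assumptions."""
    z = critical_value(confidence)
    return z * math.sqrt(proportion * (1 - proportion) / n)


# Sample sizes for a 3% margin at two common confidence levels.
print(required_sample_size(0.03, 0.95))
print(required_sample_size(0.03, 0.99))

# Margin of error for the 1575 full responses reported so far.
print(round(margin_of_error(1575, 0.95), 4))
print(round(margin_of_error(1575, 0.99), 4))
```
</pre>

    <p><font face="Helvetica, Arial, sans-serif">Under these
        assumptions a 3% margin at the 95% level works out to roughly
        1,067 responses, while the 99% level needs noticeably more.  A
        finite-population correction, which this sketch leaves out,
        would lower both figures somewhat.</font></p>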

    <p><font face="Helvetica, Arial, sans-serif">In an ideal world we

        would a) identify all members of the OSM global community and b)

        conduct a census of them.  This is impossible.  We thus conduct

        a survey, make it as large as possible, and advertise it widely

        so as to reach as many corners of the OSM community as

        possible.  We have translated it into 15 languages from the

        original English.  We are promoting it through all manner of

        communications channels.  If we are missing something, please

        suggest how to address it.  <br>

      </font></p>

    <p><font face="Helvetica, Arial, sans-serif">The OSM community is

        clearly not representative of the global

        population.  Since activity in the OSMverse requires computer

        literacy, the OSM community is more educated and somewhat

        wealthier than average (virtually all if not all OSMers have

        access to a computer and the internet), and based on Pascal

        Neis's OSMstat data, it is heavily biased toward Europe.[2] 

        Nonetheless, the Central Limit Theorem assures us that a large

        sample of the OSM community, if not restricted to a particular

        segment of the population, can provide estimates that should be

        close to the population's actual statistics, if they could be

        collected.<br>

      </font></p>

    <p><font face="Helvetica, Arial, sans-serif">[1]

        <a class="moz-txt-link-freetext" href="https://en.wikipedia.org/wiki/Central_limit_theorem">https://en.wikipedia.org/wiki/Central_limit_theorem</a><br>

        [2] <a class="moz-txt-link-freetext" href="https://osmstats.neis-one.org/?item=countries">https://osmstats.neis-one.org/?item=countries</a><br>

      </font></p>

    <p><font face="Helvetica, Arial, sans-serif"><br>

      </font></p>

    <p><font face="Helvetica, Arial, sans-serif"></font>

      <blockquote type="cite">From: Niels Elgaard Larsen

        <a class="moz-txt-link-rfc2396E" href="mailto:elgaard@agol.dk">&lt;elgaard@agol.dk&gt;</a><br>

        To: <a class="moz-txt-link-abbreviated" href="mailto:talk@openstreetmap.org">talk@openstreetmap.org</a><br>

        Subject: Re: [OSM-talk] Report on the OSMF 2021 Survey after One

        Week<br>

        Message-ID: <a class="moz-txt-link-rfc2396E" href="mailto:fc1ef304-c8ad-8a46-b3a1-16e3c8613b24@agol.dk">&lt;fc1ef304-c8ad-8a46-b3a1-16e3c8613b24@agol.dk&gt;</a><br>

        Content-Type: text/plain; charset=utf-8; format=flowed<br>

        <br>

        Allan Mustard:<br>

        <br>

        > Third, while indeed about one quarter of respondents

        decline to provide the optional <br>

        > demographic data, we are on track to collect enough "full"

        surveys (i.e., including <br>

        > demographic data) to surpass a 3% confidence interval at

        the 99% confidence level.[5]<br>

        <br>

        What makes you believe that the answers follow a normal

        distribution?<br>

        It will be interesting to see if it looks like a normal

        distribution.</blockquote>

      <br>

    </p>

    <div class="moz-signature">-- <br>

      -------<br>

      <i>Allan Mustard, Chairperson</i><br>

      <i>Board of Directors</i><br>

      <i>OpenStreetMap Foundation</i></div>

  </body>

</html>