[OSM-talk] Draft updated privacy policy

Simon Poole simon at poole.ch
Sat Jun 27 12:08:11 UTC 2015

On 18.06.2015 at 18:16, Greg Troxel wrote:
> Simon Poole <simon at poole.ch> writes:
>> I've produced an updated version of the OSM privacy policy:
>> http://wiki.openstreetmap.org/wiki/Updated_Privacy_Policy (the original
>> resides here: http://wiki.openstreetmap.org/wiki/Privacy_Policy).
> I have a few big-picture comments so I'm sending them to talk at .
> With respect to data obtained from the site, I think that's nominatim
> queries and also the particular areas that are looked at, possibly
> associated with IP address, and associated with a user if logged in.
> The policy doesn't address if logs are kept by IP address or by
> username, and for how long.  At first glance, I would be in favor of
> limiting log lifetime to 30 days or so, and not backing them up.
> I would for example find a (beyond-admins) heatmap of which locations
> were loaded to be overly invasive if it were more granular than 1km or
> so.
See below.

>     in support of the operation of the services from a technical,
>     security and planning point of view.
> That's fine in theory, but the question is to whom are they
> accessible/disclosed, and under what terms.  It's pretty clear you mean
> by a small subset of people working within OSM who agree not to disclose
> anything beyond counts/trends.  That's fine, but it's not what the text
> says - as long as there is a nexus to support of operations, any
> disclosure is within policy.

I'll be adding a clarification on that.
>         as anonymised, summarised data for research and other
>     purposes. Such data may be offered publicly via
>     http://planet.openstreetmap.org or other channels and used by 3rd
>     parties.
> Anonymized and summarized are different and it is very tricky to prevent
> "reidentification" of anonymized data.  So summarized, where the
> individual items no longer appear (queries/month, etc.), I have no issue
> with.  

The intention is that any published data is both summarised and
anonymised (there is no "or" in the sentence). While you are correct
that "anonymised" is a bit of a no-op in that scenario, it is there to
avoid concerns.

>          If individual addresses appear, then there's no summarization,
> and the geo nature of things means that there can be a single address in
> a region, even if there are 10K in the dataset.  What that means, I
> don't know, but it doesn't seem good to publish the list of addresses
> that people looked up.
>         to improve the OpenStreetMap dataset. For example by analysing
>     nominatim queries for missing addresses and postcodes and providing
>     such data to the OSM community.
> It may be reasonable to have on the nominatim failure page an "add this
> query to a public list of queries without an answer".  That will both
> avoid people's queries getting published when they didn't want them to,
> and also filter out some typos.   Sort of like a map note by address we
> can't geocode, rather than by coordinates.

Given that it hasn't actually been implemented I can't say for sure, but
I suspect (just as with the tile statistics) we would only publish
addresses for areas with a certain minimum density of requests per
higher-level admin entity. In other words, we wouldn't be publishing an
individual address for a city or similar.
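
As a rough illustration of that kind of threshold (nothing like this has
been implemented, so the function name, the data layout and the threshold
value are all assumptions made up for this sketch), unanswered queries
could be grouped by admin entity and only released for entities that have
seen enough requests:

```python
from collections import defaultdict


def publishable_queries(failed_queries, min_requests_per_area):
    """Return unanswered queries only for areas with enough requests.

    failed_queries: iterable of (admin_area, query) pairs (hypothetical layout).
    min_requests_per_area: hypothetical threshold; no real value is known.
    """
    by_area = defaultdict(list)
    for area, query in failed_queries:
        by_area[area].append(query)
    # Suppress every area whose total request count is below the threshold.
    return {
        area: sorted(set(queries))
        for area, queries in by_area.items()
        if len(queries) >= min_requests_per_area
    }


# Made-up sample data, purely for illustration.
sample = [
    ("Zurich", "Bahnhofstrasse 1"),
    ("Zurich", "Limmatquai 2"),
    ("Zurich", "Bahnhofstrasse 1"),
    ("Smallville", "Main Street 5"),
]
result = publishable_queries(sample, min_requests_per_area=3)
```

With the sample above, the single "Smallville" lookup is dropped and only
the "Zurich" queries remain. A real threshold would have to be far higher
and would likely need a per-address minimum as well.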

In practical terms I don't think this is an issue in any case since
individual requests via the UI will be completely drowned out by bulk
geocoding via the API (which is the main reason we want to do this in
the first place).

We are currently peaking at roughly 350 requests per second.

> With real routing, addresses that are frequent "from" values are likely
> those of OSM users; publishing that seems uncool.
>     No personal information or information that could be linked to an
>     individual will be released to third parties, except as required by law.
> That's pretty strong, given reidentification concerns.  So perhaps
> that's "other than as specified above".

It may need some tweaking wrt addresses.

>     Third party services providing content linked to via or third party
>     JavaScript files utilised by OSMF provided sites are not covered by this
>     policy and you will need to refer to the respective service providers
>     for more information. Examples of such services and content are the HOT
>     and CycleMap map layers on openstreetmap.org and the JavaScript
>     frameworks used by help.openstreetmap.org. 
> This is an interesting question.  It would be reasonable for OSMF to
> require that other entities whose content is integrated into
> openstreetmap.org have privacy policies that are consistent with OSMF's
> privacy policy.  Arguably this should be the case, as it requires some
> sophistication to know what's separate.
Outside of the scope of this document.

> The javascript frameworks are an interesting question, and I don't know
> if the question is about if the hosting provider of the js files keeps
> logs, or if the js does bad things.  I see the js code as logically part
> of the site that just happens to be on a CDN to save bandwidth.  But
> again from the user viewpoint this is part of the site.

True, but the intent of this policy is not to fix the Internet.

Naturally, Facebook, Google et al. use the information they garner from
third parties using their frameworks; it is likely one of their most
important sources of information. Luckily openstreetmap.org serves
everything it uses directly, but there are a number of ancillary sites
(help.openstreetmap.org and likely a number of the WP-based systems) that
use code hosted on external sites. The purpose of the clause is to cover
