[OSM-dev] GDPR implementation on planet.osm.org

Michael Reichert osm-ml at michreichert.de
Sat Jun 23 14:19:13 UTC 2018

Hi Roland,

On 2018-06-20 at 20:16, Roland Olbricht wrote:
> On the technical side, things are even worse. The elephant in the room
> is OAuth. OAuth is built on in particular the assumptions that
> - the consumer ("the website") acts stateful
> - sessions are relatively long-lived, i.e. some seconds to some hours
> - the identity provider has the cross-origin assets
> All three are not true for Overpass API which means that I have to work
> around OAuth or significantly mess with it.
> For example, implementing sessions on Overpass API will require
> developing a full-fledged security system to deal with the hundreds of
> potential modes of attack on session-based systems. Even if that works,
> the median runtime for a request on Overpass API is well below a second,
> and just the roundtrip times for the OAuth threesome communication sum
> up to more. We have not even started to talk about the plethora of error
> messages that need to be formulated, explained, and implemented.

You do not need the full roundtrip for each request. I have implemented
the authentication of the protected part of Geofabrik's download service
(https://osm-internal.download.geofabrik.de/). Its source code can be
found at https://github.com/geofabrik/sendfile_osm_oauth_protector

1. If a user requests a protected resource https://HOST/PATH for the
first time, he will receive the landing page containing a link to

2. If he follows this link, the web application will check if he
attached a cookie to his request. If no cookie was attached, the
application will retrieve a temporary request token from
https://www.openstreetmap.org/oauth/request_token and reply with a
redirect (302 Found) to

3. The browser will call the URL in the Location header of the response
of item 2. If the user is already logged in to OSM, he will be asked
to grant a permission to the application "Geofabrik Downloads".
Otherwise, he has to log in first.

4. If the user grants the permission and clicks on "Grant permissions",
his browser will send an HTTP POST request to
https://www.openstreetmap.org/oauth/authorize. The OSM website will
respond with code 302, pointing him to

5. The user calls (HTTP GET)
https://HOST/PATH?oauth_token_secret_encr=Y_ENCRYPTED&oauth_token=X. The
web application of the download server recognizes the URL parameters
oauth_token and oauth_token_secret. The web application retrieves a
permanent OAuth access token from the OSM API by calling
https://www.openstreetmap.org/access_token. If that works, it is able to
call https://api.openstreetmap.org/api/0.6/user/details (with the
permanent access token in the HTTP Authorization header). If this
request does not fail, the access token is valid and the web application
has ensured that the client has a valid OSM account. The web application
sets a cookie as described in
and responds with the requested resource.

The cookie contains the login status (unencrypted, unsigned), the name
of the key set which the server used to encrypt and sign the cookie,
and an encrypted and signed part consisting of the access token, the
access token secret and the expiry date (48 hours).

6. The client sends this cookie with all future requests to the server.
The server decrypts the cookie and checks the signature. If it is ok and
the expiry date has not passed yet, the request is answered immediately
without further OAuth round trips. If the expiry date has passed, the
server responds with a redirect (code 302) as described in item 2.
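
The decision the server takes on each request in items 2, 5 and 6 can be
sketched roughly as follows. This is only an illustration, not the code
of sendfile_osm_oauth_protector: the names Action and next_action are
made up here, the landing page of item 1 is left out, and
cookie_valid_until is assumed to be the expiry time taken from a cookie
whose signature has already been checked (None if there is no usable
cookie).

```python
from enum import Enum


class Action(Enum):
    SERVE = "answer the request immediately"
    EXCHANGE_TOKEN = "get an access token, check the account, set the cookie"
    REDIRECT_TO_OSM = "get a request token and reply with 302 to OSM"


def next_action(query_params, cookie_valid_until, now):
    """Pick the next step for one incoming request (hypothetical helper)."""
    # Item 6: a valid, unexpired cookie means no OAuth round trip at all.
    if cookie_valid_until is not None and now < cookie_valid_until:
        return Action.SERVE
    # Item 5: the user comes back from OSM with the token parameters.
    if "oauth_token" in query_params and "oauth_token_secret_encr" in query_params:
        return Action.EXCHANGE_TOKEN
    # Item 2: no cookie, or an expired one, so start the OAuth flow again.
    return Action.REDIRECT_TO_OSM
```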

Our solution does not need any session management on our side. The
session IDs are stored in the cookies. They are encrypted and signed to
prevent clients from manipulating them (or the expiry date).
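
To illustrate why no server-side session store is needed, here is a
minimal sketch of a self-validating cookie. It is a simplification and
not what our tool does: it only signs the cookie with an HMAC from the
Python standard library and does not encrypt it, so it must not carry
secrets such as tokens. SECRET_KEY, make_cookie and read_cookie are
names invented for this example.

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = b"example-key-change-me"  # hypothetical server-side key
LIFETIME_SECONDS = 48 * 3600           # 48 hours, as described above


def _sign(raw: bytes) -> str:
    return hmac.new(SECRET_KEY, raw, hashlib.sha256).hexdigest()


def make_cookie(payload: dict, now: float) -> str:
    """Serialise the payload plus expiry date and append a signature."""
    body = dict(payload, valid_until=now + LIFETIME_SECONDS)
    raw = base64.urlsafe_b64encode(json.dumps(body, sort_keys=True).encode())
    return raw.decode() + "." + _sign(raw)


def read_cookie(cookie: str, now: float):
    """Return the payload, or None if the cookie is forged or expired."""
    raw, sep, signature = cookie.rpartition(".")
    if not sep:
        return None
    # Check the signature before trusting anything inside the cookie.
    if not hmac.compare_digest(signature.encode(), _sign(raw.encode()).encode()):
        return None
    body = json.loads(base64.urlsafe_b64decode(raw))
    if now >= body["valid_until"]:
        return None
    return body
```

Because the expiry date is covered by the signature, a client cannot
extend it without the server noticing.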

It would be possible to avoid the round trip every 48 hours if the
server called https://api.openstreetmap.org/api/0.6/user/details again
using the permanent access token (it is in the cookie). That way, you
could recheck the validity of the OSM account every 48 hours without
involving the user. However, this feature bears a security risk. It is
implemented in my tool but we at Geofabrik decided not to use it. If a
user accidentally publishes his cookie on GitHub (e.g. by forgetting to
remove it from a curl invocation), someone else could use it forever
(until the user revokes the access token, which usually does not
happen). Instead, we require the user to re-enter his OSM account
credentials every 48 hours, so a leaked cookie expires on its own and
anyone who wants to share access on purpose would have to publish his
OSM account credentials.
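
The two policies for an expired cookie can be put side by side in a few
lines. This is a sketch with invented names: handle_expired_cookie is
not a function of my tool, and check_account stands for whatever code
sends the signed request to /api/0.6/user/details and reports whether
it succeeded.

```python
def handle_expired_cookie(access_token, revalidate, check_account):
    """Decide what to do once the expiry date in the cookie has passed.

    Returns "renew" if a fresh cookie may be issued without involving the
    user, or "reauthenticate" if the user must go through OSM again.
    """
    # Optional behaviour: silently recheck the account with the stored
    # token. A leaked cookie then stays usable until the token is revoked.
    if revalidate and check_account(access_token):
        return "renew"
    # Behaviour chosen at Geofabrik: always send the user back to OSM.
    return "reauthenticate"
```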

> On top of that, the OAuth idea means that each and every sequence of
> user data access will trigger an event on the central OSM OAuth server.
> This is quite Orwellian. Even if you do not store that information, your
> friendly agency of choice will do so on the line that connects the server.

Only the first access to a service providing data triggers an event on
the central OSM OAuth server, provided the service has its own user
management and does not recheck the permission that often. The Geofabrik
solution does not have its own user/session database and therefore
relies on many re-authentications and re-authorisations.

> Additionally, if you monitor "independent processors" so closely, it is
> questionable whether they are not seen as disguised contractors by a judge.
> I can live with the requirement to do OAuth to download diffs, although
> it is substantial effort on its own.

If the Geofabrik solution is chosen by the OSMF, you can still download
the diffs which are protected by OAuth using scripts. We have published
a Python script called oauth_cookie_client.py. It needs your OSM
username and password and does all other steps to retrieve a cookie
automatically (like JOSM's fully automatic authorisation).
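
Once a script has the cookie, using it is plain HTTP. A minimal sketch,
assuming the cookie is already available as a "name=value" string; the
function name, the URL and the cookie value below are placeholders and
not taken from oauth_cookie_client.py:

```python
import urllib.request


def build_request(url: str, cookie: str) -> urllib.request.Request:
    """Attach the previously retrieved cookie to a download request."""
    return urllib.request.Request(url, headers={"Cookie": cookie})


# Placeholder values for illustration only.
request = build_request("https://example.org/protected/changes.osc.gz",
                        "cookie_name=cookie_value")
# urllib.request.urlopen(request) would then fetch the file. Once the
# cookie has expired, the server answers with the redirect to OSM
# instead, and the script has to retrieve a new cookie.
```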


> But, please, if you want Overpass API in the future then please do not
> bind third party data delivery to OSM user accounts.

I think that you can continue delivery of personal data by Overpass API
as long as you ensure that all recipients are aware what they may do
with the data and what not.

Best regards


I prefer GPG encryption of emails. (Does not apply to mailing lists.)
