[OSM-legal-talk] Proposed "Metadata"-Guideline

Mr. Stace D Maples stacemaples at stanford.edu
Fri Oct 9 15:49:26 UTC 2015


Hello all, I’m new to this list, but I wanted to say that I am happy to see this thread of discussion here. I’ve been supporting research and teaching with geospatial tools for about 15 years (first at Yale, now at Stanford), and I’d like to chime in from that perspective, since the industry angle has been well represented in the discussion over the last few years.

The ODbL ShareAlike issue first came to my attention at the DC SotM a couple of years ago, which I was attending in support of a proposal I had put forth at Yale Medical Library to build a geocoding platform on Open Source software and Public Domain data. The platform would be used to geocode research data that contains personal health information and is therefore subject to the restrictions on handling and use imposed by HIPAA, the Health Insurance Portability and Accountability Act. That, I think, was the meeting where the sharealike issue really came to a head, and I went home convinced that I would not be successful in getting the licensing terms for OSM past an Institutional Review Board, which would squash ANY development of a project that could introduce conflicts with HIPAA restrictions on personal health information. It is ironic, in fact, that the very answer I had come up with to solve the problem of using non-transparent, proprietary software and data invoked even more serious conflicts with the law that was the impetus for the project in the first place.

Now, two years on, I have seen the lack of clarity in the ODbL and the OSM guidance on the subject create a chilling effect on the use of OSM data for academic research, particularly in the fields of public health and medicine. In several instances, researchers who could have benefitted greatly from the use of OSM, and who could have done great good in the world with it, have declined to use it because of the lack of clarity (on the part of ODbL and OSM) in defining just what constitutes a derivative database.

In two instances now, one the above-cited project and the other a project to create a mobile application to help doctors in remote areas of Bangladesh track and treat cholera outbreaks, I have seen OSM cause IRB problems and eventually seen it stripped from the projects. Additionally, I have seen other researchers decline to use OSM data due to privacy issues (in the case of a researcher who was hesitant to use OSM to geocode data received from confidential informants in Damascus, for obvious reasons), as well as the more benign (but no less problematic, in academia) issue of publishing embargoes on research. Researchers at higher ed institutions are required to publish. Publishing is a competitive game, and many researchers are hesitant to invest their time in using data that may or may not require them to release their own research before they have had a chance to publish.

I am of the opinion that the use of the ShareAlike license does little to protect OSM from use by people and organizations that are not willing to contribute back to OSM, which I suspect is the idea. What I DO see it doing is causing a chilling effect on the use of the data for legitimate research purposes, which I can’t imagine that the vast majority of OSM contributors would be opposed to. So, ideally, OpenStreetMap should actually be open for any use, but barring the dropping of sharealike, there certainly needs to be a great deal of clarification and specificity in how the clause is applied. Certainly, clearly defined examples of use cases and of the parameters of the application of sharealike would be helpful. For instance, if research using OSM is subject to sharealike, when must the data be released? Immediately, eventually, after a 3-year publishing embargo (that’s our default publishing embargo on the Stanford Digital Repository for research data)? How do you resolve a conflict between HIPAA and ODbL, when personal health information CANNOT be released under any circumstances? Is research using OSM data in Public Health and Medicine simply off limits?

Again, I’m happy to see an active discussion of these issues beginning, here, and welcome any questions the list members might have about OSM/ODbL license implications outside of commercial applications.

One other question, and I’m just curious, not trying to start a flame war. Isn’t some of the data in OSM from public domain datasets? If so, what is the OSM rationale for placing a more restrictive licensing model on that data?

Best to all, hope to hear from you soon.

In F,L&T,
Stace Maples
Geospatial Manager
Stanford Geospatial Center
@mapninja
staceymaples at G+
Skype: stacey.maples
214.641.0920
Find GeoData: https://earthworks.stanford.edu/
Get GeoHelp: https://gis.stanford.edu/

"I have a map of the United States... actual size.
It says, "Scale: 1 mile = 1 mile."
I spent last summer folding it."
-Steven Wright-

[OSM-legal-talk] Proposed "Metadata"-Guideline
Michael Steffen michael at mapbox.com
Fri Oct 2 16:46:41 UTC 2015

Simon et al -

First of all, hello! I started a few months ago as in-house counsel at
Mapbox. I come from the U.S. gov (FCC) where I did a lot of work, among
other things, on opening FCC geodata to the public. I've had to focus on
other things in my first few months, but am looking forward to finally
being able to turn more of my attention to working with this group.

As Tom mentioned, several of us at Mapbox have been digging into the
specifics of the Metadata guideline and I think something like this could
be useful in clarifying and opening up important use cases. (This is true
independently of the broader threads going on around geocoding.)

I've offered specific suggestions below, with explanatory notes.

Thanks for pushing this along Simon (and others),

-Michael


-----

> = Metadata Guideline =

> == Background ==

> Many users of OpenStreetMap data are concerned about the share alike
> implications of the ODbL when using OpenStreetMap derived data together
> with proprietary data, even with such data that is clearly outside the
> scope of the OpenStreetMap project.

> This guideline attempts to better define usage of OpenStreetMap data that
> the OSMF and the community views as acceptable without invoking the share
> alike clauses of the ODbL. This does not imply, as with all community
> guidelines, that this is the only legal way to do so, just legal usage we
> consider in line with the goals of the project.

> The ODbL defines two ways OpenStreetMap data can be utilized with third
> party data: as part of a “Collective Database” or as a “Derivative
> Database”.

> Use in a “Collective Database” does not invoke share alike, the ODbL
> requires that the individual component databases of the collective
> database are “independent” however does not further define what that
> means.

> ~~While it would seem to be simple to define “independent” as having no
> ~~reference to OpenStreetMap data, every geographic dataset can be linked
> ~~just by virtue of its location information and further it is a trivial
> ~~exercise to link two datasets isolating OpenStreetMap derived data and
> ~~references to the other dataset in just one of them, so that is likely
> ~~not a useful criteria.~~

I'd recommend deleting the paragraph above: it's unnecessary
and a bit confusing--the first two grafs amply explain why the guidance
is needed.

> == The Guideline ==

> A database containing one or more datasets derived from OpenStreetMap data
> and other sources is considered an ODbL collective database if one of the
> following conditions are fulfilled by the database elements from other
> sources:

> * the elements do not contain references to OpenStreetMap original or
> derived elements

> * the elements that contain references to OpenStreetMap elements do not
> replace or modify existing attributes or geometry of the referenced
> OpenStreetMap elements.

> For the purpose of this guideline

> * a reference can be a primary or composite database key or any other
> method of identifying a specific OpenStreetMap derived element.

> * adding additional attributes by means of such a reference is not
> considered modifying the existing attributes of the referenced
> OpenStreetMap element.

> * referring from an OpenStreetMap derived element to an element from
> another source in the database is considered equivalent to a reference in
> the other direction.

I'd add an additional bullet akin to the following:

> * technical implementations that are functionally equivalent to a primary
> or composite key reference but facilitate performance improvements -- for
> example a join of tables by a primary ID for purposes of a production
> database -- are equivalent to a reference.
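
As a rough illustration of the kind of reference these bullets describe, here is a minimal sketch in Python using the standard-library sqlite3 module. The table names (osm_restaurants, reviews), the columns, and the sample rows are all hypothetical, made up for this example; none of them come from the guideline text or from a real OSM extract. It only shows the mechanics: a third-party table points at OSM-derived rows by key, and a join reads both without rewriting any OSM attribute.

```python
import sqlite3


def build_example_db():
    """Create an in-memory database with one OSM-derived table and one
    third-party table. All names and rows are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Stand-in for data derived from OSM (hypothetical schema).
        CREATE TABLE osm_restaurants (
            osm_id  INTEGER PRIMARY KEY,
            name    TEXT,
            cuisine TEXT
        );
        -- Third-party data that only *references* OSM rows by key.
        CREATE TABLE reviews (
            review_id INTEGER PRIMARY KEY,
            osm_id    INTEGER,
            stars     INTEGER,
            body      TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO osm_restaurants VALUES (?, ?, ?)",
        [(101, "Cafe Uno", "italian"), (102, "Noodle Bar", "asian")],
    )
    conn.executemany(
        "INSERT INTO reviews (osm_id, stars, body) VALUES (?, ?, ?)",
        [(101, 5, "Great pasta"), (101, 3, "Slow service"), (102, 4, "Good broth")],
    )
    return conn


def joined_reviews(conn):
    """Join the two tables by key at query time. The join adds review
    attributes alongside the OSM row; it does not update the OSM table."""
    return conn.execute(
        """
        SELECT r.osm_id, o.name, r.stars
        FROM reviews AS r
        JOIN osm_restaurants AS o ON o.osm_id = r.osm_id
        ORDER BY r.review_id
        """
    ).fetchall()


if __name__ == "__main__":
    db = build_example_db()
    for row in joined_reviews(db):
        print(row)
```

Running the script prints one tuple per review, each carrying the referenced OSM name next to the third-party star rating, while the osm_restaurants table itself is left as loaded.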

> == Examples ==

> The following examples will demonstrate this further.

> === Examples of where you DO NOT need to share your non-OpenStreetMap data

> * You collect restaurant reviews and reference the restaurants in your
> database by OSM object id.__[^1]__ ~~(note this is technically
> defective)~~. Your restaurant reviews are not subject to sharealike.

As indicated above, I think it would be clearer to move the technical point
to a footnote, where we'd briefly explain *why* it's technically defective
to use OSM ID as a database key.

> * You generate traffic data from in-car GPS information and use the
> location information to identify roads in OSM to weight them differently
> in your routing application.

> ~~=== Examples of where you DO need to share your non-OpenStreetMap data

> ~~* you own a database of restaurant star ratings, you publish a product
> ~~that provides one dataset that uses ratings from OSM when you don’t have
> ~~it in your database and otherwise your data. The data that you publish
> ~~is subject to sharealike. Note: if you don’t use the relevant OSM
> ~~attributes and just your data, your data is not subject to sharealike as
> ~~defined in the “Horizontal Layers” guideline. Note this is a
> ~~hypothetical use case and not an actual one.~~

I recommend striking the paragraph above: This statement doesn't clearly
flow from the ODbL under all circumstances. That would also be in line with
the stated intent in the opening of the guideline: describe "usage of
OpenStreetMap data that the OSMF and the community views as acceptable
without invoking the share alike clauses of the ODbL" without implying "that
this is the only legal way to do so".

I'd also add something like the following note to the end of the guideline,
as described above:

> __[^1]OSM IDs are not stable identifiers, so this is not a recommended
> method of linking other data to OSM extracts.__

On Tue, Sep 22, 2015 at 1:55 PM, Simon Poole <simon at poole.ch> wrote:

> I've added a clarification to the example in question as it is causing
> some contention.
>
> Simon
>
> On 22.09.2015 at 22:39, Simon Poole wrote:
> >
> > On 22.09.2015 at 22:14, alyssa wright wrote:
> >> What does this mean? "uses ratings from OSM "
> >>
> > Again: it is just a hypothetical example.
> >
> > Obviously using a real life use case and declaring that as
> > non-conformant or whatever in a not yet agreed to guideline would not be
> > sensible (just imagine the outrage).
> >
> > Not to mention the ability of the OSM community to dig out many years
> > stale and obviously out of date wiki pages and to pretend that they are
> > meaningful implies that anything that we put in writing is going to be
> > quoted for the next couple of decades regardless of what guideline we
> > end up with eventually.
> >
> > Simon
> >


--
This is a private email.  Please check with me before forwarding, as it may
include information that's confidential or protected by the attorney-client
privilege.  If you feel like this email was sent to you by mistake, please
delete it and let me know. Thanks!