[OSM-legal-talk] [Talk-us] press from SOTM US

Alex Barth alex at mapbox.com
Thu Oct 25 02:46:58 GMT 2012


On Oct 23, 2012, at 5:44 AM, Frederik Ramm <frederik at remote.org> wrote:

> Hi,
> 
> On 10/23/12 01:24, Alex Barth wrote:
>> Say we clarified that geocoding a dataset with an OSM powered
>> geocoder (e.g. Nominatim) does not extend the ODbL license to such a
>> dataset. This clarification would not apply to the dataset that
>> actually powered the geocoder. So if I went and gathered improvement
>> suggestions of my users ("move the marker to the right position on
>> the map") and I added them into that OSM dataset that powers the
>> geocoder, this OSM dataset would still constitute a derivative DB.
> 
> That much is clear + agreed, but there will likely be a good business case for gathering improvement suggestions from your users and *not* adding them to the OSM dataset that powers the geocoder.
> 
> As I tried to say with the geocoder.ca example, even without actively involving the user - just recording what they typed - you can collect valuable information and improve your own geocoding database without necessarily adding the results to OSM; your collecting that information, however, can only start once you have a geocoder that basically works, i.e. OSM. So, for your particular use, the OSM geocoding capability is clearly essential - you could not start from nothing.

If I understand this case right, it wouldn't be governed by the ODbL at all either way, whether or not "substantial" is interpreted strictly. If I record my users' corrections to my OSM-powered geocoder's output and never add these corrections to my geocoder's OSM database, there would never be a derivative database, hence the ODbL's share-alike clause couldn't even begin to come into effect. As long as I don't combine datasets into the same database, I'm fine. In fact, such geocoders exist: Carmen (https://github.com/mapbox/carmen), for instance, doesn't need to mix datasets in order to leverage them all for a single query.
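
As a rough illustration of keeping the two datasets apart, here is a minimal Python sketch. It is not how Carmen is implemented; the names osm_index, corrections, record_correction and geocode are all hypothetical, and the coordinates are made up. The point is only that a single query can consult both stores while user corrections are never written into the OSM-derived one. Whether such a setup settles the licensing question is a separate matter that the code does not answer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    query: str
    lat: float
    lon: float
    source: str  # which store produced this result


# Two independent stores (both hypothetical). Nothing is ever copied
# from one into the other.
osm_index = {"1 main st": (38.9000, -77.0300)}  # stands in for the OSM-derived database
corrections = {}  # user-supplied fixes, kept separately


def record_correction(query, lat, lon):
    """Store a user's corrected position without touching osm_index."""
    corrections[query.lower()] = (lat, lon)


def geocode(query):
    """Consult both stores at query time; user corrections are listed first."""
    key = query.lower()
    hits = []
    if key in corrections:
        lat, lon = corrections[key]
        hits.append(Hit(key, lat, lon, "corrections"))
    if key in osm_index:
        lat, lon = osm_index[key]
        hits.append(Hit(key, lat, lon, "osm"))
    return hits
```

Calling record_correction("1 Main St", 38.9001, -77.0301) and then geocode("1 Main St") returns one hit from each store, and osm_index is left unchanged.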

Now, the story is different for the dataset I'm geocoding. Of course, the result of, say, an OSM+PD-powered geocoder would continue to raise the grey-area questions that I'd love to clear up...

> 
> It is *possible* that something is essential and insubstantial at the same time, but it does sound a bit strange.
> 
> Another question that we could ask to enlighten us is: What do commercial geocoding providers usually allow you to do once you have paid them? When you geocode a dataset with TomTom data and you pay them for that, do TomTom then still claim any rights about your resulting database, or do they say, like you sketched above, that "their license does not extend to the geocoded dataset"?

Interesting question. Not sure what the pricing is, but I'm sure you can get commercial datasets for geocoding with no strings attached. Right now you can't get a no-strings-attached geocoding guarantee with OSM at all, and that's what I'm worried about.

> 
> I think that nobody in OSM actually expects users to share their customer or patient record just because it has been geocoded with OSM. But there is likely an expectation that any data someone might have that can help to *improve* geocoding should be shared back.
> 
> During the license change discussion, my position was often this: Instead of trying to codify everything in watertight legalese, let's just make the data PD and write a human-readable "moral contract" that lists things we *expect* users to do, but don't *enforce*. - Maybe the same can be done with geocoding; we could agree on making no legal request for opening up any geodata, but at the same time make it very clear that we would consider it shameful for someone to exploit this in order to build any kind of "improved geocoding" without sharing back.

I would welcome such an approach. Not sure about the shaming part, I like encouragement better… In my life as an open source contributor I have never seen good contributions come from enforced rules, only from inspired and driven community members - individuals, orgs, companies. Share-alike is what has us having these mind-breaking conversations :)

> 
> (In today's world, a press release that goes "The OSMF regrets to see company X violating OSM's moral code by doing Y" can be more powerful than legal threats anyway.)
> 
>> In my mind there's much to be gained by giving
>> better incentives to contribute to OSM by clarifying the geocoding
>> situation and little to be lost by allowing narrow extracts of OSM.
> 
> The whole share-alike thing is about striking a balance between exposure (surely a public domain release would give us maximum exposure) and incentive to contribute back (the more open your license, the less you force people to contribute back).

Yes, yes, yes. And again, the contributions we're looking for are improvements to the data. I truly believe that if we manage to clarify the geocoding situation, we'll create a very important incentive for improving the map.

> 
> Addresses seem to be a valuable part of OSM data. Could an extract of potentially millions of addresses really be "narrow"?

At the same time addresses are lagging in comparison to other data. Wouldn't we create a strong incentive for adding more addresses by clarifying geocoding?

> 
>> I
>> believe we can do this within the letter of the ODbL and within the
>> spirit of why the ODbL was adopted.
> 
> I think that to preserve the spirit we would have to find a way of saying "you can use our data to geocode your patient database and we don't want any of your patient data in return" while at the same time saying "if you devise anything to improve this geocoding on your side, or have additional data that can help improving the geocoding, then that falls under ODbL". I don't know if this can be done within the wriggle room that ODbL affords us but it is worth a try.

Right, so your OSM-derived database that powers your geocoder is governed by the ODbL (just like any other derivative DB), but the database you're geocoding isn't.

> 
>> BTW, I don't want to know how many people out there have used
>> Nominatim for geocoding without having any idea...
> 
> Any JSON or XML result from Nominatim contains the explicit "Data (c) OpenStreetMap contributors, ODbL 1.0" and even the URL of the copyright page.

Heh, I should have looked more closely :)

> I'm sure people are using that without having any idea - but that's the same everywhere. I have personally spoken to lots of people who casually said things like "then we ran that through Google's geocoding..."; OSM has even been offered, on several occasions, "donated" POI data where it later turned out that they had not surveyed the POI locations but just ran their addresses by a commercial geocoder and disregarded the license restrictions.
> 
> Bye
> Frederik
> 
> -- 
> Frederik Ramm  ##  eMail frederik at remote.org  ##  N49°00'09" E008°23'33"

Alex Barth
http://twitter.com/lxbarth
tel (+1) 202 250 3633






