[OSM-legal-talk] seeking understanding of usage of geocoding and POI

Frederik Ramm frederik at remote.org
Fri Jun 10 20:22:41 UTC 2016


You have a lot of questions, and if you want to base business
decisions on the answers, you should really, really discuss them with a lawyer.
What you get on this list is layman opinions. I am happy to offer you
mine but I must stress that this is purely my private opinion - I am
currently a member of the OSMF board of directors but that doesn't make
my opinion any more right.

On 06/10/2016 08:53 PM, Karel Mikan wrote:
> *Q1.) *if we just use the map tiles (CC-BY-SA) and display purely our
> OWN POI data on it, we do not trigger share alike for our own data.


> If I understood correctly the distinction between tiles and the
> remaining data, it will be only ODbL beyond this point.

The ODbL has an effect on the tiles; they are a "produced work" and the
ODbL mandates that tile users be made aware that the underlying data
comes from OSM. That the tiles are CC-BY-SA is a feature of the
particular OSM tile server; if you had your own tile server, you could
choose a different license for the tiles.

> *Q2.) The user inputs point of his visit*
> - as an address text that is then
> - converted to coordinates with Nominatim
> - and saved under a visit number (primary key) in our DB.

This makes the list of coordinates a derived database, and whether or
not it must be shared will depend on whether it is publicly used.

>     *Q2a)* If we save both, coordinates and address, this becomes a
>     _derivative database_, because we, in theory, with enough points,
>     could re-engineer the whole OSM DB(Geocoding Guideline).

I'd say only the coordinate part is derived, not the address.

>     So it must
>     be shared (address, coordinates, visit primary key as well). ODbL
>     4.6.b and 1.0 "collective database" seems to imply that even the
>     visit number must be shared, because the key is not independent by
>     itself (it was generated because of the coordinates)?
>         *Q2a1)* What if we then, based on the user location, save also
>         the OSM POI IDs in the area under the same visit number
>         (different DB)?

The database with the OSM POI IDs is a derived database. Whether or not
it must be shared will depend on whether it is publicly used.

>         We assume that the derivative database continues to apply
>         because of ODbL 4.6.b "additional content". We need to share
>         this DB as well (basically all the information that is saved
>         with this visit- because the starting points were based on OSM
>         geocoding)? 

As a rule of thumb, anything you couldn't have done without OSM is very
likely to be somehow derived.

>     *Q2b - alternative to a)* If we just save coordinates from the
>     address (not the address itself), does it change anything? We now
>     cannot rebuild the geocoding DB anymore, but the coordinates still
>     come from Nominatim. Do we then still have a _derivative DB_

I think so, yes.

>     *Q2c - alternative to a)* What is the situation, if the customer
>     inputs an address and then has an option to confirm that the
>     coordinates are correct? If he confirms without changes, we save
>     only the address (NOT the coordinates from Nominatim). This should
>     be collective DB, as we save no OSM data. (when we display we then
>     call on Nominatim to give us the coordinates to display the address,
>     but we never save the coordinates)?

IMO this is a derived database as well. Imagine the following
hypothetical use case: You employ a thousand monkeys with typewriters
and let them type out addresses. The street names are then geocoded with
Nominatim and human visitors to your web site look at the results and
say yes or no. Of course the database of resulting "yes" addresses is
derived from OSM, because without OSM it would only have been monkey
rubbish ;)

Also note that if you employ an OSM map during the "user verification"
process this is another reason why your resulting database is derived
from OSM.

>         So for not having all data falling under ODbL, would a separate
>         DB of addresses and coordinates and corrected coordinates allow
>         us to not share the rest of the visit (in case the visit is
>         above answered as collaborative DB)?

I think that if you had a database of addresses and coordinates and
corrected coordinates and put that under ODbL then you'd be fine; the
rest of the visit data should not fall under ODbL.

But be aware that you seem to equate "fall under ODbL" with "have to
share" here, something that is only true if you use it publicly. If
you're running your platform as a kind of data processing service for
your users in the context of some contract - e.g. producing travel logs
for a corporate fleet - then you're not using the data publicly.

>     *Q2d)* If the customer inputs address and coordinate on the map
>     himself by clicking on the map. that would be pure collective DB (we
>     save nothing from OSM). 

Unless the map is made from OSM data, of course, in which case you'd
again have a derived database. (Rule of thumb: Where would you be
without OSM?)

>     *Q3)* if we were to obscure the starting point for privacy security
>     reasons and randomly add a few mmm to the coordinate, would we still
>     need to share the underlying "real point" (might potentially be
>     answered by Q2 answers if using Nominatim as a start is already a
>     derivative DB)?

You'd always have to share the data that you use publicly. If you add
random noise to the data before using it publicly, then the random-noise
data is the bit you have to share; the fact that there might be myriad
purely internal steps that happened before arriving at the data that is
used publicly doesn't matter.
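
Purely to illustrate what such a noise step might look like (a rough
sketch, not legal advice and not anything prescribed by the ODbL): the
function name, the offset parameter and the metres-per-degree constant
below are all assumptions made for this example. It moves a point by a
random distance of at most max_offset_m in a random direction; the
jittered output is what would be used publicly.

```python
import math
import random

# Rough average length of one degree of latitude in metres.
# This is an approximation chosen for the sketch, not an exact geodesic value.
METERS_PER_DEGREE_LAT = 111_320.0


def jitter_coordinate(lat, lon, max_offset_m, rng=None):
    """Return (lat, lon) moved by a random offset of at most max_offset_m metres.

    Hypothetical helper for illustration only. It uses a flat-earth
    approximation, which is reasonable for small offsets away from the poles.
    """
    rng = rng or random.Random()
    # The square root makes the points uniform over the disc's area
    # instead of clustering near the centre.
    distance = max_offset_m * math.sqrt(rng.random())
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    d_north = distance * math.cos(bearing)
    d_east = distance * math.sin(bearing)
    new_lat = lat + d_north / METERS_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    new_lon = lon + d_east / (METERS_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return new_lat, new_lon
```

For example, jitter_coordinate(49.0025, 8.3925, 50.0) returns a point
within roughly 50 metres of the input.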

>     *Q4) *when developing our own tagging, (linked to OSM POI IDs) is
>     that a  _derivative_ because of ODbL 4.4.b (are all tags of POIs a
>     substantial part of the OSM DB and we look at them prior developing
>     our own) and 4.6b (any additional content must be shared)?

I'm a bit unsure here, suggest you review the "horizontal layers"

Frankly, your further questions sound *so* much like you're looking for
a way to share the absolute minimum possible that I'm not comfortable
discussing this further. If keeping data proprietary for financial gain
is part of your business model, you should really just look into working
with proprietary data to start with, rather than trying to create an
"OSM++" that you don't have to share - even *if* you find suitable
loopholes in the license that make this legal.


Frederik Ramm  ##  eMail frederik at remote.org  ##  N49°00'09" E008°23'33"
