[OSM-legal-talk] ODbL and publishing source data

Jonathan Harley jon at spiffymap.net
Wed Nov 30 14:18:37 GMT 2011

On 30/11/11 06:02, James Livingston wrote:
> On 30 November 2011 01:03, Jonathan Harley <jon at spiffymap.net> wrote:
>     On 28/11/11 23:59, James Livingston wrote:
>         Depending on the rendering, it may not be the same. The
>         placements of name text can depend on other data so it's not
>         on top of something else, or POIs can be hidden if there are
>         too many in a given area.
>         In the first case (or combining layers in the browser), the
>         rendering of OSM data cannot depend on the location of your
>         hotels, and the rendering of hotel names can't easily depend
>         on what else is on the map. In the second case (combining data
>         before rendering) collisions can be avoided or the resulting
>         map altered.
>     Yes, but it's only the produced work, the rendering, which is
>     altered. You probably don't need to make changes to the OSM data
>     to achieve this.
>     So the OSM data and other data could remain independent. If they
>     do, then the mechanism for combining (and computer/s on which it
>     happens) is indeed irrelevant.
> While that's true, you are combining the two data sources together prior 
> to rendering.
> Say I created some local postgres database tables and loaded OSM data 
> into it, and then loaded data from another source into it too. What I 
> have is now a database derived from both OSM and the other source. If 
> I then rendered that data to create a Produced Work, would my combined 
> database not be a Derived Database?

No, not if the data remained unmodified and independent. If data were 
changed - for instance duplicates deleted, or names or co-ordinates 
changed because one source was considered more accurate than the other - 
then yes, that's a derivative database. But if not, it's a collective 
database. Whether they're in the same instance of postgres or not is 
irrelevant, because software is not a database for the purposes of 
database law (at least in the EU). Only the data itself constitutes a 
database.

By way of analogy: suppose I sent you a private email which included a 
license saying if you publicly use my email, you must share with me any 
other emails you combine it with. My email sits in your inbox together 
with other emails, and you can do searches across all of them. If it's a 
unix system, they're probably all in one single file. But have you 
really "combined" my email with your other private emails? My email is 
sitting there unmodified and completely independent of all your other 
private emails; it is not itself combined with them. So no, "storing 
next to" is not "combining". You can safely share a screenshot of your 
mail program without having to send me all your other private emails.
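
The "storing next to" versus "combining" distinction can be sketched in 
code. This is only an illustration under assumptions: the table names 
osm_poi and hotels, the sample rows, and the fingerprint helper are all 
hypothetical, and SQLite stands in for postgres so the example is 
self-contained. It shows how one might check whether one dataset was 
actually modified; it does not settle what the law treats as a derivative 
database.

```python
import hashlib
import sqlite3


def fingerprint(conn, table):
    """Hash every row of a table, in id order, to detect any modification."""
    digest = hashlib.sha256()
    for row in conn.execute(f"SELECT * FROM {table} ORDER BY id"):
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


# One database instance holding two datasets side by side (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE osm_poi (id INTEGER PRIMARY KEY, name TEXT, lat REAL, lon REAL)")
conn.execute("CREATE TABLE hotels (id INTEGER PRIMARY KEY, name TEXT, lat REAL, lon REAL)")
conn.executemany(
    "INSERT INTO osm_poi VALUES (?, ?, ?, ?)",
    [(1, "Grand Hotel", 52.40, -1.51), (2, "Town Hall", 52.41, -1.50)],
)
conn.executemany(
    "INSERT INTO hotels VALUES (?, ?, ?, ?)",
    [(1, "Grand Hotel", 52.40, -1.51), (2, "Lakeside Inn", 52.42, -1.52)],
)

before = fingerprint(conn, "osm_poi")

# Case 1: read across both tables, as a renderer would. Nothing is changed.
rows = conn.execute(
    "SELECT name FROM osm_poi UNION ALL SELECT name FROM hotels"
).fetchall()
unchanged_after_read = fingerprint(conn, "osm_poi") == before

# Case 2: delete OSM rows that duplicate the other source. The OSM data is
# now altered in light of the other dataset.
conn.execute("DELETE FROM osm_poi WHERE name IN (SELECT name FROM hotels)")
unchanged_after_dedupe = fingerprint(conn, "osm_poi") == before
```

Here unchanged_after_read is True and unchanged_after_dedupe is False: 
querying across both tables leaves the OSM rows untouched, while deleting 
duplicates is the kind of change described above as making a derivative 
database.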

>         This was discussed on legal-talk a few months ago, and my
>         opinion was that it depended on whether you could produce the
>         same output by merging separately-rendered Produced works. If
>         you can get _identical_ output by merging layers on the
>         browser side, then it's okay to do the merging on the server
>         side. However if you can't get identical results by merging
>         the rendered output, then you've obviously combined the
>         databases prior to rendering.
>     Not necessarily. For example, the rendering might depend on what
>     order data is rendered. But the data being rendered would remain
>     independent of each other; it may be only the rendered result
>     which varied. And that's a produced work, not a database.
> Can you get the same result by rendering the first dataset (creating a 
> Produced Work), rendering the second dataset (creating another 
> produced work, if it's ODbL too) and then combining the output? If so, 
> they're definitely independent. You can render the second dataset 
> first if you like provided you combine them in the right order.
> If the rendering of the second output depends on the first dataset, 
> the Produced Work created from the second dataset is not independent 
> of the first dataset.

No, the produced work isn't independent of it, but the datasets are 
still independent of each other, that's my point.
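
A toy sketch of that point, with every name and value hypothetical: a 
one-row "map" where roads are drawn as '=' and hotel labels as '*'. 
Rendering the labels with knowledge of the roads (a crude stand-in for 
label collision avoidance) gives a different picture from layering two 
separate renderings, yet neither input dataset is modified.

```python
WIDTH = 12  # width of the toy one-row canvas


def render_roads(roads):
    """Render each road cell as '=' on a blank canvas."""
    canvas = [" "] * WIDTH
    for x in roads:
        canvas[x] = "="
    return canvas


def render_labels(labels, blocked=frozenset()):
    """Render each label as '*', shifting right past any blocked cell."""
    canvas = [" "] * WIDTH
    for x in labels:
        while x in blocked and x < WIDTH - 1:
            x += 1
        canvas[x] = "*"
    return canvas


def composite(bottom, top):
    """Overlay two canvases; non-blank cells of the top layer win."""
    return [t if t != " " else b for b, t in zip(bottom, top)]


# Two independent datasets, held as immutable tuples.
roads = (2, 3, 4)
hotels = (3, 8)

# Layers rendered separately, then merged (as a browser might do).
layered = composite(render_roads(roads), render_labels(hotels))

# Labels rendered with knowledge of the roads, so they avoid collisions.
joint = composite(
    render_roads(roads), render_labels(hotels, blocked=frozenset(roads))
)
```

The two outputs differ (the first label lands on the road in layered but 
is moved clear of it in joint), while roads and hotels are exactly what 
they were before rendering.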

> I guess it's possible the rendering algorithm for the second dataset 
> could use the Produced Work from the first rather than the first 
> dataset directly, which may be okay except that it's arguable whether 
> you are reverse engineering part of the first database.

Yes, I think that point's arguable - if the combined produced work is 
based on two previous renderings from the two datasets, you might be 
able to argue that those renderings are themselves databases, 
particularly if they use vectors - though it's a very murky area, 
because rendering usually involves throwing away lots of information. 
(For example, suppose two roads meet at a T junction, and I render them 
both in the same line style without names; you can no longer tell their 
IDs, tags, or even how many ways in OSM make up that rendered T shape.)

>         Having two instances of say Postgres and having one program
>         that reads both and renders is still creating a derived
>         database, even if it is only in the memory of the rendering
>         program.
>     It might create a derivative database, or it might not; it would
>     depend on the algorithm. If the OSM data remain unmodified, then
>     it could be creating a collective database, which is explicitly
>     not a derivative database.
> I guess that's the question: if you write a program that reads data 
> from two sources and uses both to produce its output, are the 
> temporary in-memory data structures considered derived or collective 
> for the purposes of copyright and database right law?

Yes, I've puzzled over that one too. If in-memory structures count as 
derivative databases, then that would be the one you would have to ask 
for under ODbL, which only requires licensees to release the last in any 
chain of derivative databases. It would make that whole side of ODbL 
pretty much impossible to enforce if so.

> The answer probably depends on how the program is implemented, but 
> given that we won't know the implementation, how can we ever determine 
> whether someone's Produced Work requires them to release their 
> database? If we say we can't determine that, aren't we essentially 
> saying that it's impossible to enforce that part of the ODbL?

Even the attribution part of ODbL isn't necessarily easy to enforce - I 
suspect that the more complete OSM gets, the more difficult it will be 
sometimes to tell that a map with no attribution used OSM data. And yes, 
the share-alike part is much harder to enforce than that. It's not so 
gloomy really, though - in some cases we will know the implementation 
because it'll be open source, and most companies are professional and 
consider obeying the law pretty important.


Jonathan Harley    :     Managing Director     :     SpiffyMap Ltd

Email: md at spiffymap.com   Phone: 0845 313 8457   www.spiffymap.com
Post: The Venture Centre, Sir William Lyons Road, Coventry CV4 7EZ
