[OSM-legal-talk] ODbL and publishing source data
jon at spiffymap.net
Wed Nov 30 14:18:37 GMT 2011
On 30/11/11 06:02, James Livingston wrote:
> On 30 November 2011 01:03, Jonathan Harley <jon at spiffymap.net> wrote:
> On 28/11/11 23:59, James Livingston wrote:
> Depending on the rendering, it may not be the same. The
> placements of name text can depend on other data so it's not
> on top of something else, or POIs can be hidden if there are
> too many in a given area.
> In the first case (or combining layers in the browser), the
> rendering of OSM data cannot depend on the location of your
> hotels, and the rendering of hotel names can't easily depend
> on what else is on the map. In the second case (combining data
> before rendering) collisions can be avoided or the resulting
> map altered.
> Yes, but it's only the produced work, the rendering, which is
> altered. You probably don't need to make changes to the OSM data
> to achieve this.
> So the OSM data and other data could remain independent. If they
> do, then the mechanism for combining (and computer/s on which it
> happens) is indeed irrelevant.
> While that's true, you are combining the two data sources together prior
> to rendering.
> Say I created some local postgres database tables and loaded OSM data
> into it, and then loaded data from another source into it too. What I
> have is now a database derived from both OSM and the other source. If
> I then rendered that data to create a Produced Work, would my combined
> database not be a Derived Database?
No, not if the data remained unmodified and independent. If data were
changed - for instance duplicates deleted, or names or co-ordinates
changed because one source was considered more accurate than the other -
then yes, that's a derivative database. But if not, it's a collective
database. Whether they're in the same instance of postgres or not is
irrelevant, because software is not a database for the purposes of
database law (at least in the EU). Only the data itself constitutes a
database.
By way of analogy: suppose I sent you a private email which included a
license saying if you publicly use my email, you must share with me any
other emails you combine it with. My email sits in your inbox together
with other emails, and you can do searches across all of them. If it's a
unix system, they're probably all in one single file. But have you
really "combined" my email with your other private emails? My email is
sitting there unmodified and completely independent of all your other
private emails; it is not itself combined with them. So no, "storing
next to" is not "combining". You can safely share a screenshot of your
mail program without having to send me all your other private emails.
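To make the distinction above concrete, here is a minimal sketch in Python using an in-memory SQLite database in place of postgres. Everything in it is an assumption for illustration: the table names `osm_poi` and `hotels`, the sample rows, and the helper functions are all made up. It is not a legal test; it only shows the technical difference between reading two datasets that sit side by side and changing one of them on the basis of the other.

```python
import hashlib
import sqlite3


def fingerprint(conn, table):
    """Hash every row of a table so that any change to its data is detectable."""
    rows = conn.execute(f"SELECT * FROM {table} ORDER BY 1").fetchall()
    return hashlib.sha256(repr(rows).encode("utf-8")).hexdigest()


def load(conn):
    """Load two sources side by side in one database instance (made-up rows)."""
    conn.execute("CREATE TABLE osm_poi (id INTEGER PRIMARY KEY, name TEXT, lat REAL, lon REAL)")
    conn.execute("CREATE TABLE hotels (id INTEGER PRIMARY KEY, name TEXT, lat REAL, lon REAL)")
    conn.executemany("INSERT INTO osm_poi VALUES (?, ?, ?, ?)",
                     [(1, "Station", 52.40, -1.51), (2, "Grand Hotel", 52.41, -1.52)])
    conn.executemany("INSERT INTO hotels VALUES (?, ?, ?, ?)",
                     [(10, "Grand Hotel", 52.41, -1.52), (11, "Park Inn", 52.42, -1.50)])


def render(conn):
    """Read both tables and produce a combined output without modifying either."""
    rows = conn.execute(
        "SELECT name, lat, lon FROM osm_poi UNION ALL SELECT name, lat, lon FROM hotels"
    ).fetchall()
    return sorted(rows)


def dedupe_osm(conn):
    """Delete OSM rows that duplicate a hotel: this changes the OSM data itself."""
    conn.execute("DELETE FROM osm_poi WHERE name IN (SELECT name FROM hotels)")


conn = sqlite3.connect(":memory:")
load(conn)
before = fingerprint(conn, "osm_poi")

# Case 1: render from both sources; the OSM table is only read.
labels = render(conn)
unchanged_after_render = fingerprint(conn, "osm_poi") == before

# Case 2: remove duplicates using the other source; the OSM table is altered.
dedupe_osm(conn)
unchanged_after_dedupe = fingerprint(conn, "osm_poi") == before

print(unchanged_after_render, unchanged_after_dedupe)
```

In the first case the OSM rows are byte-for-byte what was loaded, which is the "stored next to" situation; in the second the OSM data has been altered using the other source, which is the case described above as a derivative database.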
> This was discussed on legal-talk a few months ago, and my
> opinion was that it depended on whether you could produce the
> same output by merging separately-rendered Produced works. If
> you can get _identical_ output by merging layers on the
> browser side, then it's okay to do the merging on the server
> side. However if you can't get identical results by merging
> the rendered output, then you've obviously combined the
> databases prior to rendering.
> Not necessarily. For example, the rendering might depend on what
> order data is rendered. But the data being rendered would remain
> independent of each other; it may be only the rendered result
> which varied. And that's a produced work, not a database.
> Can you get the same result by rendering the first dataset (creating a
> Produced Work), rendering the second dataset (creating another
> produced work, if it's ODbL too) and then combining the output? If so,
> they're definitely independent. You can render the second dataset
> first if you like provided you combine them in the right order.
> If the rendering of the second output depends on the first dataset,
> the Produced Work created from the second dataset is not independent
> of the first dataset.
No, the produced work isn't independent of it, but the datasets are
still independent of each other, that's my point.
> I guess it's possible the rendering algorithm for the second dataset
> could use the Produced Work from the first rather than the first
> dataset directly, which may be okay except that it's arguable whether
> you are reverse engineering part of the first database.
Yes, I think that point's arguable - if the combined produced work is
based on two previous renderings from the two datasets, you might be
able to argue that those renderings are themselves databases,
particularly if they use vectors - though it's a very murky area,
because rendering usually involves throwing away lots of information.
(For example, suppose two roads meet at a T junction, and I render them
both in the same line style without names; you can no longer tell their
IDs, tags, or even how many ways in OSM make up that rendered T shape.)
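The T-junction example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the way IDs, tags, and coordinates are invented, the data model is a simplified stand-in for OSM ways, and the toy `rasterise` function handles only horizontal and vertical segments on an integer grid. It shows two different sets of ways producing exactly the same rendered shape.

```python
def rasterise(ways):
    """Render ways to a set of pixels in one line style, ignoring IDs, tags and names."""
    pixels = set()
    for way in ways:
        nodes = way["nodes"]
        for (x0, y0), (x1, y1) in zip(nodes, nodes[1:]):
            if x0 == x1:
                # Vertical segment
                for y in range(min(y0, y1), max(y0, y1) + 1):
                    pixels.add((x0, y))
            else:
                # Horizontal segment (this toy renderer assumes y0 == y1 here)
                for x in range(min(x0, x1), max(x0, x1) + 1):
                    pixels.add((x, y0))
    return pixels


# The same T shape mapped as two ways (hypothetical IDs and tags)...
two_ways = [
    {"id": 101, "tags": {"highway": "residential"}, "nodes": [(0, 0), (4, 0)]},
    {"id": 102, "tags": {"highway": "service"}, "nodes": [(2, 0), (2, 3)]},
]

# ...or as three ways, with the top bar split at the junction.
three_ways = [
    {"id": 201, "tags": {"highway": "tertiary"}, "nodes": [(0, 0), (2, 0)]},
    {"id": 202, "tags": {"highway": "tertiary"}, "nodes": [(2, 0), (4, 0)]},
    {"id": 203, "tags": {"highway": "footway"}, "nodes": [(2, 0), (2, 3)]},
]

same_image = rasterise(two_ways) == rasterise(three_ways)
print(same_image)
```

Given only the pixel set, there is no way to tell which of the two inputs produced it, which is why treating a rendering as a recoverable copy of the database is murky.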
> Having two instances of say Postgres and having one program
> that reads both and renders is still creating a derived
> database, even if it is only in the memory of the rendering
> It might create a derivative database, or it might not; it would
> depend on the algorithm. If the OSM data remain unmodified, then
> it could be creating a collective database, which is explicitly
> not a derivative database.
> I guess that's the question: if you write a program that reads data
> from two sources and uses both to produce its output, are the
> temporary in-memory data structures considered derived or collective
> for the purposes of copyright and database right law?
Yes, I've puzzled over that one too. If in-memory structures count as
derivative databases, then that would be the one you would have to ask
for under ODbL, which only requires licensees to release the last in any
chain of derivative databases. It would make that whole side of ODbL
pretty much impossible to enforce if so.
> The answer probably depends on how the program is implemented, but
> given that we won't know the implementation, how can we ever determine
> whether someone's Produced Work requires them to release their
> database? If we say we can't determine that, aren't we essentially
> saying that it's impossible to enforce that part of the ODbL?
Even the attribution part of ODbL isn't necessarily easy to enforce - I
suspect that the more complete OSM gets, the more difficult it will be
sometimes to tell that a map with no attribution used OSM data. And yes,
the share-alike part is much harder to enforce than that. It's not so
gloomy really, though - in some cases we will know the implementation
because it'll be open source, and most companies are professional and
consider obeying the law pretty important.
Jonathan Harley : Managing Director : SpiffyMap Ltd
Email: md at spiffymap.com Phone: 0845 313 8457 www.spiffymap.com
Post: The Venture Centre, Sir William Lyons Road, Coventry CV4 7EZ