[OSM-talk] [OSM-legal-talk] licence change w/o data loss + better version control + more quality with flagged revisions

ce-test, qualified testing bv - Gert Gremmen g.gremmen at cetest.nl
Fri Dec 17 15:08:45 GMT 2010


I would greatly encourage this.
I would also like it if it could show the results
if one license (ODbL) or another (PD) were accepted and incompatible data removed,
partially or as a whole, depending on whether the author has accepted
the license / contributor terms.

It would show the community the consequences of a thoughtless decision,
be it CC-BY-SA, PD or ODbL,
especially now that an important part of the community is pushing towards a choice.

Gert Gremmen
-----------------------------------------------------

Openstreetmap.nl  (alias: cetest)

-----Original message-----
From: legal-talk-bounces at openstreetmap.org [mailto:legal-talk-bounces at openstreetmap.org] On behalf of Heiko Jacobs
Sent: Friday, December 17, 2010 12:14 AM
To: legal-talk at openstreetmap.org
CC: talk at openstreetmap.org
Subject: [OSM-legal-talk] licence change w/o data loss + better version control + more quality with flagged revisions

Hello

I want to add some more remarks on the licence change,
combined with some ideas on quality management and version control.
They might also be interesting in combination with a bachelor thesis
http://forum.openstreetmap.org/viewtopic.php?id=9817 (in German!)
or in combination with the rough overview of the state of the licence change
http://osm.informatik.uni-leipzig.de/map/?layers=B0

For all of this you need
- decisions on how to manage the licence change and
- tools for this.

For the latter point we need two tools:
- one to get complete version control of objects
- one to decide the licence state of an object

Is anyone in the world already working on such tools? I have not heard of any.

The first tool is needed because of the splitting and joining of ways.
When a way is split, a new way appears with a new history and a
new user, whether or not this user did anything to the way (besides
splitting it). When ways are joined, the history of one way is lost.

Getting a complete history is very awkward today. You might get
it by viewing all nodes, but this may get complicated ...
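
As a rough illustration of what such a history tool might do (only a
sketch, not an existing OSM tool): assume the way versions are already
available as simple records. The WayVersion type, the function name and
the "at least two shared nodes" heuristic are all my own assumptions.
The sketch treats any earlier version of another way that shares a
segment with the first version of the way in question as an ancestor,
and collects its users, too.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WayVersion:
    """Hypothetical record of one version of a way (not an OSM API type)."""
    way_id: int
    version: int
    user: str
    timestamp: int      # any comparable time value, e.g. seconds since epoch
    nodes: tuple        # node ids in order


def contributors_including_ancestors(way_id, versions, min_shared=2):
    """Collect users of a way plus users of earlier ways it may stem from.

    Heuristic (an assumption): an older version of another way that shares
    at least `min_shared` nodes with the first version of `way_id` is
    treated as a predecessor that was split or joined.
    """
    own = [v for v in versions if v.way_id == way_id]
    if not own:
        return set()
    first = min(own, key=lambda v: v.timestamp)
    users = {v.user for v in own}
    first_nodes = set(first.nodes)
    for v in versions:
        # Only strictly older versions of other ways can be ancestors.
        if v.way_id == way_id or v.timestamp >= first.timestamp:
            continue
        if len(first_nodes & set(v.nodes)) >= min_shared:
            users.add(v.user)
    return users
```

For a way 2 that was split off from way 1, this would return the
splitting user and also the user who originally drew way 1.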

Such a tool would already be useful today for some questions,
e.g. since when has something been changed on an object, and who did it?
So it would be nice to get such a tool
- independent of the licence change
- not only for getting the old history for the licence change
- for future use, too

For the licence change, a stand-alone tool may be good enough,
but then you will get no intermediate states of the licence change's progress.

For the latter, the API has to learn to do it, if the editing tool
doesn't give hints ...


Now we have the first tool ;-)
Next point: licence change

Who needs the new licence?

The mapping-only user does not really, because he is already mapping
under CC and this seems good enough for him ...

Most data users do not really either, because most things,
like slippy maps online and on Garmins, already work with the old licence.

But there is a circle of advanced users of OSM data who want to
put the data into special tools, maybe mixing them with other data
under other licences, and they don't want to get problems with
an unclear licence ...

I claim that the new licence will be mostly better than the old one,
especially for this group, but for others, too, and it would be
good to hurry up.

My only problem is with loss of data or (this might be even worse)
errors in the data caused by removing all data that no one can relicense
because the contributor has died, is no longer reachable, or has left the project, ...
So some geometry, tags and objects will disappear. If only some
nodes or tags of a way disappear, the rest might be very faulty ...

For the circle of special users the new licence is really necessary,
but most users will only get angry when they see which
data disappears or is wrong after the change of licence.

So I have already discussed some ideas to avoid this, but with no luck yet ...
So I am trying again now ;-)

One tool still has to be discussed:
Which licence does an object have?
There is a longer list of problems with this decision. I will not
discuss them now, because
- the bachelor thesis may give answers?
- these problems are not relevant to my idea.
I only want to look at the result. This may be:
- object clearly 100% ODbL
- object clearly 0% ODbL
or might be
- 100% ODbL
- 90% ODbL
- ...
- 100% CC
or
- object is 100% PD
What exactly it should be has to be discussed; especially for data edited by
more than one user, this is totally unsolved so far.
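
Just to make the percentage idea concrete, here is a deliberately naive
sketch. It assumes (my assumption, nothing decided anywhere) that the
share is simply the fraction of an object's versions made by users who
accepted the new terms. The function name odbl_share is hypothetical,
and a real rule would have to deal with the open questions above.

```python
def odbl_share(version_users, accepted_users):
    """Return the percentage (0-100) of versions made by accepting users.

    version_users:  list with the user of each version of one object
    accepted_users: set of users who accepted the new licence terms
    This counting rule is an assumption for illustration only.
    """
    if not version_users:
        return 0
    accepted = sum(1 for user in version_users if user in accepted_users)
    return round(100 * accepted / len(version_users))
```

An object with one version by an accepting user and one by a
non-accepting user would come out at 50 under this rule.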

Up to now the result is planned to be ONE ODbL-only data set.

But why only once and "so binary"?

I think the decision will be programmed as a process
that may be started more than once, because a lot of mappers
are interested in seeing what the result will be:
"up to now 47110815 mappers have accepted, the new map will be
http://... if no more accept"

Then we nearly have something that can be used to put this
information back into the database instead of extracting a new database.
So a <way id=...> <node id=...> <tag k=... v=...> or whatever is
verified will get a new l=... (licence=...):
l=o, l=c, l=p or variations like l=o50 for 50% ODbL ...
There might also be something additional which can express WHY an object
is only 50% ODbL.
What exactly may be stored ... Maybe the bachelor thesis will
give us ideas?
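
A small sketch of how such a flag could be written and read back. The
exact format is not decided; I assume here one letter (o, c or p)
followed by an optional percentage, with no number meaning 100%. The
function names are hypothetical.

```python
import re

# Assumed format: one licence letter, optionally followed by a percentage.
_FLAG = re.compile(r"^([ocp])(\d{1,3})?$")


def encode_flag(licence, percent=100):
    """Build a flag value such as 'o' or 'o50' (format is an assumption)."""
    if licence not in ("o", "c", "p"):
        raise ValueError("unknown licence letter: %r" % licence)
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return licence if percent == 100 else "%s%d" % (licence, percent)


def parse_flag(flag):
    """Split a flag value into (licence letter, percent)."""
    match = _FLAG.match(flag)
    if not match:
        raise ValueError("not a valid licence flag: %r" % flag)
    percent = int(match.group(2)) if match.group(2) else 100
    if percent > 100:
        raise ValueError("percent must be between 0 and 100")
    return match.group(1), percent
```

The reason WHY an object is only partly ODbL would need an extra field;
this sketch does not cover that.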

Then we can do two things:

A.
All data stays in OSM forever, the CC data, too.

The users who need data regardless of licence will use all of it.
Most slippy maps, for example.

The users who need ODbL-only data don't use the complete planet.osm
but use planet-odbl.osm or planet-pd.osm (!) or planet-odbl50.osm
if they need the whole world. Extracts already exist for regions
(europe, germany, ...)

The users who need only small parts may ask the XAPI like today,
when they want to extract highway=residential. Now they may add l=o
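
How an extract like planet-odbl.osm might be produced from flagged data,
as a sketch only: it assumes the flag is stored as an attribute named l
on each node, way and relation (that attribute does not exist in OSM
today), and it reads a small XML string into memory. A real planet file
is far too big for that and would need a streaming parser.

```python
import xml.etree.ElementTree as ET


def filter_by_licence(osm_xml, wanted="o"):
    """Keep only top-level elements whose hypothetical 'l' attribute matches.

    Note: this naive version does not check whether the nodes referenced
    by a kept way were kept as well.
    """
    root = ET.fromstring(osm_xml)
    for elem in list(root):
        if elem.tag in ("node", "way", "relation") and elem.get("l") != wanted:
            root.remove(elem)
    return ET.tostring(root, encoding="unicode")
```

Calling filter_by_licence(data, "p") would give the PD-only variant in
the same way.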

So no data loss or breakage is necessary.
The users who need ODbL-only data will get it earlier,
because we don't need to wait for 99% acceptance of the new licence;
they may start with 75% of the data (better than having no ODbL data ...)

Two effects will drive the growth of ODbL data inside this mixed database:
- more and more new users sign ODbL-compatible terms
- more old users will accept the new licence, because no data will be lost
We also still can (and should) decide that starting at 2011-x-x only
users who have accepted are allowed.

But that's not all. Let us also look at:

B:
If the data has an l=o or something like this, you can also view it
on a map or in an editor!

The German Wikipedia has "flagged revisions"; in the English Wikipedia
this was discussed, but not used?!
Besides better version control (see above) this might be the
second thing taken from Wikipedia.

What does a standard mapper do today?
He maps missing things
- missing nodes, if a curved street still has "corners"
- missing ways entirely
- missing tags
A mapper normally will not map existing things
- unless the geometry is faulty
- unless the tags are faulty

The idea of also mapping already existing things was first born in the
discussion of the licence change, to get clean data. Stupid idea ...

Besides the licence change we have another problem that may be solved:
An object created at some time will carry that time stamp. If it is
5 years old, we don't know
- if this object is really unchanged
- if this object is outdated
Right now there is no way to flag an object as still existing.

If we can view the age of an object or the licence state of an object
on a map or in an editor and create such a "flagged revision", we may
- confirm that we are able to map the object again under the new licence
and/or
- confirm that the object still exists

It might be that we have to distinguish between confirming
- geometry
- tags
The first from own GPS tracks or aerial images, the second from
personal knowledge. Or in more detail (I can confirm the surface,
but not the name of it, ...)

We will get problems keeping our data up to date. In a lot of areas
the data seems to be complete, so no one looks at these areas.
So no one will notice that an area is outdated.
By viewing the age of data (besides the licence state) one can find
areas of possibly outdated data.
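
Finding possibly outdated objects could look roughly like this. It is a
sketch that assumes each object carries the time of its last edit and,
optionally, the time of its last confirmation by a flagged revision. The
field names last_edit and last_confirmed and the 5-year limit are
assumptions for illustration.

```python
from datetime import datetime, timedelta


def possibly_outdated(objects, now, max_age_days=5 * 365):
    """Return ids of objects neither edited nor confirmed within the limit.

    objects: iterable of dicts with 'id', 'last_edit' (datetime) and an
             optional 'last_confirmed' (datetime), both hypothetical fields.
    """
    limit = now - timedelta(days=max_age_days)
    stale = []
    for obj in objects:
        confirmed = obj.get("last_confirmed")
        # A confirmation counts like an edit for the purpose of freshness.
        last_seen = max(obj["last_edit"], confirmed) if confirmed else obj["last_edit"]
        if last_seen < limit:
            stale.append(obj["id"])
    return stale
```

A map layer or editor could then highlight the returned objects so that
someone goes and looks at them again.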

And if we have doubts about the existence of an object, we may also unflag
a revision.

We also have to design the tool which decides whether an object
is 100% ODbL clean for permanent use, to
- work with these flagged revisions for the licence state
- work with normal edits (moving nodes, changing tags)


advantages:
- avoiding loss of data
- gradual change of licence
- the problem of jointly edited objects may be smaller for most of them
- ODbL data available earlier
- data protection earlier, because the CC-only database is gone earlier
- PD data may grow inside OSM
- the only way to extract PD data from OSM?
and:
- more quality using better version control and flagged revisions

disadvantages:
- an ODbL-only database takes longer
- more load on the running system

The disadvantage of CC data staying in OSM may shrink faster
than the gaps from lost data would be filled if it were removed.

Whether the running system may become a bottleneck when adding such a gradual
check of licence state, revision control and flagged revisions
is something an expert on the API has to check ...

I will crosspost this to several discussion channels:
talk-de, forum and wiki in my language, German;
talk-legal and wiki also for a slightly shorter English version.
I hope this reaches the right people; feel free to spread it.

http://forum.openstreetmap.org/viewtopic.php?pid=127300 forum copy English
http://forum.openstreetmap.org/viewtopic.php?pid=127301 forum copy German
Wiki copy
http://wiki.openstreetmap.org/wiki/Talk:ODbL/Upcoming#licence_change_without_data_loss_.2B_better_version_control_.2B_flagged_revisions

Greetings Mueck



