[Rebuild] Large relations - rules and tools

Richard Fairhurst richard at systemeD.net
Wed Feb 1 11:20:37 GMT 2012


Hi all,

Relations have thus far been the poor relation of the relicensing project:
neither OSMI nor P2 visualises them and there hasn't been any great effort
to "recover" tainted ones.

I'm not greatly concerned about small relations such as turn restrictions,
associatedStreet and the like. These will generally not have too many
contributors and so the risk to their integrity is directly in line with
the % of mappers that accept the CTs. What I'm more interested in/worried
about is the massive route relations, which will typically have hundreds
of members and also hundreds of revisions (particularly if they were
edited with Potlatch 1 ;) ).

== Rules ==

Firstly, what are the rules for determining whether such a relation is clean?

http://wiki.openstreetmap.org/wiki/Open_Data_License/What_is_clean%3F
suggests that any relation created by a decliner will be considered
tainted. This seems at odds with how route relations are mapped. Such
routes are usually researched non-contiguously: in other words, if I
encounter (say) a section of National Cycle Network route 48 out in the
countryside, I will map it by searching the relevant wiki pages for the
relation ID, load this relation expressly into P2, and then add the new
section to it. There is no dependency on other contributors' IP at all.

A real-world example: http://www.openstreetmap.org/api/0.6/relation/69318/1

This is the South-West Coast Path (currently 1700+ revisions and 1500+
members). v1 was created by a decliner with just one way. But it is
clearly not tenable to say that all 1700 revisions are a derived work of
that original IP.

Therefore, I'd like to suggest a modification: route relations are assumed
to have a clean v0 which is an empty, untagged set. v1 is treated as a
modification of the relation (adding tags and members).
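
As a rough illustration of the suggestion above (a sketch only, not the rule set from the wiki page), each version can be treated as a delta against its predecessor, with v1 diffed against an empty v0. The `Version` class, the `agreed` flag and `tainted_additions` are hypothetical names invented for this example; members are reduced to plain IDs and member order is ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Version:
    agreed: bool                                  # did this version's editor accept the CTs?
    members: set = field(default_factory=set)     # member IDs (simplified, unordered)
    tags: dict = field(default_factory=dict)


# The assumed clean starting point: an empty, untagged relation.
EMPTY_V0 = Version(agreed=True)


def tainted_additions(history):
    """Return (members, tag keys) that are present in the latest version
    and were most recently added or changed by a decliner."""
    prev = EMPTY_V0
    member_source = {}   # member -> agreed flag of the version that added it
    tag_source = {}      # tag key -> agreed flag of the version that last set it
    for v in history:
        for m in v.members - prev.members:
            member_source[m] = v.agreed
        for m in prev.members - v.members:
            member_source.pop(m, None)
        for k, val in v.tags.items():
            if prev.tags.get(k) != val:
                tag_source[k] = v.agreed
        for k in prev.tags.keys() - v.tags.keys():
            tag_source.pop(k, None)
        prev = v
    return (
        {m for m, ok in member_source.items() if not ok},
        {k for k, ok in tag_source.items() if not ok},
    )


# v1 by a decliner with a single way; v2 by an agreeing mapper adding two more.
history = [
    Version(agreed=False, members={1}, tags={"type": "route"}),
    Version(agreed=True, members={1, 2, 3},
            tags={"type": "route", "name": "Example Path"}),
]
members, tag_keys = tainted_additions(history)
```

Under this reading only the decliner's own additions are flagged, rather than the whole relation being treated as derived from v1.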

== Remapping tools ==

In order to be able to remap large relations, the mapper needs two things:
   1. visibility of tainted members
   2. visibility of tainted changes to the relation

1 is well catered for by the current tools, though it may be handy to have
a view per relation (i.e. a list of members with licence status by each
one).

2 I think is not yet provided anywhere. I guess it needs a full-history
database to be done properly: if all else fails I will write a nasty Perl
script that will download each version in turn from the API and diff them,
but that would be very_horrible, as our tagging@ friends say, and probably
result in my being banned from the API. Any thoughts?
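
A minimal sketch of the diffing step, in Python rather than Perl and with the download left out. It assumes the input is an API 0.6 XML document containing several versions of one relation, such as the `/api/0.6/relation/<id>/history` call returns in a single request (which may be slow or very large for a relation with this many revisions). The function names and the sample data are made up for illustration, and members are compared as sets, so reordering and duplicate members are not detected.

```python
import xml.etree.ElementTree as ET


def parse_history(xml_text):
    """Parse an <osm> document holding several versions of one relation."""
    root = ET.fromstring(xml_text)
    versions = []
    for rel in root.findall("relation"):
        versions.append({
            "version": int(rel.get("version")),
            "user": rel.get("user"),
            "members": {(m.get("type"), m.get("ref"), m.get("role"))
                        for m in rel.findall("member")},
            "tags": {t.get("k"): t.get("v") for t in rel.findall("tag")},
        })
    return sorted(versions, key=lambda v: v["version"])


def diff_versions(versions):
    """Yield one change record per version, diffed against the previous
    version (v1 is diffed against an empty relation)."""
    prev_members, prev_tags = set(), {}
    for v in versions:
        yield {
            "version": v["version"],
            "user": v["user"],
            "members_added": v["members"] - prev_members,
            "members_removed": prev_members - v["members"],
            "tags_changed": {k: val for k, val in v["tags"].items()
                             if prev_tags.get(k) != val},
            "tags_removed": set(prev_tags) - set(v["tags"]),
        }
        prev_members, prev_tags = v["members"], v["tags"]


# Invented sample data in the shape of an API 0.6 history response.
SAMPLE = """<osm version="0.6">
  <relation id="1" version="1" user="alice">
    <member type="way" ref="10" role=""/>
    <tag k="type" v="route"/>
  </relation>
  <relation id="1" version="2" user="bob">
    <member type="way" ref="10" role=""/>
    <member type="way" ref="11" role=""/>
    <tag k="type" v="route"/>
    <tag k="name" v="Example Path"/>
  </relation>
</osm>"""

changes = list(diff_versions(parse_history(SAMPLE)))
```

Each record could then be matched against the editor's licence status to list the tainted changes to the relation.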

cheers
Richard





