[Rebuild] Tests of rebuild process - we need your input and help

Dermot McNally dermotm at gmail.com
Fri Mar 23 00:57:08 GMT 2012


Folks,

As mentioned, we hope to perform real data tests of the database
rebuild code this weekend. We will be looking for many eyes to
scrutinise the resulting data changes for compliance with the
publicised criteria as embodied in the tests:

https://github.com/zerebubuth/openstreetmap-license-change


We _also_ need, in advance of the tests, the following items that some
of you will be in a position to provide. As this is just a test run,
the data sets mentioned need not be the final frozen versions, but the
closer they are to reality the more useful the tests will be.

What we need:
=============

* A "suspect item list".


This is a list of OSM objects, including deleted ones, which have
somewhere in their edit history a version that is not ODbL-clean. Put
differently, a list of all objects except those whose entire history
involves known agreeing mappers. The purpose of the list is to speed
up processing by not bothering with known clean objects. The format
is unimportant; a text file is fine. A suggested format would be
something like:

node 123456
way 654321
relation 555555

But if you already have something more or less parseable we will
happily take it as it is.
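
For anyone producing the list from scratch, here is a minimal Python
sketch of how a file in the suggested format could be read. The
function name parse_suspect_list, and the tolerance for blank lines and
"#" comment lines, are assumptions made for illustration; they are not
taken from the rebuild code.

```python
from typing import Iterable, Set, Tuple

# The three OSM object types used in the suggested format.
VALID_TYPES = {"node", "way", "relation"}


def parse_suspect_list(lines: Iterable[str]) -> Set[Tuple[str, int]]:
    """Parse lines such as 'node 123456' into a set of (type, id) pairs.

    Hypothetical helper: blank lines and lines starting with '#' are
    skipped, which is an assumption and not part of any agreed format.
    """
    suspects = set()
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) != 2 or parts[0] not in VALID_TYPES:
            raise ValueError(f"line {lineno}: expected '<type> <id>', got {raw!r}")
        # int() raises ValueError if the id is not a whole number.
        suspects.add((parts[0], int(parts[1])))
    return suspects
```

Passing an open file object works, since iterating over a file yields
its lines. Holding the result as a set makes the "is this object
suspect?" check a cheap lookup, which is the point of the list.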

Frederik or Simon, you both might have something a little like this, right?


* A "Changesets to be exceptionally considered clean" list

Simple text file of changeset IDs would be ideal, but again, anything
close is also fine.

Frederik and Simon, I know that you each heed these IDs, so again I'm
hopeful one of you can assist.



* A "Changesets to be exceptionally considered _non_ clean" list

I don't know whether anyone has begun to maintain one of these, but
Simon at least has flagged the problem of agreeing mappers some of
whose changesets may be from problem sources. If anyone has been
making a list of these, we'd be happy to have it. If not, and you
think this is an issue we need to address, now would be a perfect time
to start compiling one. In this case, a really short list would still
be useful to test our logic.
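
To make the intent of the two changeset lists concrete, here is a rough
Python sketch of how they might override the default verdict for a
changeset. Everything in it is an assumption for illustration: the
function names, the one-ID-per-line file format, and in particular the
rule that the "non clean" list wins if a changeset appears on both
lists. The actual logic lives in the rebuild code and may differ.

```python
from typing import Iterable, Set


def load_changeset_ids(lines: Iterable[str]) -> Set[int]:
    """Read one changeset ID per line (hypothetical format).

    Blank lines and lines starting with '#' are ignored.
    """
    ids = set()
    for raw in lines:
        line = raw.strip()
        if line and not line.startswith("#"):
            ids.add(int(line))
    return ids


def changeset_is_clean(
    changeset_id: int,
    mapper_agreed: bool,
    force_clean: Set[int],
    force_unclean: Set[int],
) -> bool:
    """Apply the exception lists on top of the mapper's agreement status.

    Assumed precedence: an explicit "non clean" entry beats an explicit
    "clean" entry, and both beat the default.
    """
    if changeset_id in force_unclean:
        return False
    if changeset_id in force_clean:
        return True
    return mapper_agreed
```

With this shape, even a list containing a single changeset ID is enough
to exercise each branch.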



* Lists of _Objects_ deemed exceptionally clean (or unclean, depending
on what is being compiled)

This may not yet be available either, but it refers to the list
supposedly being compiled for data originally imported from UMP
(Poland). If we wish to support such a list, we will need to
source it. It may in any case be too late for the weekend tests,
but even a representative sample would give us a basis for
testing.



That's the lot. Thanks for any support you can give,
Dermot


-- 
--------------------------------------
Igaühel on siin oma laul
ja ma oma ei leiagi üles


