[OSM-dev] Simple automated edition of ODbl data of Poland (Was: Why redaction bot deletes tags set in first version by CT-accepted user ?)

andrzej zaborowski balrogg at gmail.com
Thu Jul 26 19:26:45 BST 2012


On 26 July 2012 09:44, Mateusz Korniak <mateusz-lists at ant.gliwice.pl> wrote:
> On Saturday 21 of July 2012, Frederik Ramm wrote:
>> On 21.07.2012 15:59, Mateusz Korniak wrote:
>> > http://www.openstreetmap.org/browse/way/30952755/history
>> > the redaction bot deleted tag:
>> > highway = residential
>> > ?
>> >
>> > It is assigned in first revision by CT-accepted user.
>>
>> I guess that the bot was tripped up by a change to highway=unclassified
>> in version 3 by a disagreer. It could be argued that in this situation
>> it would have been better to revert to the initial highway=residential.
>>
>
> I am thinking about implementing simple program to review changes made by bot
> and fix scenarios as above by reverting back to last CT-accepted version of
> tag value.
>
> Is there any complete dump of ODbL licenced Poland (with history)?
> Do you know about nice example of selecting over such dump histories where
> given user made changes (written in Python) ?
> Do you know about nice example of code editing tag values (written in Python)
> ?

Now that I think of it, the information on which historical versions
have been redacted is not available from the redaction diffs.  So such
a bot cannot be implemented until a new redacted dump is available
(unless you want to run the redaction bot locally to find out which
versions are clean).
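
Once it is known which versions are clean, the tag fix itself should
be simple.  A minimal sketch in Python, assuming the history of an
object is already loaded as a list of dicts ordered by version, each
with a "clean" flag and a "tags" dict.  That layout, the function name
and the sample data are all made up for illustration; they are not the
redaction bot's own data model, and the sample is not the real history
of the way discussed above:

```python
def last_clean_value(versions, key):
    """Return the value of `key` in the newest clean version, or None.

    `versions` is a hypothetical list of dicts ordered oldest to newest,
    each with a boolean "clean" flag and a "tags" dict.
    """
    for version in reversed(versions):
        if version["clean"]:
            # Newest clean version found; the tag may legitimately be absent.
            return version["tags"].get(key)
    return None


# Illustrative data only: v1 by an accepting user, v3 by a decliner.
history = [
    {"version": 1, "clean": True, "tags": {"highway": "residential"}},
    {"version": 3, "clean": False, "tags": {"highway": "unclassified"}},
]
restored = last_clean_value(history, "highway")
```

Only the newest clean version is consulted, so a tag that an accepting
user removed on purpose is not resurrected from an older version.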

In the meantime I've generated a file containing all the ways and
nodes deleted in blacklisted changesets or deleted by decliners.  Some
of this is good data that was replaced by imported data which was
later purged (but most of it is garbage).  Care has to be taken not to
undelete objects that are dirty, though.  The file is at
http://a.osm.trail.pl/undelete-2.osm.bz2

and contains only Poland.  If anyone needs such data for other areas,
let me know.
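
For a first look at such a file, something like the following could be
used.  It is only a sketch, assuming the file is ordinary OSM XML
compressed with bzip2 (as the .osm.bz2 name suggests) and has already
been downloaded; the function name count_elements is made up.  It
streams the file using only the Python standard library and counts the
node and way elements:

```python
import bz2
import xml.etree.ElementTree as ET
from collections import Counter


def count_elements(path):
    """Count <node> and <way> elements in a bzip2-compressed OSM XML file."""
    counts = Counter()
    with bz2.open(path, "rb") as stream:
        # iterparse reads incrementally, so the file is not loaded at once.
        for _event, elem in ET.iterparse(stream, events=("end",)):
            if elem.tag in ("node", "way"):
                counts[elem.tag] += 1
                # Drop attributes and children of elements already counted.
                elem.clear()
    return counts
```

Calling count_elements("undelete-2.osm.bz2") on a local copy would
return a Counter with "node" and "way" keys.  Deciding which of those
objects are dirty is a separate problem that this does not address.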

It has been assumed in the redaction process that undeleting objects
deleted by decliners would produce more garbage, but as far as I know
nobody has tested this hypothesis, so it's just somebody's guess (as
are so many other assumptions made in the redaction process).

Cheers


