[Talk-ca] Cleanup

Paul Norman penorman at mac.com
Tue Mar 6 22:58:24 GMT 2012


> From: Richard Weait [mailto:richard at weait.com]
> Subject: Re: [Talk-ca] Cleanup
> 
> On Fri, Mar 2, 2012 at 12:57 PM, Paul Norman <penorman at mac.com> wrote:
> >> From: Richard Weait [mailto:richard at weait.com]
> >> Sent: Friday, March 02, 2012 5:42 AM
> >> Subject: Re: [Talk-ca] Cleanup
> >>
> >> On Fri, Mar 2, 2012 at 4:10 AM, Paul Norman <penorman at mac.com> wrote:
> >> >> On Wed, Feb 1, 2012 at 2:36 PM, Richard Weait <richard at weait.com>
> >> wrote:
> >> >> I suggest that we can have more tainted data removed
> >> >> automatically,
> >>
> >> > It appears that the bot is deleting ways more extensive than those
> >> proposed.
> >>
> >> Ah, that's my fault.  I misstated the scope when I requested the
> >> removal.  The result will be the equivalent of the automated cleanup
> >> after the license change, for those three accounts.  I'm inclined to
> >> have it continue; this is making remapping faster / easier.
> >> Apologies for my confusing the issue.
> >>
> >> Last time, some nodes were cleaned up after the ways were removed.  I
> >> expect this will be similar.
> >>
> >> Best regards,
> >> Richard
> >
> > I'm not sure what the best way to proceed is.
> 
> I think that it is.  We're just a few weeks ahead of everybody else.

I see six issues with it.

1. The changesets to remove the nodes will be much larger than the
changesets so far, which have not dealt with nodes. Backing out is quicker
than proceeding.

2. The removal makes remapping much more difficult. Since the changesets, I
have stopped all remapping on the island because I no longer know what needs
to be added to replace missing data or data that will be removed. Previously
I was working on the island because it took much less time per square km and
much less time per object to remap. It's much slower now, so I've given up
on cleaning up the island.

3. Concern was expressed in the rebuild@ conference call about the method of
removing areas of data and not replacing them. I'm not sure I share this
concern, but I don't know if there's a consensus on the matter.

4. Concerns over deficiencies in the algorithm, which I'll get into below.

5. People are still discussing refinements to the algorithm on rebuild@. For
example,
http://lists.openstreetmap.org/pipermail/rebuild/2012-March/000099.html is
discussing a less conservative approach to some tag changes.

6. Not all of the objects removed needed to be removed. I was adding
odbl=clean as appropriate. I can't do this anymore.
 
> > Proceeding with the node removals
> > will still leave stray nodes as I believe there are false negatives in
> > the license change algorithms used.
> 
> Do you have examples so that the algorithm and process can be checked?
> 
> > When I suggested a second run targeting items where all versions were
> > created by one of the accounts it was because the first run had
> > essentially missed some objects.
> 
> Did you present examples to check?  Again, if the process or algorithm
> can be improved, perhaps we should work towards that?

I have two separate concerns, false negatives and stray nodes.

I see two ways for false negatives to arise. The first is when a way and the
nodes that form it are created by a decliner and another mapper then drags or
rotates the way to correct an offset. This can be done without retracing the
way, particularly if the decliner traced from a source with a known
consistent offset. The second way for a false negative to arise is when an
object from a decliner (most likely a POI node) is retagged by a bot like
xybot. Xybot runs automatically, and I don't see how it can remove old IP
and create new IP by automated retagging.
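
As a rough illustration of the two cases above, a check for candidate false
negatives might look like the sketch below. It is not the license change
algorithm itself. The function names, the (lat, lon) tuple representation and
the tolerance are all assumptions made for the example, and it only detects a
pure drag, not a rotation.

```python
def is_uniform_shift(before, after, tol=1e-7):
    """Return True if every node moved by the same (dlat, dlon) offset.

    `before` and `after` are lists of (lat, lon) tuples for the same nodes,
    in the same order. A uniform offset suggests the way was dragged as a
    whole rather than retraced. Rotations are NOT detected by this check.
    """
    if len(before) != len(after) or not before:
        return False
    dlat = after[0][0] - before[0][0]
    dlon = after[0][1] - before[0][1]
    return all(
        abs((a[0] - b[0]) - dlat) <= tol and abs((a[1] - b[1]) - dlon) <= tol
        for b, a in zip(before, after)
    )


def possible_false_negative(creator, editors, decliners, bots, dragged):
    """Flag an object that may have been wrongly treated as clean.

    Hypothetical rule: the object was created by a decliner, and either all
    later editors were bots (automated retagging) or the only geometry
    change was a whole-way drag.
    """
    if creator not in decliners or not editors:
        return False
    return all(e in bots for e in editors) or dragged


# Example: a two-node way dragged by a constant offset.
before = [(49.0, -123.0), (49.001, -123.002)]
after = [(49.0001, -123.0003), (49.0011, -123.0023)]
dragged = is_uniform_shift(before, after)
flagged = possible_false_negative(
    "decliner_a", ["mapper_b"], {"decliner_a"}, {"xybot"}, dragged
)
```

Objects flagged this way would still need a manual look; the check only
narrows down where to search for examples.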

The second concern is stray nodes. These could occur through false
negatives, but with the algorithm used they can occur even without false
negatives.

Suppose a decliner creates a tagged way and an acceptor later adds new
untagged nodes to refine the geometry, with no tag changes to the object and
no moves of existing nodes. The way is then removed, but the nodes created by
the acceptor remain. This is fine from a legal standpoint, but it leaves
these nodes as stray nodes unless they are the child of some other object.
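
The scenario above can be sketched in a few lines. This is a simplified model
under stated assumptions, not the bot's actual code: the Node and Way classes,
the account names and the helper functions are hypothetical, and only ways are
considered as parents (relations are ignored).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    id: int
    creator: str
    tags: dict = field(default_factory=dict)


@dataclass
class Way:
    id: int
    creator: str
    node_ids: list
    tags: dict = field(default_factory=dict)


def remove_decliner_objects(objects, decliners):
    """Keep only the objects whose creator is not a decliner."""
    return [o for o in objects if o.creator not in decliners]


def stray_nodes(nodes, ways):
    """Return untagged nodes that are not referenced by any remaining way."""
    referenced = {nid for w in ways for nid in w.node_ids}
    return [n for n in nodes if not n.tags and n.id not in referenced]


# A decliner creates a tagged way from nodes 1 and 2; an acceptor later
# inserts untagged node 3 to refine the geometry.
decliners = {"decliner"}
nodes = [Node(1, "decliner"), Node(2, "decliner"), Node(3, "acceptor")]
ways = [Way(10, "decliner", [1, 3, 2], {"highway": "residential"})]

remaining_ways = remove_decliner_objects(ways, decliners)
remaining_nodes = remove_decliner_objects(nodes, decliners)
strays = stray_nodes(remaining_nodes, remaining_ways)
stray_ids = [n.id for n in strays]
```

Here node 3 survives the removal but no longer belongs to anything, which is
the stray-node case described above.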
