[Imports] Harmful elements in taginfo tag cleanup process
ilpo.jarvinen at helsinki.fi
Tue May 5 21:36:48 UTC 2015
On Tue, 5 May 2015, Bryce Nesbitt wrote:
> Eventually, I feel the project will need to move to either
> a reputation system (where others rate the quality of edits) or
> a double approval system (where a second mapper must endorse an edit before
> it goes live -- perhaps only
> for edits that remove information).
Right, I'd mostly want to limit the power of a single person doing a
removal. However, even double approval might not filter much if such a
process attracts similarly minded, cleanup-focused people more readily
than other types of mappers.
> If you have a tag that's commonly deleted, then try adding a wiki page for
> it. That might help.
I've thought that too. However, I think this puts the burden on the wrong
end of the loop. Given that many of the deletes seem to hit words which
are well known in the local context, it would be duplicate effort to
document them rather than find the "global" alternative, which is harder
than some might realize (especially for non-natives).
...If some tag gets deleted very often I might consider that though.
> If you list key names that have been improperly deleted, perhaps that will
> give a clue also.
> Do they all have funny characters? Are they all seemingly non-English? Is
> there a pattern?
Not really (I'm listing them below with lots of thoughts too), but I think
these details are irrelevant, as the proper action would be to replace them
with a tag that is intended for "global use" (if one is applicable). The
only thing in common seems to be that there are only a few of each in the
DB, so I suppose any tag that is not used very frequently might get targeted.
My main point is that these "fixers" did not replace the tags but deleted
them instead, which is what alarms me, as then nobody else has the
opportunity to replace them either (I admit that one seemed to have tried
very hard to figure out the local context, but most seemingly didn't :-().
Sadly they also seem to claim/think they "fix" stuff by removing given the
Below is a verbose list of the cases I remember or have had changeset
comment exchanges about (it includes my own comments on each case, so
please don't read it if you have better things to do :-). And BTW,
I've also listed those few good deletes at the end):
Name of a local state-sponsored subsidy system. The name comes from
Finnish law and is well known to almost any local who is past childhood.
This has been removed twice (no odd characters, used just a few times,
and an incomprehensible word which neither of the deleters understood in
the first place). However, I think that perhaps fi:arava=yes might be a
better tag for this, or some form of "global"
building:subsidied:system=arava or the like (if the wiki has something
along those lines, I don't know of it), but it's noteworthy that none of
the deleters have changed it to one of these alternatives (so this kind of
"fixing" is useless without local input, which IMHO should be asked for or
waited for patiently rather than forced by deleting keys). This information
is tricky to acquire after the building is completed. The few that are
currently in the DB are mostly based on information that is available
during construction (otherwise very local knowledge is essential).
Means rented apartments (in contrast to different forms of ownership
of the apartments in the building). Useful at least for statistical
purposes, and not easily available in many cases (as such, it's very
valuable to acquire from some legal source). An incomplete "fix" was
attempted which changed this to rental=yes (some were changed and some
deleted in the same changeset that deleted arava=yes), but I don't know
whether that's correct, as there's no wiki page for that either. This one
too has been deleted twice already.
Name of a blinker device that is being tested for highlighting
highway=crossing nodes in sensitive places such as near a school. A few
were installed for testing purposes. It might be the same as
flashing_lights=yes, but given the lack of a definition of what exactly
that means (does it cover traffic light type flashers only?), it's unclear
to me. However, flashing_lights was not discovered by the fixer
him/herself but by me, so the "fix" effort should not be credited over
locals here either! Also, flashing_lights is a rather recent addition
compared with the addition of välkky, and I don't think I've seen it e.g.
on tagging@, so it's unreasonable to assume that locals would so quickly
learn from the wiki about alternatives to tags they're already familiar
with.
In addition, the user who added välkky to the DB explained that multiple
device types are/were being tested, this was just one of them, and it
is/was unclear whether one would eventually be selected over another, if
any (which might mean dismantling the other types, so removing the type
information would make further updates more complicated).
Were tagged on a highway=path. Claimed by the fixer to provide no added
value. I somewhat agree, but again, information that was acquired through
a local survey was removed by the decision of a single individual, which
I find a dangerous practice even if I somewhat agree with the fixer's
reasoning.
Underground waste collector brand well known to locals. This delete was
updated to something better in a follow-up change by the fixer, which I
find a very positive experience (he even looked up the "global tag" all by
himself rather than burdening locals with work that the fixer should
follow through on, IMHO)!
name:francais (or some form of that with a fancy non-Finnish letter)
Probably some editor auto-completion issue. Should have been changed
to name:fi rather than deleted, but as the fixer didn't understand the
local context he/she thought that it was simply a duplicate, as name was
already equal to the value. Not a big issue since the name still
persisted, but it again highlights how important it is to understand
the local context when deleting something. Based on the follow-up
discussion, it seems that this particular fixer even lacked an
understanding of how naming works in bilingual countries, as the
information was claimed to be "redundant".
was:*=* (we use these to prevent remapping from imagery, not for
history preservation as the "fixer" kept claiming multiple times).
It has been useful for me personally, and I've even encountered one
incorrect "redraw" of a feature (which is not that likely to be noticed,
given that you need to do both the survey and the detection of the
"redraw" yourself, as anybody else would just correct the redraw damage
without noticing that a "redraw" had taken place).
I asked the user who added these, and he envisioned some
statistics/correlation related use case. Obviously the data is in no way
complete, but it's not incorrect either. I understand that some here
probably disagree that this should be kept in the DB in the first place
(but remember that you were not asked by the "fixer"; next time it could
be tags you'd like to keep).
I remember seeing two clearly good removals so far (there might have
been one or two others, but I fail to remember more as I had no formal
exchanges with the fixer about them):
building=residential or building=apartments information duplicated
into some other key that was deleted by the fixer.
Details about mapping related accidents I used two times. The irony is
that I've added those (if you want to know why, this is my hobby only so
just had some fun back then).
...and plenty of those fixes that really fixed typos, updated to better
"global tags" and such but that's out of scope w.r.t. deleting tags.