[OSM-talk] New OSM Quick-Fix service

Mon Oct 16 04:36:00 UTC 2017

Tobias, as promised, a thorough response.

On Sun, Oct 15, 2017 at 9:14 AM, Tobias Zwick <osm at westnordost.de> wrote:

>
> So, the initial question is: What is the conceptual use case for such a
> tool? Where would be its place in the range of available OSM tools?
>

I think my main target is the JOSM validator's "fix" button. The fix button
allows contributors to auto-fix everything the validator has found, even
without looking at it.  In order to actually see what the autofix did, one
has to select all modified objects, select them all in the "selection"
window, hit "history", wait for all individual objects to download, and
then view individual changes one by one.  It requires a great deal of
dedication and diligence, especially considering that these auto-changes
will be combined with all the other changes the user might have made.
While I trust that many OSM contributors are highly skilled, this
complexity may lead to errors, especially as some people might not know the
exact steps required to view it, cut corners, or think that the "fix"
button should know what it's doing.  Lastly, if I spot a bad autofix, I
have to go to the antiquated JOSM issue reporting site, create an account,
and file a bug. That is not an easy endeavor for most users, so most would
probably not bother. So the "FIX" button is similar to my "SAVE" button -
users either catch the problem and do nothing, or they don't, and it gets
saved, if not by this person then by the next.

> There is the use case where one tagging scheme has been deprecated by
> community consensus and one (combination of) tag(s) should be changed
> into another (combination of) tag(s) globally.
>
> 1. If this does not require humans because both tagging schemes are
> mutually translatable (i.e. lets say for sport=handball <->
> sport=team_handball), then, the edit can be made automatically by a bot.
>

Here are a few of the existing JOSM autofixes done with my tool. See full
list at JOSM autofixes
<https://wiki.openstreetmap.org/wiki/Quick_fixes#JOSM_autofixes>.
* replace operator=ERDF -> operator=Enedis -- 5422 cases
<http://tinyurl.com/y9owhym6>
* use "cs:" instead of "cz:" prefix for Wikipedia links -- 3 cases
<http://tinyurl.com/ya4nesbl>
* fix duplicate Wikipedia tag prefixes, e.g. "ru:ru:Something" -- 126 cases
<http://tinyurl.com/ycxt6qnf>
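
For illustration only: the two Wikipedia-tag fixes in the list above amount
to simple string rewrites. The Python sketch below is an assumption about
how they could be expressed. It is not the actual JOSM rule or the query
the tool runs, and the function name is hypothetical.

```python
def fix_wikipedia_value(value: str) -> str:
    """Normalize a wikipedia=* tag value of the form "lang:Title"."""
    lang, sep, title = value.partition(":")
    if not sep:
        return value  # no language prefix at all; leave untouched

    # Collapse a duplicated prefix such as "ru:ru:Something".
    while title.startswith(lang + ":"):
        title = title[len(lang) + 1:]

    # Czech Wikipedia uses the language code "cs"; "cz" is the country code.
    if lang == "cz":
        lang = "cs"

    return f"{lang}:{title}"
```

A rewrite this simple can still hit an odd corner case (for example, an
article title that really does begin with its own language code), which is
the kind of thing a one-by-one review is meant to catch.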

While they probably should be run by a bot, the barrier to entry is too
high to be realistic, especially for the smaller cases.  The very few
globally-licensed bot operators would probably not want to deal with these
small fixups, and for a very good reason - it's not worth the risk! The
chance of a programming error far outweighs the benefits of full
automation at so few objects. In addition to the programming error risks,
the community must have a far more thorough review of the proposal before
"bot-agreeing" to it - because what if there are corner cases that the
proposal would break? This fear is what prevents easy bot adoption.

Let's look at another example - a large 215,000+ case autofix: removing
unnecessary "area=yes". These would greatly benefit from a bot edit, BUT
everyone makes coding mistakes, so there is some chance of a bad autofix.
If a bot owner makes a mistake, it can only be spotted AFTER running the
bot. A user would then post a message on the changeset, and the bot owner
would have to do a complex full/partial revert, fix the bot, and re-run it.
Painful. BTW, while doing these examples, I spotted a few potential bugs
with the existing JOSM autofixes that no one has reported - another reason
to put it through one-by-one accept/reject testing.

My tool would actually address these issues! When the community first
proposes a change, it is relatively easy to add it to the tool - you simply
write a query and save it on a wiki page, possibly under the "proposed"
section. Then, many users can go through the matches one by one, accepting
or rejecting them. If there are rejects, anyone can go and fix the query,
and the process continues. Once this has been going on long enough, and
there haven't been any rejects, some bot owner could simply run the exact
same query on the server, auto-applying it to the rest of the world. By
that time, the query has been well tested by many different members, and
will be of much higher quality than any bot author could achieve alone.
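
The review cycle described above can be sketched as a small piece of
bookkeeping. This is a minimal Python illustration under stated
assumptions, not the tool's implementation: the class name, the reset
after a query fix, and the threshold of 100 accepts are all hypothetical
stand-ins for "long enough, with no rejects".

```python
from dataclasses import dataclass


@dataclass
class QueryReview:
    """Accept/reject tally for one proposed query (hypothetical)."""
    accepts: int = 0
    rejects: int = 0

    def record(self, accepted: bool) -> None:
        # One reviewer decision on one matched object.
        if accepted:
            self.accepts += 1
        else:
            self.rejects += 1

    def reset_after_fix(self) -> None:
        # Assumption: once the query is edited, earlier votes no longer
        # describe it, so the count starts over.
        self.accepts = 0
        self.rejects = 0

    def ready_for_bot(self, min_accepts: int = 100) -> bool:
        # Assumed rule: enough accepts and not a single reject.
        return self.rejects == 0 and self.accepts >= min_accepts
```

With this rule, a single reject blocks automation until someone fixes the
query and the review starts again.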

> 2. If this does require humans to check the transition to the new tag
> because the deprecated tagging scheme is ambiguous (i.e. , such as
> sport=football -> soccer or american/australian/canadian/... football),
> then, an automatic edit cannot be done. Instead, tools like MapRoulette
> are used.
>

I agree that my tool does not cover this use case yet.  I was thinking of
adding an option picker - a fairly easy task if the options are known in
advance, but this use case is not my primary target at the moment.

>
> 3. Finally, if this also does require humans because a tag combination
> is suspicious (what would show up as warnings in JOSM and what most of
> Osmose consists of), also, a tool like Osmose or MapRoulette is used.
>

The reason we already have Osmose, MapRoulette, and a few others is that
they cover slightly different use cases. I think my tool can simply take
yet another niche in this space, making some fixes easier than in
MapRoulette.

>
> Though, note, for all three cases, a prior consensus is required, either
> by prior discussion or by looking at what was previously agreed on in
> the wiki. That is the case for *any* organized re-tagging of existing tags.
>

Sure thing. There are very, very few cases when the fix is super obvious,
e.g. a typo fix, but let's not dwell there.

>
> I reckon you see the quick fix tool to be in category 2 and 3 here,
> along with MapRoulette and Osmose, only with the crucial advantage of
> being quicker to use, since no editor is required.
> But it seems to me, you didn't think this through. If the tool offers
> *one* solution to any re-tagging ("Save" or leave it), then, this is
> pretty much a manually operated automatic bot (case 1), which really
> doesn't make sense. For case 2 and 3, it cannot be used as is, because:
>

> - Quick fix cannot be used to find what kind of football it is (case 2),
> but MapRoulette can, because it leaves the actual editing to the user.
>
> - Quick fix cannot be used to solve any markers which may or may not be
> an actual problem (case 3) because it has no way of marking any of the
> things as false-positives.
>

Per above, #1 and partially #3 are my main goals, but #2 is a stretch
goal.  The "reject" button has already been partially implemented, but we
need to figure out where to store the false positives.  For #1, I think we
should use a new "nobot" tag, because it will notify the community that a
proposed query has issues, and allow easier analysis of issues with all of
the existing tools. A bot owner can easily check whether there have been
any rejections before fully automating it.  I am not totally against
creating a separate false positive storage, but I do think it will be far
less beneficial to the community.

> Looking at your linked Wiki document (
> https://wiki.openstreetmap.org/wiki/Quick_fixes ), most of these are
> candidates for automatic corrections. I.e.:
> - Convert religion=Christian to religion=christian
> - Convert various common forms of religion=catholic to
> religion=christian + denomination=catholic
> - Convert religion=islam to religion=muslim
> - etc.
>
> (Only) your initial example ( amenity=sanatorium -> leisure=resort +
> resort=sanatorium for ex-USSR-countries) falls in case 2. But then, as
> mentioned, either marking as false-positives or other answer options
> (i.e. "yes, it is a sanatorium in the West European sense") are missing.
>

I think you mean case 3. The initial example was meant to demonstrate the
capabilities, and was suggested by the RU community. In retrospect, I think
it was a bad initial example, as it derailed the discussion of the tool's
merits, concentrating instead on more complex cases.

> *However*, the idea as such, to make the clean-up process of either
> clearly wrong tags, deprecated tags or even just warnings
> semi-automatic, is a very good one. The prerequisite is, that there must
> always be the option to *not* apply that fix and save that decision. The
> other very critical point is, that the easier you make it for users to
> apply a predefined fix, the more precautions must be taken to ensure
> that the user really checked the situation.
>
> So, the most critical missing features from my point of view in your
> tool are
>
> a) There must be an option to manually edit this instead and/or marking
> it as a false positive. In any case, the marker may not be shown for
> other users anymore. This was a topic in this thread already and it was
> voiced that inventing new tags just to be used by this tool in not
> acceptable and I agree with that. The other tools also do not require that.
>

Agreed, and "semi-" done.  You can click the feature ID in the upper left
corner to be taken to the OSM site for editing.  You can click the Level0
link to "raw" edit the object, just like it is done with Osmose's "raw
edit" link.  If the query uses "nobot" style tags, it will NOT allow you to
edit it. Not showing it can also be easily done by a slight modification of
the query itself.  This is assuming my reasoning for on-OSM tag storage
makes sense for other tools. In the meantime, I do plan to make a separate
reject storage system just to cover all cases.

> b) I strongly suggest to offer different answer options. As I said, if
> only one option is available, it is really nothing else than a manually
> operated automatic edit. If several options are available (i.e.
> "american football", "soccer" etc. ) as a quick fix, only then the tool
> becomes to be useful. (There are some challenges like that on
> MapRoulette also, such as "Phone or fax number is not in international
> format" and these in my opinion also do not belong there because they
> can be solved automatically)
>

Agree, answered above.

> c) Require users to zoom into the map at around zoom 17 or more to make
> any changes. If the users are supposed to check if something is the case
> (via satellite image), then at least don't let them cheat by just
> solving everything from looking at continents.
>

This was fixed several days ago. The save button is there, but it won't
save unless you zoom in to 16+ (just like iD editor).  Plus I added Mapbox
satellite imagery to help.

> d) Finally, I think it does not make sense to have any quick fixes in
> that tool that require actually going there (as opposed to looking at
> the satellite imagery) because the effort to go there actually (let's
> say 20min if you happen to live in the vicinity) is dimensions higher
> than clicking on the "Save" button (1 second). The temptation will be
> big to simply click on that button without actually checking it. If you
> actually go there and check, then, the 1 minute as opposed to 1 second
> you need to get the surveyed result into the map through iD/JOSM does
> not really matter in comparison.
>

Of course. But this will be up to the communities to enforce - if someone
writes a query like that, others should be quick to point this out.

>
> All in all, in my opinion, the best way to go forward from here is to
> take this idea of quick fixes and instead of creating an own tool that
> is otherwise very similar to MapRoulette (because it must for being
> useful, see above), propose it as a feature to MapRoulette, discuss and
> implement it together in accord with the MapRoulette team into their
> tool (or Osmose for that matter). It's all open source.
>

I have extensively discussed this with MapRoulette's author and sole
maintainer, both in chats and via a video call.  The reality is that, sadly,
Martin doesn't have as much time as it would require to adapt MapRoulette
to even remotely match the capabilities offered by a full-blown,
industry-standard SPARQL query language, nor would it actually be
practical, as I am trying to solve only a specific portion of what it
targets.

> That feature could look like that the creator of a MapRoulette challenge
> may optionally provide a range of possible (typical) answer options
> ("quick fixes") which are then shown as additional buttons right next to
> [Edit], [False Positive] and [Skip] for every place within a challenge.
> I.e. for football, it could be a dropdown of soccer, american_football etc.

Yep, case #2 - a nice to have stretch goal, per above.