[OSM-talk] Mechanical Edit?

Tue Jul 27 10:20:53 UTC 2021

Hello.

This is my first message on OSM-talk, so I'm not sure whom to reply to. I
started writing it on Sunday, so it replies to the entire topic, not to the
previous message. A little of it relates to the other discussion in the
StreetComplete thread as well. The second half of the mail replies to
messages sent today.

I understand this discussion revolves around two issues: replacing the
American spelling of a key with the British spelling, and using overly
large changeset(s). I’m not here to make a statement on the first issue,
because it’s trivial for data consumers to scan for a second tag when looking
for colour tags. What annoys me is the general dislike of large changesets in
the OSM community. If everyone loves to complain about large changesets, why
are there no tools to mitigate accidental creation of large changesets? Why
is no editor splitting large edits into smaller changesets by default?

When you try to upload a changeset with a large bounding box, JOSM shows a
yellow warning that the bounding box is too large. That warning is shown right
above the final upload button. Considering the UI of JOSM, I assume most users
see that warning *after* they have made all those changes. When I first saw
that message, I thought its positioning suggested that splitting a changeset
into subareas is something very simple that can be done automagically with
just a few clicks from the very same upload dialogue. People who write
comments under large changesets are fighting the effect, not the cause. If you
want to get rid of global changesets polluting your changeset history feed,
you need to address the problem that allows them.
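
To illustrate what such splitting could look like, here is a minimal sketch in
Python. It is not how JOSM or any other editor works; the function name
`split_by_grid`, the `cell_deg` parameter and the `(element_id, lat, lon)`
input format are all made up for this example. It only buckets edited nodes by
grid cell. Ways and relations that span several cells would need extra care.

```python
import math
from collections import defaultdict


def split_by_grid(edits, cell_deg=1.0):
    """Group edits into buckets keyed by the grid cell that contains them.

    `edits` is an iterable of (element_id, lat, lon) tuples (hypothetical
    format). Each bucket could then be uploaded as its own changeset.
    """
    buckets = defaultdict(list)
    for element_id, lat, lon in edits:
        # Integer cell index; floor keeps negative coordinates consistent.
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        buckets[cell].append(element_id)
    return dict(buckets)


edits = [
    (1, 59.44, 24.75),   # Tallinn
    (2, 59.40, 24.70),   # Tallinn, same 1-degree cell
    (3, 35.68, 139.69),  # Tokyo
]
groups = split_by_grid(edits)
for cell, ids in groups.items():
    print(cell, ids)
```

With a 1-degree grid, the two Tallinn edits end up in one bucket and the Tokyo
edit in another, so one worldwide upload would become two local ones.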

The same applies to mass changes. If these are bad and find-and-replace must
not be done with ease, why

The main argument against large changesets is that small ones make the review
process easier, mostly for filtering out locally relevant changes. If people
who are monitoring a single city don’t want to see global changesets in
their location’s changeset feed, why haven’t they made a tool which can easily
filter those out? The most basic filter is to ignore changesets whose bounding
box spans different hemispheres. A slightly more advanced way is to process
planet diffs and filter out the regions they are not interested in.
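
The basic hemisphere filter could be sketched roughly like this in Python. The
function names, the `bbox` tuple order and the `min_span_deg` threshold are my
assumptions for illustration, not part of any existing tool. The threshold is
there because a purely local edit in, say, London also straddles the prime
meridian and should not be dropped.

```python
def spans_hemispheres(min_lat, min_lon, max_lat, max_lon, min_span_deg=1.0):
    """Return True if the bounding box crosses the equator or prime meridian.

    Boxes narrower than `min_span_deg` (hypothetical threshold) in the
    crossing direction are treated as local and are not flagged.
    """
    crosses_equator = min_lat < 0 < max_lat and (max_lat - min_lat) >= min_span_deg
    crosses_meridian = min_lon < 0 < max_lon and (max_lon - min_lon) >= min_span_deg
    return crosses_equator or crosses_meridian


def local_feed(changesets):
    """Keep only changesets whose bounding box stays within one hemisphere."""
    return [c for c in changesets if not spans_hemispheres(*c["bbox"])]


feed = [
    {"id": 1, "bbox": (59.3, 24.5, 59.5, 24.9)},    # one city
    {"id": 2, "bbox": (-33.9, -70.7, 59.4, 24.8)},  # Santiago to Tallinn
    {"id": 3, "bbox": (51.4, -0.3, 51.6, 0.1)},     # London, straddles 0 degrees
]
kept = [c["id"] for c in local_feed(feed)]
print(kept)
```

This does not catch a changeset that covers a whole continent within one
hemisphere. For that, the planet diff approach is the better fit.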

The other point made in the aforementioned argument is that smaller changesets
make analysing their contents easier, as otherwise tools like OSMCha wouldn’t
even open enormous changesets. Regarding OSMCha, I understand the basics of
why it can’t be made to support larger changesets (due to OSM data design).
Maybe it’s time to ditch APIv0.6 altogether and move to version 0.7? Then
we could make history analysis faster by storing both the new and the old
versions of elements within the changeset. We could also address inaccurately
large changeset bounding boxes by replacing them with minimal convex hulls
or even heatmaps. To handle updated geometry, changesets could store data on
node locations as well. I’m looking for a discussion of better development
directions for OSM-related software. After all, like in most open-source
software communities, some members of the OSM community like to say that the
only way to make a change is to make it yourself (and then hope your PR gets
merged).
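
As a rough illustration of the convex hull idea, here is a sketch in Python
using Andrew's monotone chain algorithm. It treats (lon, lat) pairs as plain
planar coordinates, which is an assumption that breaks down near the
antimeridian and the poles. Nothing like this exists in the current API as far
as I know.

```python
def cross(o, a, b):
    """Z component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the hull vertices in counter-clockwise order.

    `points` is an iterable of (lon, lat) tuples, treated as planar.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop the last point while it would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each half is the first point of the other half.
    return lower[:-1] + upper[:-1]


edited_nodes = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3)]
hull = convex_hull(edited_nodes)
print(hull)
```

For edits along a long diagonal road, the hull would be a thin sliver, while
the bounding box would cover the whole rectangle around it.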

TLDR: Why complain about large changesets in changeset comments if you could
just submit a PR to fix the cause of the problem at the editor level?

TLDR of 2nd part: Commenting on messages sent today and yesterday.

Frederik has a valid point about horse/kow/cow, but he misses the main
characteristic of OSM: every element (and its tags) has a geographic
location. I’d say a better example, which I recently encountered, might have
been roof:colour=青. Go ahead and paste that value into an online translator.
You will probably be told that this means green in Chinese. Except tags like
that were located in Japan, where 青 means blue. Furthermore, in the case of
kows vs horses, how could anyone other than the original mapper link kows to
horses? That makes the tag virtually useless, if not revert-worthy vandalism.

Tomas is definitely right about one-by-one analysis. The question is what
suffices as proper analysis, and how does the other party verify whether the
changed tags were subject to analysis or not? If someone makes a large
changeset to fix typos in loads of tags, it might be reverted regardless of
whether each change was thoroughly investigated beforehand. How does the
reverter know that the previous mapper did not look at each individual tag
before changing it?

> Students with no education can run updates - it is extremely easy to do,
> if (and where) that would have been the way - it would have already been
> done by more experienced people.

Judging by some smaller OSM communities, IT/CS students make up a large
portion of regular OSM users, the other large groups being veteran mappers,
who started mapping in the 00s (and are now spending most of their time
running their OSM-powered GIS companies), and corporate mappers, who
specialize in a narrow topic benefitting the company they are working for.
Since the first two groups can’t agree on whether tagging schemas should be
centralized or localized, why not resort to the third group and let Amazon
et al make the decision?

Btw, why emphasize lack of education? OSM was supposed to be a global
community accessible to anyone in the world, including developing nations
where neither access to the internet nor education is commonplace.

The next 5 emails in the thread could be summarized with the following quote:

> “Currently there is no way to deprecate ANYTHING. Because there is no way
> to ask a considerable amount of mappers for opinion (directly or via
> representatives or via expert group). And there is no will to find such a
> way.”

In theory there is a way. Time to introduce a forced OSM-wide voting
procedure: all OSM mappers get an automatic indefinite user block. In
order to restore access, they must click a link in the block message, which
leads to a personalized survey site where they must cast their decision.
After the decision is made, the block is lifted. Tomas is partially right, as
that UX cannot be used to enforce free-text replies: users could simply enter
“Asd” in the form.

Most individual tagging typos make up less than 1/5000th of all
occurrences of that tag. Simon Poole is correct that these errors have a
minuscule effect, but I fail to understand why we should keep invalid data
in the database just because it won’t affect around 100% of use cases. Here’s
potential for APIv0.7: each tag gets a meta-tag telling when that tag was
last changed and in which changeset. Previous versions are stored in the
changeset tags.
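
A minimal sketch of what such per-tag metadata could look like, written as
Python data classes. Everything here (`TagValue`, `Changeset.previous_tags`,
`Element.set_tag`) is invented for illustration and has nothing to do with the
actual API or database schema.

```python
from dataclasses import dataclass, field


@dataclass
class TagValue:
    value: str
    changeset_id: int  # changeset that last touched this tag
    changed_at: str    # ISO 8601 timestamp of that change


@dataclass
class Changeset:
    id: int
    # (key, old TagValue) pairs for every tag this changeset overwrote.
    previous_tags: list = field(default_factory=list)


@dataclass
class Element:
    tags: dict = field(default_factory=dict)

    def set_tag(self, key, value, changeset, changed_at):
        old = self.tags.get(key)
        if old is not None and old.value == value:
            return  # value unchanged, keep the existing metadata
        if old is not None:
            changeset.previous_tags.append((key, old))
        self.tags[key] = TagValue(value, changeset.id, changed_at)


building = Element()
first, second = Changeset(id=1), Changeset(id=2)
building.set_tag("roof:colour", "rde", first, "2021-07-01T00:00:00Z")
building.set_tag("roof:colour", "red", second, "2021-07-27T10:20:53Z")
print(building.tags["roof:colour"])
print(second.previous_tags)
```

With this layout, a reviewer could see that only roof:colour changed in the
second changeset and what its old value was, without fetching the full
element history.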

Best regards,

Fghj753

PS. Since many of you claim that the wiki is a worthless source for tagging
guideline documentation, I hereby propose a global tag upgrade to replace
building=yes with the tag 123=asd. Remember, the wiki is much more accessible
as a source of documentation than any mailing list. If you don’t like the wiki
format, propose a better alternative and let others tell you why your approach
is a terrible idea. For example, I’d propose taginfo, because it represents
actual tag usage rather than outdated, abandoned and rejected tagging
proposals.
