[Tagging] iD presets

Fri Jun 22 20:08:38 UTC 2018

On Fri, Jun 22, 2018 at 2:25 PM Christoph Hormann <osm at imagico.de> wrote:
> Contrary to what some might want to believe OSM is not a software
> development project.

Correct. iD, however, is a software development project. It is one OSM
editor among many (or at least a few) and is not even the only one
available on the main site. Its current privileged position is because
some operator of the main site thought it was the best choice, not
because it was commissioned in any way to be *the* editor for OSM.  It
isn't OSM, it's one of the tools by which OSM data are maintained.

> And what you are communicating here, that mappers should adjust their
> mapping style for the convenience of data consumers, is highly
> problematic.  I know this is not a rare viewpoint, even among mappers
> submitting yourself to the (perceived) needs of data consumers
> (something i called "preemptive mapping for the renderer" some time
> ago) is widespread.
>
> But OpenStreetMap is a mapper centric project.  Developers around it are
> to serve the mappers, not the other way round.  Fortunately on a global
> level interests of mappers and interests of data users usually go hand
> in hand.  But if in specific cases they don't the interests of mappers
> have to overrule those of data users and developers - otherwise we can
> close down shop right away and reopen as Google Mapmaker 2.0.
>
> Or in other words:  "those who do the work make the rules" does apply,
> but in OSM those who do the work are - by a very large margin - the
> mappers.  And developers should use the influence they inevitably have
> to support the mappers in making competent and viable decisions - in
> their interest, not in that of the data users.

I'm puzzled by this answer, and hope that it's another case of
"violent agreement."

Here's where I'm not following your argument. I play most of the roles
at one point or other - doing hand-curated mapping (largely in my own
neighbourhood, or to add recreational facilities such as hiking
trails); doing imports (generally speaking, of public-access lands
with recreational value); developing map rendering (I produce
electronic and paper trail maps); some limited amount of data analysis
(e.g., route descriptions and distance tables); and of course using
maps for trip planning and navigation. I'm even willing to do some
software development to aid in these processes, although at present I
surely haven't time to dive into the development of the OSM
infrastructure.  I have personal experience with quite a lot of the
data flow, I'm a mapper, and a data consumer, and a user. I wouldn't
be a mapper if I weren't a data user (why would I bother?), and I'm a
data consumer only because some of my uses of the data go beyond what
the existing available processing chains support.

I'm not seeing the conflict of interest that you claim to see. Why is
a mapper entering data? Unless mapping is simply an activity of
recognition analogous to stamp collecting or train spotting, what can
the purpose be of entering all the data, except that the mapper can
use them, or enable others to use them? Using the data is being, or at
least invoking, a data consumer - so I'd presume that mappers care
very much about data consumers. Every one of the objects that I've
mapped is at least of a general class that I want to see on one sort
of map or another, or have available for planning, routing, or
navigation.

To that extent, it has to be at least possible in theory to detect
the sorts of objects that are on the map. A renderer, a router, a
navigation engine, any data analysis needs to discriminate among the
objects of interest and those that are not of interest - and sometimes
make broader connections, as when a distance table must not only list
points of interest, but measure how far they are from the route,
locate the point of departure from the route, and measure a distance
along the route from the start. (This is a complex process; even
identifying the ways that are on a given route, in order between two
points, is a fairly complex bit of code.)
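
To make the distance-table step concrete, here is a rough sketch in
Python. It is an illustration under stated assumptions, not the code I
actually run: the function name and the sample points are made up, the
coordinates are assumed to be already projected to planar metres, and
the route is assumed to be already assembled into an ordered list of
vertices (which, as noted, is the hard part).

```python
import math

def distance_table(route, pois):
    """Build a simple distance table.

    route: list of (x, y) vertices in metres, in travel order (assumed input).
    pois:  dict mapping a name to an (x, y) position in metres.
    Returns rows of (name, distance_along_route, distance_off_route),
    sorted by distance along the route.
    """
    rows = []
    for name, (px, py) in pois.items():
        best = None          # (offset from route, distance along route)
        travelled = 0.0      # route length covered before the current segment
        for (ax, ay), (bx, by) in zip(route, route[1:]):
            dx, dy = bx - ax, by - ay
            seg_len = math.hypot(dx, dy)
            if seg_len == 0.0:
                continue     # skip duplicate vertices
            # Parameter of the closest point on the segment, clamped to [0, 1].
            t = ((px - ax) * dx + (py - ay) * dy) / (seg_len ** 2)
            t = max(0.0, min(1.0, t))
            cx, cy = ax + t * dx, ay + t * dy
            offset = math.hypot(px - cx, py - cy)
            if best is None or offset < best[0]:
                best = (offset, travelled + t * seg_len)
            travelled += seg_len
        if best is not None:
            rows.append((name, best[1], best[0]))
    return sorted(rows, key=lambda row: row[1])

# Hypothetical L-shaped route and two points of interest beside it.
example_route = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0)]
example_pois = {"shelter": (110.0, 50.0), "spring": (50.0, 10.0)}
example_rows = distance_table(example_route, example_pois)
```

Even this toy version only works if the consumer can tell from the
tags which ways belong to the route and which nodes are points of
interest.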

To that extent, all mapping is 'tagging for the renderer'. It has to be.
If objects are not tagged with renderable features, they will not
be rendered. If a mapper wishes two objects to be treated differently
for rendering, routing, or other analysis, they must be tagged
differently in some way.  And if the tagging decisions are inconsistent
and contradictory, a renderer, router, or navigation engine will give
wrong answers. How could it be otherwise? Until the computer is
invented that can read the mappers' minds, the entered data are
all that a program has to guide it.
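
A small Python sketch of the point, with made-up style rules (the
rule table and class names are hypothetical, not taken from any real
renderer): a consumer can only map tags to a treatment, so two objects
with identical tags necessarily get identical treatment.

```python
# Hypothetical style rules: the first rule whose tags all match wins.
RULES = [
    ({"highway": "path", "sac_scale": "hiking"}, "hiking_trail"),
    ({"highway": "path"}, "generic_path"),
]

def render_class(tags):
    """Return the render class for a tag dict, or None if nothing matches."""
    for required, cls in RULES:
        if all(tags.get(key) == value for key, value in required.items()):
            return cls
    return None  # untagged or unrecognized objects are simply not drawn

trail = {"highway": "path", "sac_scale": "hiking", "name": "Example Trail"}
shortcut = {"highway": "path"}
mystery = {"note": "something the mapper knows about"}
```

Here `render_class(trail)` and `render_class(shortcut)` differ only
because the tags differ, and `render_class(mystery)` is `None` no
matter what the mapper had in mind.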

Now, if we're talking about tagging for the convenience of some
specific piece of rendering, routing, or analysis software, that makes
assumptions that don't match the mapper's view of the world, I agree
with you. If rendering or analysis software is holding us back from
rendering features that we care about, distinguishing features that we
wish to distinguish, or following a reasonably consistent data model,
then it is the consuming software that has to change.

Sometimes, however, those pieces of software are part of the
environment and simply must be lived with. I render my own maps when I
truly care about the specific presentation of the data.  Nevertheless,
OSM-Carto is our public face, and I lobby for its developers to make
decisions that I think are sensible, because the data I enter and use
will appear there, and I would like the default map to be as useful to
me - or better, to my understanding of what newcomers would want - as
we can make it.

Sometimes, that process moves slowly enough that interim tagging - not lies,
but rather imprecise descriptions - is warranted. The protected_area
kerfuffle went on for years (and is not quite resolved, although a
solution is on the horizon at long last), and a widespread consensus
emerged that 'leisure=nature_reserve' was an acceptable interim
solution until better tagging became available.  That's unquestionably
'tagging for the renderer', but where the alternative is 'OSM-Carto
not rendering large objects that hold considerable public interest',
the imprecise tagging appeared to be the lesser evil.

Except for these rare exceptions, tagging for any specific data
consumer is indeed unwarranted. Nevertheless, there are aspects where
any conceivable data consumer will face the same problems.
Inconsistent and contradictory data will yield inconsistent and
contradictory interpretations.

If not everyone tags the same way, the developer of a data analysis
has more work accommodating the different tagging styles.  That's
mostly OK, but really ought to be avoided where possible.  Certainly,
having multiple tagging styles out of mere ignorance is horribly poor
practice.  If the different tagging styles reflect different local
views or a genuine failed consensus, well, it's something that data
consumers will most likely have to live with.  In any case, it's less
obnoxious than having the same tagging mean two different
things. Doing that is actively hostile to data consumers, because it
offers them no way to distinguish the objects that are in a class of
interest from ones that are not.
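
The asymmetry can be sketched in a few lines of Python. This is only an
illustration: the helper names and the sample objects are made up, and
I am assuming the two styles in question are 'leisure=nature_reserve'
and 'boundary=protected_area'. Two tagging styles for one meaning cost
the consumer an extra condition; one tagging for two meanings leaves
nothing to test.

```python
def is_protected_area(tags):
    """Accept either tagging style; the cost is one extra condition."""
    return (
        tags.get("boundary") == "protected_area"
        or tags.get("leisure") == "nature_reserve"
    )

def ambiguous_tag_sets(objects):
    """Find tag sets that mappers used with more than one intended meaning.

    objects: list of (tags, intended_meaning) pairs. The meaning lives only
    in the mapper's head, so a real consumer never gets to see it.
    """
    meanings = {}
    for tags, meaning in objects:
        key = frozenset(tags.items())
        meanings.setdefault(key, set()).add(meaning)
    return {key: sorted(found) for key, found in meanings.items() if len(found) > 1}

# Hypothetical sample data.
sample = [
    ({"boundary": "protected_area"}, "state forest"),
    ({"leisure": "nature_reserve"}, "state forest"),
    ({"leisure": "nature_reserve"}, "private hunting club"),
]
clashes = ambiguous_tag_sets(sample)
```

The first two sample objects are handled by `is_protected_area`
without trouble. The last two are indistinguishable to any program,
which is exactly the case I am calling hostile.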

When I've asked tagging questions, it's generally been from the point
of view, "I have this set of objects. I wish to render/analyze them
in a distinct way from this other set of objects. I'm willing to tag
them any way you like; I'm willing to adapt the queries I make to
conform with the tagging that is chosen."  On this list, asking with
that attitude turns out uniformly to be futile. I hear mostly from
contingents who say, "there's no difference between class A and class
B, so you shouldn't tag them differently." (How, then, pray tell, can
I render them differently on renderings that I develop?); "you should
invent a whole new type of object, and not expect the existing
renderer to notice it for years, if ever." (also not acceptable, when
I am perfectly happy to have class A and class B both be variants of
some object that is recognized by the existing renderer); "you need to
'boil the ocean' and come up with a complete schema of all the ways in
which such objects may differ, and tag only the details," (making any
tagging conditional on having done work worthy of a master's thesis in
geography?) and similar comments.

In that sort of bogged-down discussion, which as I said, invariably
happens when I ask a question here, I find that the only answers that
offer coherent guidance are the ones that take the form, "I'm
interested in that sort of thing, too, and here's the tagging I
use/plan to use/imagine." In short, guidance from someone else who has
a wish to consume the data, and has their own ideas on how best to
structure the information. Those are the answers that I find
golden. They are from data consumers.

Nearly as good are the answers that go, "your rough-cut tagging scheme
is inconsistent with these other things that are already in the
database," or "your scheme will fail if an object has the following
attributes" (e.g., I need relations rather than mere tagging) - at
least they warn me of pitfalls. They are much more valuable if they
can offer a concrete suggestion about how to avoid the pitfalls. These
are the silver answers.

Answers advising me about "least worst" tagging for the existing
rendering chains are of bronze at best. They at least get something
shown on the popular renderings, but do nothing to inform me about how
to go beyond them. I imagine that is what you are thinking about when
I speak of taking the advice of data consumers. (I hope I've managed
to clarify!)

Answers whose summary is "you can't have that," are leaden.  They
merely increase the burden, without carrying any value.  But -
invariably - I see a lot of them.

My suspicion is that in this discussion, where I said "data consumers"
_sensu lato_, you interpreted the phrase _sensu stricto_, thinking I
was advocating tagging for the existing limitations of available data
consumers. I was more saying that the need to keep tagging unambiguous
is a requirement without which no data consumer can be satisfied, that
unnecessary proliferation of different tagging schemes imposes a
burden on all consumers, and that we ignore the needs of the consumers
at our peril.