[Tagging] Feature Proposal - RFC - Cluster

Friedrich Volkmann bsd at volki.at
Fri Jan 16 13:06:17 UTC 2015


On 16.01.2015 10:10, Lukas Sommer wrote:
> What you propose is an algorithm that does a sort of “guess”. For
> doing some sort of guess, we don’t need to introduce a new relation.
> That could be done also without a relation.

Look at the examples. You cannot represent the data without a relation.

There may still be some guessing involved in the interpretation of the data,
but it's less guessing than without the relation.

> Introducing a new relation should lead to better data and more
> well-structured information. There should be a certain gain in
> information. This will work only if the relation proposal is really
> clear. If not, probably it will happen the same things as with the
> “site” relation (=there are “site” relations in the database, but
> little software support for this).

The lack of software support for site relations has 2 reasons:
1) chaotic definition
2) lazy renderer developers

We certainly cannot influence (2), but I tried to make the proposal as clear
and straightforward as possible, thereby improving on (1).

> I think that Никита has touched a very important question: Is there
> inheritance? That is: Are tags on the relation also considered tags of
> the members?

No!

> And how to deal with conflicts? This is not as trivial as
> it sounds. I’ll take some of your examples:
> 
> – “name=Schwedenhöhlen” + “natural=cave_entrance” on the relation.
> “name=Schwedenhöhle 1…” + “natural=cave_entrance” on the nodes.

That's a tagging error. It's like a church tagged highway=motorway. There's
no point in pondering over that.

> – “name=Schwedenhöhlen” on the relation. “name=Schwedenhöhle 1…” +
> “natural=cave_entrance” on the nodes. You would have to go to the
> nodes, interpret them and use the same rendering style also for the
> relation. Probably also not as trivial as it sounds. What if not all
> nodes have the same tag (example: a natural=cliff within this
> group)? Is this considered an error? Does this prevent the rendering?
> Your proposal was to count: More natural=cave_entrance occurrences
> than natural=cliff occurrences would make the relation render as a
> natural=cave_entrance. For me, this does not look like an improvement
> of the current data quality, but rather like a degradation of the data
> quality.

When you dig into interpretations and algorithms, things start looking
complex. This is not specific to type=cluster relations. Take a simple
building=yes. It becomes more complex as soon as it is accompanied by other
tags. It becomes even more complex when you think about label placement and
collision. Not to mention generalisation (e.g. simplification at lower zoom
levels). You could end up running out screaming just because of a simple
building=yes tag.

I suggest we make a clear distinction between tagging (i.e. the formal
representation of data) and how to use these tags in applications (i.e.
algorithms, rendering rules, data mining etc.). There's mapping (data
providers) on one side, and applications (data consumers) on the other side.
OSM data and wiki definitions serve as the interface. That interface is what
the tagging mailing list is all about. Our goal here is to work on the
interface. We don't need to dig too much into mapping processes or
application internals. We just need to provide an interface that all can
work with. My sample algorithms just prove that it is possible to work with
the interface. It's up to developers to actually use those algorithms, or
other algorithms, or no algorithms at all.
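
To illustrate the kind of consumer-side algorithm meant here, the following
is a minimal Python sketch of the majority count discussed above: the feature
tag occurring most often among the members decides how the cluster label is
styled. The function name dominant_feature, the representation of members as
plain dicts, and the default choice of the "natural" key are assumptions made
for this sketch only; they are not part of the proposal.

```python
from collections import Counter

def dominant_feature(member_tags, keys=("natural",)):
    """Return the most common (key, value) pair among the members, or None.

    member_tags: list of dicts, one tag dict per relation member.
    keys: feature keys to look at (hypothetical default, not from the proposal).
    """
    counts = Counter(
        (key, tags[key])
        for tags in member_tags
        for key in keys
        if key in tags
    )
    if not counts:
        return None  # no member carries any of the inspected keys
    return counts.most_common(1)[0][0]

# Two cave entrances and one cliff, as in the example under discussion.
members = [
    {"natural": "cave_entrance"},
    {"natural": "cave_entrance"},
    {"natural": "cliff"},
]
print(dominant_feature(members))
```

A consumer is free to use a different rule, or to skip the label entirely
when there is no clear majority.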

So when you just look at the proposal and forget about algorithms, the
proposal provides a fairly clear-cut interface. Tags on the relation refer
to the group as a whole, and tags on the members refer to the individual
members. It could not be more simple.
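
As a sketch of what "no inheritance" means for a data consumer, the lookup
could be modelled roughly like this. The Cluster class and its method names
are hypothetical, and the second member name is made up for illustration;
only the relation name and "Schwedenhöhle 1" come from the example above.

```python
from dataclasses import dataclass, field

@dataclass
class Cluster:
    tags: dict                                    # describe the group as a whole
    members: list = field(default_factory=list)   # one tag dict per member

    def group_tag(self, key):
        # Looked up on the relation only; never filled in from the members.
        return self.tags.get(key)

    def member_tag(self, index, key):
        # Looked up on the member only; never falls back to the relation.
        return self.members[index].get(key)

caves = Cluster(
    tags={"type": "cluster", "name": "Schwedenhöhlen"},
    members=[
        {"natural": "cave_entrance", "name": "Schwedenhöhle 1"},
        {"natural": "cave_entrance", "name": "Schwedenhöhle 2"},  # made-up name
    ],
)

print(caves.group_tag("name"))      # name of the group
print(caves.group_tag("natural"))   # None: the relation itself has no feature tag
print(caves.member_tag(0, "name"))  # name of the individual cave
```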

> At least, I would suggest to treat such cases as “invalid”
> relations which should be ignored by data consumers.

This restriction would not only complicate things a lot, it would also make
almost all uses "invalid". Let's stay with the Schwedenhöhlen example. The
individual Schwedenhöhlen have different cave:ref=* and different cave
lengths. That would already make the group invalid.

> –  “name=Schönefelder Seen” + “natural=water” on the relation. No tags
> at the member areas.

Tagging error. You may do this with a multipolygon relation, but not with a
cluster relation.

> –  “name=Schönefelder Seen”  on the relation. “natural=water” at the
> member areas. Members propagate their tags to the relation. So a
> renderer can determine the color for rendering the name of the
> relation. Sounds easy for the “Schönefelder Seen”. But this solution
> would create conflicts when applied to the caves, because the cave
> relation members have also individual names, and these names would
> conflict with the relation name.

This is a normal label placement conflict. Nothing to worry about. Renderers
handle that billions of times.

> I understand that you want to have a simple and generic solution. But
> I doubt that this will create any benefit in the real practice. And I
> fear – if it is as generic as you propose – it will end up like the
> “site” relation.

I am not too optimistic about upcoming renderer support, but having the
features correctly mapped in the database is a big step forward, and maybe
support in search engines and topographical maps (where the relation is most
useful) will follow when they notice it's useful.

> Your original idea was to give a common name to various
> objects. So, logically, the relation could be limited for usage
> together with the tag “name”: Something like type=common_name for
> relations, together with name=* on the relation.

I don't want to narrow it down to the name. A cluster can also have a
cave:ref or a protection status or suchlike. The proposal page also
incorporates an example with a common address.

Of course, a relation just for the name would spare application developers
from thinking about other tags, but at the expense that we still cannot
represent all real-world data and therefore still need a type=cluster
proposal later on. The result is that we'll have two relation types instead
of one.

I feel that mappers care too much about whether renderer developers will mind
implementing something. We tend to serve them data on a silver platter to
please them, with the result that they get used to the silver platter and
cease working on algorithms. The only way to get out of this is to throw
away the silver platter so they need to come back to normal life and use the
raw data.

> But use it only together with special tags like natural=group_of_lakes
> – these special tags need to be introduced also. type=cluster could
> also be used together with the existing place=archipelago. So you
> would have to make a separate tagging description for each feature (group
> of lakes, group of cliffs …) and this leads to a clear documentation
> and clear rules.

Martin already suggested this, but the number of new tags would increase
indefinitely. This would become evil and unmaintainable in the long term.

-- 
Friedrich K. Volkmann       http://www.volki.at/
Adr.: Davidgasse 76-80/14/10, 1100 Wien, Austria


