[Tagging] Fuzzy areas again: should we have them or not?
Anders Torger
anders at torger.se
Wed Dec 23 23:20:31 UTC 2020
I've been thinking more about these issues and I think what we need is a
somewhat more pragmatic interpretation of the verifiability principle. What
is the verifiability principle? It's that we should only add data that
two different mappers, using different data sources, can map independently
and still arrive at the same result. The reason we have it is
that we want OSM to be an egalitarian cooperation that does
not need a policing entity. We're afraid that fuzzy areas, or
"non-verifiable geometry", would lead to edit wars, that we would need
policing to stifle them, and that this would break the egalitarian
cooperation idea.
This means that large bays with unclear borders like Vestfjorden outside
Norway should be mapped as a point somewhere in the middle of it (where
it's verifiable that it's clearly inside), and not as a polygon showing
the approximate span of it. If we dive deeper into this, however, we
discover that what is verifiable and what is not is far from clear-cut.
If we keep to bays and straits, what about all those bays that have a
narrow opening to the surrounding sea/lake? It's pretty well-defined
where the end of that is even if it hasn't been documented anywhere
ever, so maybe that can be considered verifiable geometry. However as
bays open up and get wider and wider and less defined, the polygon would
differ more and more between different mappers, and we enter the range
of what we traditionally call unverifiable geometry. Vestfjorden is an
extreme example as it's almost like open sea. Still no edit wars on that
though.
Then we have things like the Skagerak Sea between Sweden, Norway and
Denmark. It has some form of defined borders, as it ends at defined
villages and an island group. However, the villages span some area,
as does the island group. I just had a look at the polygon. I can see
that the mapper knows about this border definition. I would have placed the
corners somewhat differently if I had made it, but I wouldn't say that
the polygon is wrong, so I leave it untouched. I'd say that this is
verifiable geometry even with a quite strict interpretation of
verifiability, although the border can be put in a 3 km wide range or
so. Not sure everyone would agree though.
The main reasons why you would want to map these features as polygons
rather than staying on the safe side and mapping them all as points are that
1) it's more satisfying to map complete information (and showing the
extent is more complete) and 2) a point won't work for either
cartography or queries about these features. If we map Skagerak as a
point it would only become a tiny label in the sea, totally useless for
a map. Not only in OSM-Carto, but in any map renderer, as there is no
size information attached to that point. Cartography would do quite well
with a size specification added to the point, but queries need a
polygon; otherwise it will not be possible to ask questions like
"is Kungshamn in Sweden at Skagerak?". I don't think we have any
system that can answer such questions today, but if one is
developed in the future it will need that data. And as OSM is about
geodata this seems like a most relevant goal, provided it can be fulfilled
within the scope.
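To make the query argument concrete, here is a minimal sketch in Python
of why a containment question needs an extent. Everything in it is an
assumption for illustration: the coordinates are made up (not taken from
OSM), the names sea_outline, sea_label_node, place_a and place_b are
hypothetical, and longitude/latitude are treated as flat x/y. A real
query system would also need some tolerance for places that sit on the
shore rather than inside the outline.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (lon, lat), treated as planar for this sketch


def point_in_polygon(point: Point, ring: List[Point]) -> bool:
    """Ray-casting test: count how many ring edges a ray going east crosses."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical rough outline of a sea area (made-up coordinates, not OSM data).
sea_outline = [(0.0, 0.0), (10.0, 0.0), (12.0, 6.0), (4.0, 9.0), (-1.0, 5.0)]
# The same feature mapped as a single label node carries no extent at all,
# so there is nothing to test a location against.
sea_label_node = (5.0, 4.0)

place_a = (6.0, 3.0)    # hypothetical location within the outline
place_b = (20.0, 4.0)   # hypothetical location far outside it

a_at_sea = point_in_polygon(place_a, sea_outline)
b_at_sea = point_in_polygon(place_b, sea_outline)
print(a_at_sea, b_at_sea)
```

Even a roughly drawn outline gives a usable yes/no answer for locations
that are clearly inside or clearly outside. Only locations near the
loosely defined parts of the border are affected by how a particular
mapper drew it.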
Like it or not, many mappers care about mapping all they know, and
getting good maps as end results, and this is a strong motivator to
include also what many consider non-verifiable geometry. Put in another
way, if we force all these features to be only mapped as a point we
quite significantly reduce the scope of what OSM can provide for map
makers. Some believe this to be a small issue, others think it's larger.
I belong to the latter group. Even if we think it's a quite small issue,
that should be weighed against what happens if we just relax a bit, opt
for a pragmatic interpretation and let things evolve along the diverse
paths mappers take.
So what about the edit wars, the policing and the egalitarian
cooperation?
I think that a strict interpretation of the verifiability principle
and its rationale is not that believable any longer. Why? There is
already plenty of non-verifiable geometry in the database, OSM-Carto
already renders and thus rewards some of it (like bays and straits),
the edit war danger seems greatly exaggerated (often looking at not-yet
existing extreme examples rather than the several thousands of already
existing examples that haven't caused any edit wars), and policing
already exists for other reasons out of necessity (vandalism and other
malicious edits). There are also clearly already social hierarchies with
some very influential and powerful members; there is no such
thing as a completely egalitarian community. You have tremendous power
if you are a lead developer in some of the OSM software projects, or if
you are in control of what gets done or not in the most popular
renderers. Even if we look only at the group of mappers who are not
involved in development and delivery, social hierarchies have
already formed in the local communities, and the local communities
are important for maintaining order and protecting against vandalism. It's
not a chaotic free-for-all, even if it looks like it on the surface.
In that context it seems unwise to push for a strict interpretation that
outlaws what many mappers already do a lot of, and something that adds
real value to the OSM data.
My view is that these "unverifiable geometries" actually are verifiable
if you just have a bit more pragmatic interpretation. It is verifiable
that these features exist, their rough extent is verifiable, and it's
verifiable which borders are well defined and which are more
loosely defined. All in all, the properties of the feature are verifiable
and can be represented. We could choose to have a "fuzzy" tag or similar,
but I think it's enough to tie it to the type of object it is. Anyone
understands based on how a polygon is drawn and what type it is if it's
exact or not. It's not a problem if a polygon will vary a bit mapper to
mapper. It's only a problem if it becomes an edit war, and those would
only occur in rather specific and rare cases.
So should we just throw verifiability under the bus then? No. These types
of geometries are still something any mapper should handle with care. The
key question to ask before you draw one is "how likely is it that there
will be an edit war on this feature?". If it's highly unlikely, just go
ahead and draw it. If it's likely or you are not sure, at least discuss
it with your community first.
/Anders
On 2020-12-23 18:23, Martin Søndergaard wrote:
> While some might not agree with the tone of Anders, I do think his
> "enthusiasm" has resulted in the most interesting discussion I have
> seen on this list yet. And I want to give a few of my thoughts as well.
>
> I think the discussion so far has been too focused on "does OSM need
> fuzzy areas?" while the reality is that the OSM database is already
> filled with fuzzy data; both areas and nodes. And here I don't mean
> "fuzzy" in the sense of "everything we map has some inherent error"; I
> mean real fuzzy data.
>
> First, we have the obvious ones:
>
> * natural=bay with ~60,000 entries
> * natural=strait with ~4,000 entries
> * natural=reef with ~27,000 entries
> * natural=glacier with ~56,000 entries
> * place=archipelago with ~1,300 entries
> * place=sea and place=ocean with ~150 entries
>
> All of these are "fuzzy" features which have no verifiable exact
> border, and currently they just exist in the database with no
> indication that they are in fact "fuzzy" features.
> Often these features are also added as nodes instead of areas (probably
> because the exact area is impossible to define).
>
> On Tue, 22 Dec 2020 at 09:43, stevea <steveaOSM at softworkers.com> wrote:
>
>> "Names in nature" is an interesting, complex, challenging, yes, even
>> strategic topic. I think we can get closer to "better," here on this
>> list, with good, respectful, effective dialog. I look forward to
>> that.
>
> In my opinion this problem is in no way limited to "names in nature".
> Practically all place=* features (except the "Administratively declared
> places" category), such as City, Town, Village, Hamlet, etc. are
> "fuzzy" features, but no one seems to talk about them as such. These
> places are either defined as:
>
> * An area: This is especially common with smaller settlements where the
> place=village or place=hamlet is just attached to some residential
> landuse area. But in every case where I have seen this type of tagging
> the resulting data is flat-out incorrect. Suddenly the small park or
> commercial area in the center of the village isn't actually a part of
> the village; at least according to the tagging in the database. Or if
> it is a small hamlet with spread out houses (e.g. with small areas of
> farmland or meadows in between) I have several times seen the
> residential landuse being abused by connecting all the houses with a
> thin strip of landuse along a road.
> * A node: Here some person just defines an arbitrary point as the
> "center" of the village or town or city. Often the point is just placed
> where it will look the best on the map, i.e. "tagging for the
> renderer". But even if you try to place it in the most correct center
> of the city, which center is that? Should it be the square in front of
> the town hall, the oldest part of the city, the infrastructure center
> of the city such as a central train station (this one might make a lot
> of sense for certain routing applications, but for many people it will
> just be strange)? There is no correct answer.
>
> These features are by definition "fuzzy". I am not saying it is easy to
> define even a fuzzy area for a large city, but right now Copenhagen is,
> according to OSM data, just a single point placed arbitrarily in the
> front garden of a Copenhagen University building. Why? Because it
> results in a good place for the label on a map.
>
> People keep mentioning the ideals of "Map what's on the ground" and
> "Every feature has to be exact and verifiable", but combined with the
> reality of tons of "fuzzy" data already existing in the database the
> result is a kind of "false accuracy". The natural=bay or place=city or
> place=locality feature probably isn't located exactly where OSM says it
> is, and it is likely not limited to the exact location of the single
> node on which it is defined. But currently there is no structured way
> of knowing this.
>
> You can either make a new fuzzy=yes tag, or simply specify that
> place=archipelago or place=town will always be fuzzy tags and editors
> should warn users not to make fuzzy areas too complicated or connect
> them with "exact" features such as landuse areas or roads.
>
> /Martin Søndergaard