[Tagging] Using multipolygons to map bays in Alaska

Thu Nov 15 21:11:19 UTC 2018

On Thu, Nov 15, 2018 at 3:02 PM Christoph Hormann <osm at imagico.de> wrote:

> > Even in that extreme example, having the spatial extent adds value.
>
> Data of subjective value for a specific application (like low quality
> label rendering) - yes, obviously.  Meaningful additional information
> about the verifiable geography - no, i don't think so.
>

By 'low quality', I presume you mean 'of a quality that can be achieved
algorithmically rather than by manual label placement by a skilled
cartographer?' Otherwise, what's your approach to higher quality?

> What you usually will want to start with is finding the closest point on
> the coastline.  You might not want to use the original coastline data
> but pre-process it to some extent to for example eliminate small
> isolated islands.
>
> If you just want to do a primitive importance rating based point label
> rendering like OSM-Carto you will then just take all coastlines within
> something like 3-5 times that distance and make a bay size assessment
> based on that - your choice how fancy you want to make this.  Simplest
> version is to use the distance right away but you can easily make this
> a bit more robust.
>
> If you actually want to place a label dynamically, the procedure will depend
> a lot on the style of label you want to use - horizontal single line,
> multi-line, rotated, curved - font size scaled or characters spaced
> according to the extent of the label.  This part can be somewhat awkward
> and inefficient because common spatial database systems are not
> specifically designed for this kind of task. What you need to do is
> essentially to 'probe' the coastline environment and determine the
> extent of the bay and where the desired label best fits in there.
>

I can see how that approach might sort of work in some cases, but it
strikes me as rather brittle.  A node anywhere in the middle of the Sea of
Cortez or the Gulf of Aqaba will be close enough to the nearest shoreline
that measuring off 3-5 times the distance will still not span the length of
the waterbody, meaning that identification of it as an area feature will
still be less than what we'd want.  A large-scale map of Eilat would still
have a really difficult time - even considering the larger context -
identifying the name of the waterbody off its coast. An area like the
Jamaica Bay example I gave earlier, with many islands and channels in a
tidal wetland, would also be extremely difficult to reason about in that
manner.
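
As a rough illustration of that concern, here is a minimal Python sketch of
the "nearest coastline, then look within N times that distance" idea as I
understand it from the description above. It is not anyone's actual
implementation. The gulf is an idealized rectangle with made-up dimensions
(200 units long, 20 wide, open at one end), and the function names
sample_gulf_coast and probe_extent are hypothetical.

```python
import math

def sample_gulf_coast(length, width, step=1.0):
    """Coastline points of an idealized gulf (assumption: a plain rectangle).

    Two parallel shores run along y=0 and y=width; the head of the gulf is
    closed at x=0 and the mouth at x=length is left open.
    """
    pts = []
    for i in range(int(length / step) + 1):
        x = i * step
        pts.append((x, 0.0))
        pts.append((x, width))
    for j in range(int(width / step) + 1):
        pts.append((0.0, j * step))
    return pts

def probe_extent(node, coast, factor):
    """Estimate the waterbody extent from a single label node.

    Finds the distance d to the closest coastline point, keeps every
    coastline point within factor * d, and reports the longer side of the
    bounding box of those points as the estimated extent.
    """
    d = min(math.dist(node, p) for p in coast)
    nearby = [p for p in coast if math.dist(node, p) <= factor * d]
    xs = [p[0] for p in nearby]
    ys = [p[1] for p in nearby]
    extent = max(max(xs) - min(xs), max(ys) - min(ys))
    return d, extent

if __name__ == "__main__":
    coast = sample_gulf_coast(length=200.0, width=20.0)
    node = (100.0, 10.0)  # label node in the middle of the gulf
    for factor in (3, 5):
        d, extent = probe_extent(node, coast, factor)
        print(f"factor {factor}: nearest shore {d}, estimated extent {extent}")
```

Because every coastline point considered lies within factor * d of the node,
the estimate can never exceed 2 * factor * d. For a waterbody that is much
longer than it is wide, that bound falls well short of the true length no
matter where along the centreline the node is placed.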

It is only once the spatial extent is determined that a renderer can do a
good job of label placement.  Of course, what the renderer does with that
information is highly dependent on the style of label to be used, but any
rendering that's at all more sophisticated than OSM-Carto's 'place a
single- or multi-line upright label on a point' needs the information.
You've given a very clever 'second best' approach to determining that
information when only a point is available - and I'm likely to find myself
using it because of the current state of the map data, so thank you for
that.

Nevertheless, I think it would be much preferable to allow mappers to
communicate their intent. When mappers add a bay, inlet, gulf, fjord, ...
to the map, they can be presumed to know what extent they would like the
object to have. What harm does it do if we give them a way to describe that
knowledge to others? Why the insistence on restricting the data model so
that we must use a brittle reconstruction technique rather than allowing
mappers to enter the extent of the object and data consumers to see it?

The two arguments that I hear so far appear to amount to:

- if any portion of an object's boundary is spatially indefinite, then that
object may not be represented as an area.

Farewell to several counties and townships in the northern part of my
state, then.

- carrying the data for bays is not scalable.

In that case, we need to open a much broader discussion of general
categories of data that we need to exclude from OSM in order to manage the
size of the database or the complexity of the computations. I surely don't
want to start seeing conflicts between better- and worse-mapped regions
over perceived inequities in the allocation of server resources! If instead
of data volume, the concern is the complexity of processing large objects,
then it would perhaps be better addressed by a rule that "no area feature
should cover more than X km², have a mapped boundary longer than Y km, or
require more than Z nodes for its representation" - and then work out how
we want to handle exceptions like countries, large islands, and large lakes
and seas. (That's then the right time to discuss what to do about bays,
straits, estuaries, and similar features.) The argument would also have to
distinguish between the cost of maintaining the data on the server - the
real OSM - and the cost of processing the data in the OSM-Carto rendering
chain - OSM's public face. If there's a way to have the information in the
actual OSM database but not in the main renderer, or have the pipeline
generalize it to a lower-cost but less informative form, that would be
better than discarding it entirely.