[Tagging] Can OSM become a geospacial database?

Mon Dec 10 02:01:57 UTC 2018

> Can OSM become a geospatial database?

It currently fits almost any definition of a 'geospatial' database. Even if
you ignore whatever intrinsic properties you might choose to define a
'geospatial' database, extrinsic properties would define it as such: for
example, UNHCR, the U.S. National Geospatial-Intelligence Agency, the U.S.
National Park Service, and probably thousands of others use it to perform
C.R.U.D.
<https://en.wikipedia.org/wiki/Create,_read,_update_and_delete#Database_applications>
operations on a continuous basis.

That being said, from a software development perspective, it perhaps more
resembles a loosely federated database system
<https://en.wikipedia.org/wiki/Federated_database_system#Heterogeneity>. So
the technical approaches are not as straightforward as with an ordinary
database; one should probably treat it as a data lake
<https://en.wikipedia.org/wiki/Data_lake> or a nascent data warehouse
<https://en.wikipedia.org/wiki/Data_warehouse#History> - if one were
unkind, sometimes it can seem like a data swamp
<https://dl.acm.org/citation.cfm?id=3209911>. In practice, this means a
chain of ETL <https://en.wikipedia.org/wiki/Extract,_transform,_load>
operations, rather than a single straightforward database query. And what
makes this even weirder is that, in some ways, OSM is a hybrid of a
relational <https://en.wikipedia.org/wiki/Relational_database> and a graph
<https://en.wikipedia.org/wiki/Graph_database#Labeled-Property_Graph>
database.
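
To make the "chain of ETL operations" idea concrete, here is a minimal
sketch in Python. Everything in it is an assumption for illustration: the
sample elements are made up (only their shape follows the OSM node / way /
relation model), and extract, transform, and load are hypothetical helper
names, not part of any OSM tool.

```python
# Hypothetical sample data shaped like OSM elements (not real OSM objects).
RAW_ELEMENTS = [
    {"type": "node", "id": 1, "tags": {"amenity": "cafe", "name": "Corner Cafe"}},
    {"type": "way", "id": 2, "nodes": [10, 11, 12], "tags": {"waterway": "stream"}},
    {"type": "way", "id": 3, "nodes": [12, 13],
     "tags": {"waterway": "stream", "name": "Mill Creek"}},
    {"type": "relation", "id": 4,
     "members": [{"type": "way", "ref": 2, "role": ""},
                 {"type": "way", "ref": 3, "role": ""}],
     "tags": {"type": "waterway", "waterway": "stream"}},
    {"type": "node", "id": 5, "tags": {}},
]


def extract(elements, key, value):
    """Extract: keep elements of any type that carry the tag key=value."""
    return [e for e in elements if e.get("tags", {}).get(key) == value]


def transform(elements):
    """Transform: flatten each element into a uniform, relational-style row."""
    rows = []
    for e in elements:
        rows.append({
            "osm_id": f'{e["type"]}/{e["id"]}',
            "name": e["tags"].get("name"),
            # Graph-style links: node refs for ways, member refs for relations.
            "links": e.get("nodes") or [m["ref"] for m in e.get("members", [])],
        })
    return rows


def load(rows):
    """Load: index rows by id, standing in for a write to a real data store."""
    return {row["osm_id"]: row for row in rows}


streams = load(transform(extract(RAW_ELEMENTS, "waterway", "stream")))
```

The rows are relational in shape, while the "links" column keeps the
graph-like references between elements, which is the hybrid character
described above.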

> Right now OSM is a collection of dots and lines with some generic tags for
> rendering them on a map. They do compile into nice maps but does it really
> work when it comes to searching for objects of real life categories? ...

Superficially, that seems to be the case, but only because of expectations.
Expanding the perspective, IMHO, it is actually fairly robust and
sophisticated considering what it is required to do. It actually permits
use cases which would be intolerable for mundane systems.

> To wrap it up it is hard to impossible to get objects of some real life
> category from OSM database in order for example to highlight them on a
> map or to list them in search results.

I would agree that it is hard, but not impossible. It is certainly hard to
do in a single step for the entire data space. In the 'stream' example, one
has to work across the basic data type elements
<https://wiki.openstreetmap.org/wiki/Elements> of nodes, ways, and
relations, then across keys <https://taginfo.openstreetmap.org/keys>, tags
<https://taginfo.openstreetmap.org/tags>, and relation types
<https://taginfo.openstreetmap.org/relations>. And even within those, there
are wildly different purposes, from base geometric meanings such as
multipolygon <https://taginfo.openstreetmap.org/relations/multipolygon>
to high level abstractions such as surveillance
<https://taginfo.openstreetmap.org/relations/surveillance>. So, if one were
building some sort of generic software utility, one would have to inventory
the relative prevalence of the structures above and bound the problem
accordingly, along with leveraging aspects like the geometric bounding box.
Once you get down in the weeds
<https://dictionary.cambridge.org/us/dictionary/english/in-the-weeds>, as
with 'amenity', you are in the NLP
<https://en.wikipedia.org/wiki/Natural_language_processing> realm, and
would have to supplement from an external utility like WordNet
<https://en.wikipedia.org/wiki/WordNet> - for example using synsets
<https://www.geeksforgeeks.org/get-synonymsantonyms-nltk-wordnet-python/>
and semantic distance <https://en.wikipedia.org/wiki/Semantic_similarity>.
... see OpenStreetMap Semantic Network
<https://wiki.openstreetmap.org/wiki/OSM_Semantic_Network>.
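
As a rough sketch of "inventory, then bound the problem", the Python below
counts which element type / key combinations occur in a sample, then builds
one bounded query covering nodes, ways, and relations. It assumes the
Overpass API as the query endpoint, which is my choice for illustration and
not something discussed above; the sample elements, the bounding box
values, and the function names inventory and build_overpass_query are
likewise hypothetical. Nothing is sent over the network.

```python
from collections import Counter


def inventory(elements):
    """Count (element type, tag key) pairs to see which structures dominate."""
    counts = Counter()
    for element in elements:
        for key in element.get("tags", {}):
            counts[(element["type"], key)] += 1
    return counts


def build_overpass_query(key, value, bbox, timeout=25):
    """Build an Overpass QL query for key=value on nodes, ways, and relations.

    bbox is (south, west, north, east), the order Overpass QL expects.
    """
    south, west, north, east = bbox
    if south >= north:
        raise ValueError("bbox must be (south, west, north, east)")
    area = f"({south},{west},{north},{east})"
    selector = f'["{key}"="{value}"]'
    statements = "".join(
        f"  {element_type}{selector}{area};\n"
        for element_type in ("node", "way", "relation")
    )
    return f"[out:json][timeout:{timeout}];\n(\n{statements});\nout tags center;\n"


# Hypothetical sample and bounding box, for illustration only.
SAMPLE = [
    {"type": "way", "id": 2, "tags": {"waterway": "stream"}},
    {"type": "way", "id": 3, "tags": {"waterway": "stream", "name": "Mill Creek"}},
    {"type": "node", "id": 1, "tags": {"amenity": "cafe"}},
]
BBOX = (46.15, -123.65, 46.25, -123.50)

counts = inventory(SAMPLE)
query = build_overpass_query("waterway", "stream", BBOX)
```

The inventory tells you which element types are worth querying at all for a
given key, and the bounding box keeps each query small enough to be one
link in a longer chain rather than a scan of the whole data space.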

> There are two workarounds used right now. The first one is to bind some new
> tags to local categories ... The second one is to put category name into
> "name" tag, e.g. "Liberty avenue", "Blue lake", "South park". ...
> I envision the following solution here.
> * First of all, the "name" tag should contain proper name only.

I agree, but people are people, and if you ask three ordinary people to
name something, you'd get three different 'names' at different levels of
abstraction
<https://rationalwiki.org/wiki/Prototype_theory#Basic_level_categories>
(subordinate, basic, or superordinate). Point and ask three people "What's
that?" and you'll get "The Columbia River", "North Channel", or "Knappa,
Knappa Slough", so even the proper names will vary.

> * Secondly, introduce a new tag for the real life language specific
> category name. I know that "name:prefix/postfix" key was originally
> introduced for another purpose but it can be a candidate here as well. Note
> that in some languages the place of category name relative to the proper
> name matters.

Because of the complexities noted previously, the weight of legacy
information, and maintenance complexity (occasional refactoring), a more
or less parallel scheme would be unrealistic inside of OSM. Possibly one
of the OSM semantic projects
<https://wiki.openstreetmap.org/wiki/Category:Semantics> might provide
similar capability. Implementing as you describe would be the Mother of All
Automated Edits.

> * Thirdly, in order to make the life of renderers simple, introduce one
> more tag for holding the name which can be displayed on maps as is without
> any modifications, e.g. "display_name". This tag may contain whatever
> content is considered locally appropriate specifically for rendering on
> maps.

I'm not sure I understand this, but superficially it seems to break the
convention of separating data from symbolization (the latter being heavily
dependent on the specifics at the endpoint).
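
To illustrate what I mean by keeping data and symbolization separate, here
is a small Python sketch where the label is composed at the endpoint
instead of being stored as a ready-made "display_name". The per-language
templates and the "category:<lang>" key are hypothetical, invented for this
example; only "name" and "name:<lang>" are existing OSM keys.

```python
# Hypothetical per-language templates: the renderer, not the data, decides
# whether the category goes before or after the proper name.
LABEL_TEMPLATES = {
    "en": "{name} {category}",
    "ru": "{category} {name}",
}


def render_label(tags, lang):
    """Compose a map label from separate proper-name and category values."""
    name = tags.get(f"name:{lang}") or tags.get("name")
    if not name:
        return None
    # "category:<lang>" is a made-up key used only for this illustration.
    category = tags.get(f"category:{lang}")
    if not category:
        return name
    template = LABEL_TEMPLATES.get(lang, "{name} {category}")
    return template.format(name=name, category=category)


english = render_label({"name": "Blue", "category:en": "lake"}, "en")
russian = render_label({"name:ru": "Байкал", "category:ru": "озеро"}, "ru")
```

A different endpoint could use other templates, or drop the category
entirely, without any change to the stored data.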

There are people in the community who are *far* more knowledgeable than me
on these themes; I suggest you reach out to them.

For me, a mental model of a monolithic OSM isn't useful. This problem isn't
unique to OSM: even what appear to be simple concepts, like Employee Name in
an enterprise database, become very complex when applied to different
cultures - I once reconciled a record for a nurse who had 13+ different
versions, all perfectly 'legal' in the corporate records.

Michael Patrick