[OSM-dev] tagtransform for OSM - An effort make tagging and using OSM data easier; bridging different worlds together

Imre Samu pella.samu at gmail.com
Sat Dec 7 00:56:45 UTC 2019


> There are some available options I see: ....
> What do you prefer or do you have another options to add

IMHO:
It is a hard problem (we need more use cases to find the global optimum).
On the other hand, I am interested in your proof-of-concept solutions.

We probably need a "Metadata Working Group" to collect this type of
information and store it in a "central metadata repository"(?), with a
good license.

Related to this discussion:
https://www.openstreetmap.org/user/SomeoneElse/diary/391484



Sharing some use cases:

*iD Editor - tag-transformation metadata*

The iD editor has a lot of metadata that we can probably analyze.
The tag transformation metadata is very simple, and it has *~302*
transformation rules.
Data license: ISC

Example:

{   "old": {"aerialway": "canopy"},
    "replace": {"aerialway": "zip_line"}
},
{   "old": {"aeroway": "aerobridge"},
    "replace": {"aeroway": "jet_bridge", "highway": "corridor"}
},
{ "old": {"access": "public"},
  "replace": {"access": "yes"}
},

or a little more complex:

{ "old": {"building:type": "*"},
    "replace": {"building": "$1"}
},

Link: https://github.com/openstreetmap/iD/blob/master/data/deprecated.json
Deprecated tags on the OSM wiki:
https://wiki.openstreetmap.org/wiki/Deprecated_features
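
To make the rule format concrete, here is a minimal Python sketch of how
such "old"/"replace" rules could be applied to a tag dictionary. It assumes
that "*" matches any value and that "$1" stands for the value matched by
"*"; that is only my reading of the examples above, not iD's actual
implementation. The names apply_rule, transform and RULES are made up for
illustration.

```python
def apply_rule(tags, rule):
    """Return a new tag dict if `rule` matches `tags`, else None."""
    old, replace = rule["old"], rule["replace"]
    captured = None
    for key, value in old.items():
        if key not in tags:
            return None
        if value == "*":
            captured = tags[key]  # remember the wildcard value for "$1"
        elif tags[key] != value:
            return None
    # Drop the matched old keys, then add the replacement tags.
    result = {k: v for k, v in tags.items() if k not in old}
    for key, value in replace.items():
        if value == "$1" and captured is not None:
            result[key] = captured
        else:
            result[key] = value
    return result


def transform(tags, rules):
    """Apply the first matching rule; return a copy of tags if none match."""
    for rule in rules:
        new_tags = apply_rule(tags, rule)
        if new_tags is not None:
            return new_tags
    return dict(tags)


# The four example rules quoted above.
RULES = [
    {"old": {"aerialway": "canopy"}, "replace": {"aerialway": "zip_line"}},
    {"old": {"aeroway": "aerobridge"},
     "replace": {"aeroway": "jet_bridge", "highway": "corridor"}},
    {"old": {"access": "public"}, "replace": {"access": "yes"}},
    {"old": {"building:type": "*"}, "replace": {"building": "$1"}},
]
```

With these rules, transform({"building:type": "church"}, RULES) gives
{"building": "church"}. Whether a rule should overwrite an already existing
target key is a design decision this sketch does not try to settle.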


*iD Editor - Discarded tags*
https://github.com/openstreetmap/iD/blob/master/data/discarded.json (~46 tags)
See also other discardable tags (JOSM, Potlatch, ...):
https://wiki.openstreetmap.org/wiki/Discardable_tags


*Imposm3 has special tag transformation types*

Tag exclusion:  exclude: [created_by, source, "tiger:*"]
It also has some special "data mapping"[1] rules:
- bool / direction / enumerate / categorize / ...
https://imposm.org/docs/imposm3/latest/mapping.html#column-types
[1] https://en.wikipedia.org/wiki/Data_mapping

I would also like to add "COALESCE()" transformations to the
transformation rules.
Example:
   "name" =  coalesce("name","name:en","name:de","name:fr")

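Outside of SQL, the same idea could look like the following Python sketch.
The function name coalesce and the plain-dict representation of tags are
assumptions for illustration; like SQL COALESCE, it only skips missing
values and does not treat empty strings specially.

```python
def coalesce(tags, *keys):
    """Return the value of the first key present in `tags`, or None."""
    for key in keys:
        value = tags.get(key)
        if value is not None:
            return value
    return None


tags = {"name:de": "Wien", "name:fr": "Vienne"}
display_name = coalesce(tags, "name", "name:en", "name:de", "name:fr")
```

Here display_name ends up as "Wien", because "name" and "name:en" are
missing and "name:de" is the first key that has a value.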

*The giggls/mapnik-german-l10n repo has street name abbreviation rules*
https://github.com/giggls/mapnik-german-l10n/blob/master/plpgsql/street_abbrv.sql
Example:
  abbrev=regexp_replace(abbrev,'(?<!^([0-9]+([èe]?r)?e )?)Avenue\M','Ave.');
  abbrev=regexp_replace(abbrev,'(?!^)Boulevard\M','Blvd.');
  abbrev=regexp_replace(abbrev,'Crescent\M','Cres.');
  abbrev=regexp_replace(abbrev,'Court\M','Ct');
  abbrev=regexp_replace(abbrev,'Drive\M','Dr.');
  abbrev=regexp_replace(abbrev,'Lane\M','Ln.');


*QA Tools*
We can also collaborate with QA tools to validate "values":
https://github.com/osm-fr/osmose-backend/blob/master/plugins/Website.py
For example, URL-type values: if a value is not a valid RFC URL, we can
safely drop it.
Keys with URL-type values:
*"contact:webcam" "contact:website" "facebook" "url" "website:mobile"
"website:stock" "website"*
Other QA tools: https://wiki.openstreetmap.org/wiki/Quality_assurance
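
A rough Python sketch of such a URL check is below. It is not a full RFC
3986 validator and it is not how the Osmose plugin works; it only assumes
that a usable value has an http/https scheme, a host name and no
whitespace. The names URL_KEYS, looks_like_url and drop_invalid_urls are
hypothetical.

```python
from urllib.parse import urlsplit

# Keys expected to hold URL-type values (the list from this mail).
URL_KEYS = {
    "contact:webcam", "contact:website", "facebook", "url",
    "website:mobile", "website:stock", "website",
}


def looks_like_url(value):
    """Very loose check: http(s) scheme, a host name, no whitespace."""
    value = value.strip()
    if any(ch.isspace() for ch in value):
        return False
    try:
        parts = urlsplit(value)
        host = parts.hostname
    except ValueError:  # e.g. malformed IPv6 literal
        return False
    return parts.scheme in ("http", "https") and bool(host)


def drop_invalid_urls(tags):
    """Return a copy of tags without URL-type keys whose value fails the check."""
    return {k: v for k, v in tags.items()
            if k not in URL_KEYS or looks_like_url(v)}
```

A stricter or more forgiving rule (for example adding a missing "https://")
may be better in practice; instead of dropping, the failing values could
also go to a QA list.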

On the other hand, validation is hard:
- https://www.mjt.me.uk/posts/falsehoods-programmers-believe-about-addresses/
- https://wiesmann.codiferes.net/wordpress/?p=15187&lang=en


*OSRM (routing) Lua transformations*
- https://github.com/Project-OSRM/osrm-backend/blob/master/profiles/bicycle.lua
- https://github.com/Project-OSRM/osrm-backend/blob/master/profiles/car.lua
- https://github.com/Project-OSRM/osrm-backend/blob/master/profiles/foot.lua

*PostgreSQL*
Sometimes the tag transformation is done *in SQL*:
https://github.com/gravitystorm/openstreetmap-carto/blob/master/project.mml
For example, oneway is not so simple:
            CASE
              WHEN oneway IN ('yes', '-1') THEN oneway
              WHEN junction IN ('roundabout')
                   AND (oneway IS NULL OR NOT oneway IN ('no', 'reversible')) THEN 'yes'
              ELSE NULL
            END AS oneway,

or the size of the telescope:

CASE
  WHEN man_made IN ('telescope') THEN
    CASE
      WHEN tags->'telescope:diameter' ~ '^-?\d{1,4}(\.\d+)?$'
        THEN (tags->'telescope:diameter')::NUMERIC
      ELSE NULL
    END
  ELSE NULL
END AS "telescope:diameter",

If we can move this type of transformation to the osm2pgsql side, the
rendering will be faster.
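
To show what would have to move, here are the two CASE expressions above
restated as plain functions. This is a Python sketch only (osm2pgsql itself
would use Lua); the function names are made up, and tags is assumed to be a
plain dict of strings.

```python
import re
from decimal import Decimal

# Same pattern as the SQL check, with ASCII digits only.
_DIAMETER = re.compile(r"-?[0-9]{1,4}(\.[0-9]+)?")


def normalize_oneway(tags):
    """Mirror of the oneway CASE: returns 'yes', '-1' or None."""
    oneway = tags.get("oneway")
    if oneway in ("yes", "-1"):
        return oneway
    # A missing oneway (None) also passes the "not in" test, as in the SQL.
    if tags.get("junction") == "roundabout" and oneway not in ("no", "reversible"):
        return "yes"
    return None


def telescope_diameter(tags):
    """Mirror of the telescope CASE: a Decimal for clean values, else None."""
    if tags.get("man_made") != "telescope":
        return None
    raw = tags.get("telescope:diameter")
    if raw is not None and _DIAMETER.fullmatch(raw):
        return Decimal(raw)
    return None
```

A roundabout without an explicit oneway tag comes out as "yes", and a
diameter such as "8.2 m" is rejected because of the unit suffix.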


We need some *Unicode cleaning rules*,
like my favorite problem with different apostrophes:
https://www.openstreetmap.org/user/ImreSamu/diary/34905
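
One possible cleaning rule, as a Python sketch: map a handful of
apostrophe-like characters to one canonical character. Both the character
list and the choice of the ASCII apostrophe as the target are assumptions
made here for illustration; a real rule set would need to be agreed per
language.

```python
# Characters that are often typed where an apostrophe is meant (assumed list):
# right/left single quotation mark, modifier letter apostrophe,
# acute accent, grave accent, prime.
APOSTROPHE_LIKE = "\u2019\u2018\u02bc\u00b4\u0060\u2032"
_APOSTROPHE_TABLE = str.maketrans({ch: "'" for ch in APOSTROPHE_LIKE})


def normalize_apostrophes(value):
    """Replace apostrophe look-alikes with the ASCII apostrophe."""
    return value.translate(_APOSTROPHE_TABLE)
```

This is useful mainly for search and matching; for display, the original
value may still be the better one.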

There are some "trap" tags for detecting bad imports (latitude, longitude,
lat, lon):
- https://taginfo.openstreetmap.org/search?q=latitude
- https://taginfo.openstreetmap.org/keys/LAT
We can easily drop these tags, or create a QA list for the local
community.
There are also some problematic keys (
https://taginfo.openstreetmap.org/reports/characters_in_keys )
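
Both options (dropping and reporting) can share one small helper, sketched
here in Python. The key list is the one from this mail; matching keys
case-insensitively is an assumption based on the LAT example, and the names
TRAP_KEYS and split_trap_tags are made up.

```python
TRAP_KEYS = {"latitude", "longitude", "lat", "lon"}


def split_trap_tags(tags):
    """Return (kept, trapped): tags without trap keys, and the trap tags."""
    kept, trapped = {}, {}
    for key, value in tags.items():
        if key.lower() in TRAP_KEYS:
            trapped[key] = value  # candidate for a QA list
        else:
            kept[key] = value
    return kept, trapped
```

The first dict is what a data consumer would use; a non-empty second dict
marks the object as a candidate for the local community's QA list.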


*Other* tag transformation problems:
-   converting to metric units ("50 mph" -> km/h; knots ...) (
https://wiki.openstreetmap.org/wiki/Key:maxspeed#Parser )
-   decoding special speed codes "FR:walk"/"DE:urban" ... (
https://wiki.openstreetmap.org/wiki/OSM_tags_for_routing/Maxspeed )
-   injecting 3rd party data (for example Wikidata labels, driving side,
country admin code)
-   validating 3rd party links (Wikipedia links, Wikidata IDs, website
URLs, Facebook links)
-   validating OSM "values" against lookup tables / regexps (example:
"building:color")
-   removing nested relations (role=subarea):
https://github.com/osmcode/osmium-tool/issues/169
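
As an illustration of the first item, a unit conversion for maxspeed could
look like this Python sketch. The regular expression is only loosely
modelled on the wiki parser linked above, not copied from it; the
conversion factors are the standard ones (1 mph = 1.609344 km/h, 1 knot =
1.852 km/h). Speed codes such as "DE:urban" are deliberately not handled
and return None.

```python
import re

_MAXSPEED = re.compile(r"([0-9]+(?:\.[0-9]+)?)(?: ?(km/h|kmh|kph|mph|knots))?")

# Factor to km/h; a value without a unit is taken as km/h.
_TO_KMH = {None: 1.0, "km/h": 1.0, "kmh": 1.0, "kph": 1.0,
           "mph": 1.609344, "knots": 1.852}


def maxspeed_kmh(value):
    """Return the speed in km/h as a float, or None if not parseable."""
    match = _MAXSPEED.fullmatch(value.strip())
    if match is None:
        return None
    number, unit = match.group(1), match.group(2)
    return float(number) * _TO_KMH[unit]
```

Values that fail to parse are better kept for a second pass (speed codes,
"walk", "none", ...) than silently dropped.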


For an *ad-hoc transformation* (osm.pbf -> osm.pbf) we can use:
- https://osmcode.org/opl-file-format/
- the OSM XML format:
   osmium cat input.osm.pbf -f xml --no-progress -o - | sed ... |
     osmium cat - --input-format xml -o output.osm.pbf
- https://osmcode.org/pyosmium/ (Python)
- https://github.com/osmcode/node-osmium (JavaScript)
- Golang libs (osm.pbf) ...
- ....
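
For small extracts, the "sed ..." step in the XML pipeline could also be a
short Python filter using only the standard library, as sketched below. The
rule table RENAMES and the function names are hypothetical, and the whole
document is loaded into memory, so this is not meant for country-sized
files.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table: (old key, old value) -> (new key, new value).
RENAMES = {("access", "public"): ("access", "yes")}


def rewrite_tags(root):
    """Rewrite matching <tag k=... v=...> elements in place; return the count."""
    changed = 0
    for tag in root.iter("tag"):
        new = RENAMES.get((tag.get("k"), tag.get("v")))
        if new is not None:
            tag.set("k", new[0])
            tag.set("v", new[1])
            changed += 1
    return changed


def filter_stream(src, dst):
    """Read OSM XML from src, rewrite tags, write OSM XML to dst (binary streams)."""
    tree = ET.parse(src)  # loads the whole document into memory
    rewrite_tags(tree.getroot())
    tree.write(dst, encoding="utf-8", xml_declaration=True)
```

Saved as a script that imports sys and ends with
filter_stream(sys.stdin.buffer, sys.stdout.buffer), it would sit between the
two osmium cat calls in place of sed.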

Regards,
 Imre


Sören Reinecke <tilmanreinecke at yahoo.de> wrote (on Fri, Dec 6, 2019,
13:33):

> There are some available options I see:
> a) Not working on this further.
> b) Using `tagtransform for OSM` to create an own transformation
> specification.
> c) Writing converters which convert from a format to the format of
> `tagtransform for OSM` and writing converters to convert from the format of
> `tagtransform for OSM` to another format programs can work with. Using my
> specification, which needs to be extended, to create compatibility among
> different formats while ensuring that my spec can be used on its own.
>
> What do you prefer, or do you have other options to add?
>
> Cheers
>
> Sören Reinecke alias Valor Naram
>
>
> -------- Original Message --------
> Subject: Re: [OSM-dev] tagtransform for OSM - A effort make tagging and
> using OSM data easier; bridging different worlds together
> From: Imre Samu
> To: Sören Reinecke
> CC: OSM-Dev Openstreetmap
>
>
> > I currently write a specification for transforming tags in OpenStreetMap
> to make life of data customers easier.
>
> imho:  we can import some good ideas from
> https://wiki.openstreetmap.org/wiki/Osmosis/TagTransform schema ..
> *"The tag transform Osmosis plugin allows arbitrary tag transforms to be
> applied to OSM data as a preprocessing step before using other tools. This
> allows other tools to concentrate on doing what ever they do, without
> having to handle numerous different tagging schemes and error corrections."*
> imho:   regexp is useful.
>
> probably the "lua" is good glue/meta language - for writing "business
> rules".
> some examples:
> Valhalla (routing)  admin.lua (
> https://github.com/valhalla/valhalla/blob/master/lua/admin.lua )
> Valhalla (routing) graph.lua (
> https://github.com/valhalla/valhalla/blob/master/lua/graph.lua )
>
> osm2pgsql - openstreetmap-carto.lua
>
> https://github.com/gravitystorm/openstreetmap-carto/blob/master/openstreetmap-carto.lua
>
> Regards,
>  Imre
>
>
>
>
>
>
>
> Sören Reinecke via dev <dev at openstreetmap.org> wrote (on Thu, Dec 5,
> 2019, 15:59):
>
>> Hey all,
>>
>> I am currently writing a specification for transforming tags in
>> OpenStreetMap to make life easier for data customers. Different tagging
>> schemes have emerged since OpenStreetMap began; some exist in parallel,
>> and newer ones have deprecated older ones. Data customers who do not know
>> the OSM community well get lost. This project aims to help developers who
>> want to take advantage of the great OpenStreetMap database, which is by the
>> way a brilliant project. This project can also help to make tagging in OSM
>> more orthogonal and more hassle-free.
>>
>> I saw conflicting interests between the OSM community, OSM developers like
>> the iD developers, and data customers. A renderer might need data in a
>> different form than the community contributes. The community might need a
>> different tagging scheme than a renderer. I thought about how we can resolve
>> this, how we can get all sides at "one table", and this is the idea I came
>> up with:
>>
>> A more readable version can be found here:
>> https://github.com/ValorNaram/transformation-table-osmtags/blob/master/README.md
>> and the principles can be found at
>> https://github.com/ValorNaram/transformation-table-osmtags/blob/master/principles.md
>>
>>
>> ------------------------------
>>
>> Example 1: They want to have the phone number of a POI. There are some
>> problems with this:
>>
>>     1. They need to know both contact:phone and phone to get them all.
>>     2. They need to support them both.
>>     3. They need to remove one in case both keys are mapped on one POI.
>> This really happens, see http://overpass-turbo.eu/s/OI2.
>>
>> Example 2: They want to know how many POI's have changing tables
>> (general: facilities for changing a nappy of a baby). There are some
>> problems with this too:
>>
>>     1. They need to know both changing_table and the deprecated diaper to
>> get them all.
>>     2. They need to support them both. Difficult because they're highly
>> different tagging schemes.
>>     3. They need to remove one in case both keys are mapped on one POI.
>> This really happens, see http://overpass-turbo.eu/s/OI5.
>>
>> Example 3: They want to develop a mapping tool and want to correct wrong
>> typed in tags. There are some problems with that:
>>
>>     1. They need to know all possible tagging schemes existing for one
>> purpose (e.g. mapping facilities for changing the nappy of a baby).
>>     2. They need to know the right/approved/more logical scheme.
>>     3. They need to know how to convert:
>>
>>      diaper=yes
>>      diaper:female=yes
>>
>> becomes after the transformation:
>>
>> changing_table=yes
>> changing_table:location=female_toilet
>>
>>
>> ------------------------------
>>
>> Basically it bridges the OSM community and developers together, it might
>> even resolve the conflict between iD developers and the community.
>>
>> The project bridges different worlds and is therefore a bridge. As a bridge,
>> this project should not just connect different worlds and ensure peace
>> between them, but also support exchange between them to develop a social
>> economy of "send and receive". This project should support the "coming
>> together" of (OSM) developers and mappers.
>>
>>
>> *I want to hear your opinions on this, and do not be shy about asking
>> questions. My answers will be part of the clarification of the project
>> "tagtransform for OSM".*
>>
>> Cheers
>>
>> Sören Reinecke alias ValorNaram
>>
>> _______________________________________________
>> dev mailing list
>> dev at openstreetmap.org
>> https://lists.openstreetmap.org/listinfo/dev
>>
>