[OSM-dev] tagtransform for OSM - An effort make tagging and using OSM data easier; bridging different worlds together
Imre Samu
pella.samu at gmail.com
Sat Dec 7 00:56:45 UTC 2019
> There are some available options I see: ....
> What do you prefer or do you have another options to add
IMHO:
This is a hard problem; we need more use cases to find the global
optimum.
On the other hand, I am interested in your proof-of-concept solutions.
We probably need a "Metadata Working Group" to collect this type of
information and store it in a "central metadata repository"(?) with a
good license.
A related discussion:
https://www.openstreetmap.org/user/SomeoneElse/diary/391484
Sharing some use cases:
*iD Editor - tag-transformation metadata*
The iD editor has a lot of metadata that we can probably analyze.
The tag-transformation metadata is very simple and has *~302*
transformation rules.
Data license: ISC
example:
{ "old": {"aerialway": "canopy"},
"replace": {"aerialway": "zip_line"}
},
{ "old": {"aeroway": "aerobridge"},
"replace": {"aeroway": "jet_bridge", "highway": "corridor"}
},
{ "old": {"access": "public"},
"replace": {"access": "yes"}
},
or a little more complex:
{ "old": {"building:type": "*"},
"replace": {"building": "$1"}
},
Link: https://github.com/openstreetmap/iD/blob/master/data/deprecated.json
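A minimal Python sketch of how rules like the ones above could be applied to a single tag dictionary. This is not iD's implementation: the function names are made up for illustration, only the "*" wildcard and the "$1" placeholder from the examples are handled, and conflicts with already existing target keys are simply overwritten (an assumption).

```python
# Sketch only: apply iD-style "old"/"replace" rules to one tag dictionary.
RULES = [
    {"old": {"aerialway": "canopy"}, "replace": {"aerialway": "zip_line"}},
    {"old": {"aeroway": "aerobridge"},
     "replace": {"aeroway": "jet_bridge", "highway": "corridor"}},
    {"old": {"access": "public"}, "replace": {"access": "yes"}},
    {"old": {"building:type": "*"}, "replace": {"building": "$1"}},
]


def apply_rule(tags, rule):
    """Return a new dict if the rule matches, otherwise the input unchanged."""
    captured = None
    for key, value in rule["old"].items():
        if key not in tags:
            return tags
        if value == "*":
            captured = tags[key]  # remember the wildcard value for "$1"
        elif tags[key] != value:
            return tags
    # Drop the old keys, then add the replacement tags.
    result = {k: v for k, v in tags.items() if k not in rule["old"]}
    for key, value in rule["replace"].items():
        result[key] = captured if value == "$1" else value
    return result


def transform(tags, rules=RULES):
    """Apply every rule in order (hypothetical helper)."""
    for rule in rules:
        tags = apply_rule(tags, rule)
    return tags
```

For example, transform({"building:type": "house"}) gives {"building": "house"}, and tags that match no rule pass through untouched.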
Deprecated tags on the OSM wiki:
https://wiki.openstreetmap.org/wiki/Deprecated_features
*iD Editor - Discarded tags*
https://github.com/openstreetmap/iD/blob/master/data/discarded.json
(~46 tags)
See also other discardable tags (JOSM, Potlatch, ...):
https://wiki.openstreetmap.org/wiki/Discardable_tags
*Imposm3 has special tag transformation types*
Tag exclusion: exclude: [created_by, source, "tiger:*"]
It also has some special "data mapping"[1] rules:
- bool / direction / enumerate / categorize / ...
https://imposm.org/docs/imposm3/latest/mapping.html#column-types
[1] https://en.wikipedia.org/wiki/Data_mapping
I would also like to add "COALESCE()" transformations to the
transformation rules.
example:
"name" = coalesce("name","name:en","name:de","name:fr")
*The giggls/mapnik-german-l10n repo has street name abbreviation rules*
https://github.com/giggls/mapnik-german-l10n/blob/master/plpgsql/street_abbrv.sql
example:
abbrev=regexp_replace(abbrev,'(?<!^([0-9]+([èe]?r)?e )?)Avenue\M','Ave.');
abbrev=regexp_replace(abbrev,'(?!^)Boulevard\M','Blvd.');
abbrev=regexp_replace(abbrev,'Crescent\M','Cres.');
abbrev=regexp_replace(abbrev,'Court\M','Ct');
abbrev=regexp_replace(abbrev,'Drive\M','Dr.');
abbrev=regexp_replace(abbrev,'Lane\M','Ln.');
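The same kind of rule could be kept outside the database. Below is a rough Python port of the simpler rules above, as a sketch only: PostgreSQL's \M (end of word) is approximated with \b, the "Avenue" rule is left out because Python's re module does not support its variable-length lookbehind, and count=1 mirrors regexp_replace without the 'g' flag.

```python
import re

# (pattern, replacement) pairs ported from the SQL rules above.
ABBREVIATIONS = [
    (r"(?!^)Boulevard\b", "Blvd."),  # not at the very start of the name
    (r"Crescent\b", "Cres."),
    (r"Court\b", "Ct"),
    (r"Drive\b", "Dr."),
    (r"Lane\b", "Ln."),
]


def abbreviate(name):
    """Apply each rule once, in order (hypothetical helper name)."""
    for pattern, replacement in ABBREVIATIONS:
        name = re.sub(pattern, replacement, name, count=1)
    return name
```

With these rules "Sunset Boulevard" becomes "Sunset Blvd.", while "Boulevard Haussmann" is left alone because the word is at the start of the name.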
*QA Tools*
We can also collaborate with QA tools to validate "values":
https://github.com/osm-fr/osmose-backend/blob/master/plugins/Website.py
For example, URL-type values: if a value is not a valid RFC URL, we can
safely drop it. Affected keys:
*"contact:webcam" "contact:website" "facebook" "url" "website:mobile"
"website:stock" "website"*
Other QA tools: https://wiki.openstreetmap.org/wiki/Quality_assurance
On the other hand, validation is hard:
https://www.mjt.me.uk/posts/falsehoods-programmers-believe-about-addresses/
https://wiesmann.codiferes.net/wordpress/?p=15187&lang=en
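For the URL-type keys listed above, a deliberately loose Python sketch of the "drop if not a valid URL" idea. It is not the Osmose check and not a full RFC validation: it only requires an http/https scheme and a host name, which is my assumption of a reasonable minimum. The function names are hypothetical.

```python
from urllib.parse import urlsplit

URL_KEYS = {"contact:webcam", "contact:website", "facebook", "url",
            "website:mobile", "website:stock", "website"}


def looks_like_url(value):
    """True if the value has an http(s) scheme and a host name."""
    try:
        parts = urlsplit(value.strip())
        return parts.scheme in ("http", "https") and bool(parts.hostname)
    except ValueError:  # raised for some malformed values, e.g. bad IPv6 hosts
        return False


def drop_invalid_urls(tags):
    """Keep every tag except URL-type keys whose value fails the check."""
    return {k: v for k, v in tags.items()
            if k not in URL_KEYS or looks_like_url(v)}
```

Values without a scheme ("www.example.com") are dropped by this sketch too; a real rule set might prefer to repair them or put them on a QA list instead.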
*OSRM (routing) Lua transformations*
- https://github.com/Project-OSRM/osrm-backend/blob/master/profiles/bicycle.lua
- https://github.com/Project-OSRM/osrm-backend/blob/master/profiles/car.lua
- https://github.com/Project-OSRM/osrm-backend/blob/master/profiles/foot.lua
*PostgreSQL*
Sometimes the tag transformation is *in SQL*
https://github.com/gravitystorm/openstreetmap-carto/blob/master/project.mml
For example, oneway is not so simple:
CASE
  WHEN oneway IN ('yes', '-1') THEN oneway
  WHEN junction IN ('roundabout')
    AND (oneway IS NULL OR NOT oneway IN ('no', 'reversible')) THEN 'yes'
  ELSE NULL
END AS oneway,
or the size of a telescope:
CASE
  WHEN man_made IN ('telescope') THEN
    CASE
      WHEN tags->'telescope:diameter' ~ '^-?\d{1,4}(\.\d+)?$' THEN
        (tags->'telescope:diameter')::NUMERIC
      ELSE NULL
    END
  ELSE NULL
END AS "telescope:diameter",
If we can move this type of transformation to the osm2pgsql side, the
rendering will be faster.
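To show what moving these two CASE expressions out of SQL could look like, here is a plain Python sketch of the same logic as import-time functions. It is independent of osm2pgsql (which uses Lua for this); the function names are hypothetical and only the two rules above are covered.

```python
import re
from decimal import Decimal

# Same pattern as the SQL regex: optional sign, 1-4 digits, optional fraction.
NUMERIC = re.compile(r"-?\d{1,4}(\.\d+)?", re.ASCII)


def oneway(tags):
    """Mirror of the oneway CASE expression."""
    value = tags.get("oneway")
    if value in ("yes", "-1"):
        return value
    # Roundabouts are one-way unless explicitly tagged otherwise.
    if tags.get("junction") == "roundabout" and value not in ("no", "reversible"):
        return "yes"
    return None


def telescope_diameter(tags):
    """Mirror of the telescope:diameter CASE expression."""
    if tags.get("man_made") != "telescope":
        return None
    value = tags.get("telescope:diameter")
    if value is not None and NUMERIC.fullmatch(value):
        return Decimal(value)
    return None
```

A value such as "3.5 m" is rejected by the pattern, just as in the SQL version, so only clean numeric values reach the output column.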
We need some *Unicode cleaning rules*,
for example for my favorite problem, the different apostrophes:
https://www.openstreetmap.org/user/ImreSamu/diary/34905
There are some "trap" tags that help detect bad imports (latitude,
longitude, lat, lon):
- https://taginfo.openstreetmap.org/search?q=latitude
- https://taginfo.openstreetmap.org/keys/LAT
We can easily drop these tags, or create a QA list for the local
community.
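A short Python sketch of the "drop or report" idea for these trap tags. The helper name is hypothetical, and matching keys case-insensitively (so that LAT is caught as well) is my assumption.

```python
TRAP_KEYS = {"latitude", "longitude", "lat", "lon"}


def split_trap_tags(tags):
    """Return (kept, flagged): flagged tags can be dropped or sent to a QA list."""
    kept, flagged = {}, {}
    for key, value in tags.items():
        target = flagged if key.lower() in TRAP_KEYS else kept
        target[key] = value
    return kept, flagged
```

The caller decides what to do with the flagged part: ignore it in the output, or collect it per object for the local community.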
There are also some problematic keys:
https://taginfo.openstreetmap.org/reports/characters_in_keys
*Other* tag transformation problems:
- converting to metric units ("50 mph" -> km/h; knots ...)
  ( https://wiki.openstreetmap.org/wiki/Key:maxspeed#Parser )
- decoding special speed codes "FR:walk"/"DE:urban" ...
  ( https://wiki.openstreetmap.org/wiki/OSM_tags_for_routing/Maxspeed )
- injecting 3rd-party data (for example Wikidata labels, driving side,
  country admin code)
- validating 3rd-party links (Wikipedia links, Wikidata IDs, website
  URLs, Facebook links)
- validating OSM "values" against lookup tables / regexps (example:
  "building:color")
- removing nested relations (role=subarea):
  https://github.com/osmcode/osmium-tool/issues/169
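For the first item in the list above, a minimal Python sketch of a maxspeed unit conversion. It is not the parser from the wiki page: it assumes that a bare number means km/h, handles only the mph and knots units mentioned above, and returns None for everything else (including codes such as "FR:walk", which need a separate lookup table).

```python
import re

# Conversion factors to km/h; "" means no unit was given.
KMH_PER_UNIT = {"": 1.0, "km/h": 1.0, "mph": 1.609344, "knots": 1.852}

MAXSPEED = re.compile(r"(\d+(?:\.\d+)?)\s*([a-z/]*)", re.ASCII)


def maxspeed_kmh(value):
    """Return the speed in km/h as a float, or None if it cannot be parsed."""
    match = MAXSPEED.fullmatch(value.strip().lower())
    if not match:
        return None
    number, unit = match.groups()
    factor = KMH_PER_UNIT.get(unit)
    if factor is None:  # unknown unit
        return None
    return float(number) * factor
```

Returning None instead of guessing keeps the unparsed values available for a QA list.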
For an *ad-hoc transformation* (osm.pbf -> osm.pbf) we can use:
- https://osmcode.org/opl-file-format/
- OSM XML format:
    osmium cat input.osm.pbf -f xml --no-progress -o - | sed ... |
      osmium cat - --input-format xml -o output.osm.pbf
- https://osmcode.org/pyosmium/ (Python)
- https://github.com/osmcode/node-osmium (JavaScript)
- (osm.pbf) Golang libs ...
- ....
Regards,
Imre
Sören Reinecke <tilmanreinecke at yahoo.de> wrote (on Fri, Dec 6, 2019,
13:33):
> There are some available options I see:
> a) Not working on this further.
> b) Using `tagtransform for OSM` to create an own transformation
> specification.
> c) Writing converters which convert from a format to the format of
> `tagtransform for OSM` and writing converters to convert from the format of
> `tagtransform for OSM` to another format programs can work with. Using my
> specification which needs to be extended to create compatibility among
> different formats while ensuring that my spec can be used on its own.
>
> What do you prefer or do you have another options to add?
>
> Cheers
>
> Sören Reinecke alias Valor Naram
>
>
> -------- Original Message --------
> Subject: Re: [OSM-dev] tagtransform for OSM - A effort make tagging and
> using OSM data easier; bridging different worlds together
> From: Imre Samu
> To: Sören Reinecke
> CC: OSM-Dev Openstreetmap
>
>
> > I currently write a specification for transforming tags in OpenStreetMap
> to make life of data customers easier.
>
> imho: we can import some good ideas from
> https://wiki.openstreetmap.org/wiki/Osmosis/TagTransform schema ..
> *"The tag transform Osmosis plugin allows arbitrary tag transforms to be
> applied to OSM data as a preprocessing step before using other tools. This
> allows other tools to concentrate on doing what ever they do, without
> having to handle numerous different tagging schemes and error corrections."*
> imho: regexp is useful.
>
> Lua is probably a good glue/meta language for writing "business
> rules".
> some examples:
> Valhalla (routing) admin.lua (
> https://github.com/valhalla/valhalla/blob/master/lua/admin.lua )
> Valhalla (routing) graph.lua (
> https://github.com/valhalla/valhalla/blob/master/lua/graph.lua )
>
> osm2pgsql - openstreetmap-carto.lua
>
> https://github.com/gravitystorm/openstreetmap-carto/blob/master/openstreetmap-carto.lua
>
> Regards,
> Imre
>
> Sören Reinecke via dev <dev at openstreetmap.org> wrote (on Thu, Dec 5,
> 2019, 15:59):
>
>> Hey all,
>>
>> I am currently writing a specification for transforming tags in
>> OpenStreetMap to make life easier for data customers. Different tagging
>> schemes have emerged over OpenStreetMap's lifetime; some exist in
>> parallel, and some newer ones have deprecated older ones. Data customers
>> who do not know the OSM community well get lost. This project aims to
>> help developers who want to take advantage of the great OpenStreetMap
>> database, which is, by the way, a brilliant project. It can also help
>> make tagging in OSM more orthogonal and more hassle-free.
>>
>> I saw conflicting interests between the OSM community, OSM developers
>> such as the iD developers, and data customers. A renderer might need
>> data in a different form than the community contributes it, and the
>> community might need a different tagging scheme than a renderer. I
>> thought about how we can resolve this and get all sides to "one table",
>> and this is the idea I came up with:
>>
>> A more readable version can be found here:
>> https://github.com/ValorNaram/transformation-table-osmtags/blob/master/README.md
>> and the principles can be found at
>> https://github.com/ValorNaram/transformation-table-osmtags/blob/master/principles.md
>>
>>
>> ------------------------------
>>
>> Example 1: They want to have the phone number of a POI. There are some
>> problems with this:
>>
>> 1. They need to know both contact:phone and phone to get them all.
>> 2. They need to support them both.
>> 3. They need to remove one in case both keys are mapped on one POI.
>> This really happens, see http://overpass-turbo.eu/s/OI2.
>>
>> Example 2: They want to know how many POIs have changing tables (in
>> general: facilities for changing a baby's nappy). There are some
>> problems with this too:
>>
>> 1. They need to know both changing_table and the deprecated diaper to
>> get them all.
>> 2. They need to support them both. Difficult because they're highly
>> different tagging schemes.
>> 3. They need to remove one in case both keys are mapped on one POI.
>> This really happens, see http://overpass-turbo.eu/s/OI5.
>>
>> Example 3: They want to develop a mapping tool and want to correct
>> mistyped tags. There are some problems with that:
>>
>> 1. They need to know all possible tagging schemes existing for one
>> purpose (e.g. mapping facilities for changing the nappy of a baby).
>> 2. They need to know the right/approved/more logical scheme.
>> 3. They need to know how to convert:
>>
>> diaper=yes
>> diaper:female=yes
>>
>> becomes after the transformation:
>>
>> changing_table=yes
>> changing_table:location=female_toilet
>>
>>
>> ------------------------------
>>
>> Basically, it bridges the OSM community and developers; it might even
>> resolve the conflict between the iD developers and the community.
>>
>> The project is a bridge between different worlds. As a bridge, it should
>> not only connect those worlds and keep the peace between them, but also
>> support exchange between them to develop a social economy of "send and
>> receive". This project should support the "coming together" of (OSM)
>> developers and mappers.
>>
>>
>> *I want to hear your opinions on this, and please do not be shy about
>> asking questions. My answers will be part of the clarification of the
>> project "tagtransform for OSM".*
>>
>> Cheers
>>
>> Sören Reinecke alias ValorNaram
>>
>> _______________________________________________
>> dev mailing list
>> dev at openstreetmap.org
>> https://lists.openstreetmap.org/listinfo/dev
>>
>