[Talk-se] Place name import from Lantmäteriet's GSD-Terrängkartan

Grigory Rechistov ggg_mail at inbox.ru
Sun Jan 19 14:44:36 UTC 2020


Hi,
I fixed a bug in the script. It turned out that the bounding boxes of old nodes were too large (latitude and longitude were swapped in one place), which led to a larger number of false-positive matches. In other words, two identically named nodes were sometimes judged to be duplicates even though they were far apart. After the fix, more new nodes "survive" the conflation.
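
To illustrate why the box size matters, here is a minimal sketch of how a search box around a node might be computed. It is not the code from the import script: the function names, the spherical-Earth approximation and the radius are assumptions made for the example.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (spherical approximation)

def bounding_box(lat, lon, radius_m):
    """Return (min_lat, min_lon, max_lat, max_lon) around a point.

    Hypothetical helper, not taken from the actual conflation script.
    """
    # One degree of latitude has nearly the same length everywhere.
    dlat = math.degrees(radius_m / EARTH_RADIUS_M)
    # One degree of longitude shrinks with cos(latitude). Passing the
    # longitude here instead of the latitude would distort the box.
    dlon = math.degrees(radius_m / (EARTH_RADIUS_M * math.cos(math.radians(lat))))
    return (lat - dlat, lon - dlon, lat + dlat, lon + dlon)

def in_box(box, lat, lon):
    """True if the point lies inside the box (inclusive)."""
    min_lat, min_lon, max_lat, max_lon = box
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon

box = bounding_box(59.0, 18.0, 1000)
```

With a box like this, two identically named nodes are only candidates for a duplicate if one falls inside the other's box, so an oversized box directly inflates the number of false matches.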
 
Another new change is that old (multi)polygons with names (name=*, landuse=farmyard or landuse=residential etc.) are now matched against new nodes. If their coordinates and names are similar, the new node is marked with an "import:note" tag. One can then filter such new nodes and, for example, delete them if one believes that only the named polygons should exist and not the nodes.
 
New files v13: https://drive.google.com/open?id=1pZhZhKhS_7JDqxal9QSjDTj1-YIM2LxW
Look at the tags to see which new nodes matched which old polygons; for example, a description in one tag reads:
<tag k="import:note" v="Cross-check against way 438134513, similar names"/>
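
For anyone who wants to list the flagged nodes outside of an editor, a small sketch using only the Python standard library could look like this. The sample XML is made up for the example (only the tag value is copied from above), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

def nodes_with_import_note(osm_xml):
    """Yield (node_id, note) for every node carrying an import:note tag."""
    root = ET.fromstring(osm_xml)
    for node in root.iter("node"):
        for tag in node.findall("tag"):
            if tag.get("k") == "import:note":
                yield node.get("id"), tag.get("v")

# Made-up sample data in OSM XML form; ids and coordinates are illustrative.
SAMPLE = """<osm version="0.6">
  <node id="-1" lat="59.0" lon="18.0">
    <tag k="name" v="Example"/>
    <tag k="import:note" v="Cross-check against way 438134513, similar names"/>
  </node>
  <node id="-2" lat="59.1" lon="18.1">
    <tag k="name" v="Other"/>
  </node>
</osm>"""

flagged = list(nodes_with_import_note(SAMPLE))
```

The resulting list can then be reviewed by hand or used to drop the flagged nodes before upload.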
 
I have described these changes and other debated cases in the import plan; here is the excerpt:
Technical and diagnostic tags
In addition to the tags derived from the source dataset, auxiliary tags are added to all or some new nodes.
The following tags are added.
*  import=yes
*  source="GSD-Terrängkartan"
*  "lantmateriet:kkod" to store the original KKOD value.
*  fixme=<description> for nodes with likely incorrect names, such as those ending with a dash or starting with a lowercase letter.
*  note=<description> for nodes whose names were reconstructed.
*  short_name to keep the original abbreviated name
*  import:note = <description> for nodes having names similar to old (multi)polygons.
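
As a rough illustration of the list above, the tags for one new node could be assembled as follows. This is a sketch under assumptions: the function name, the fixme wording and the KKOD value passed in the tests are invented, and only the two name checks mentioned in the list are covered.

```python
def diagnostic_tags(name, kkod):
    """Build the tag dictionary for a new node (hypothetical helper)."""
    tags = {
        "name": name,
        "import": "yes",
        "source": "GSD-Terrängkartan",
        "lantmateriet:kkod": str(kkod),
    }
    problems = []
    if name.endswith("-"):
        problems.append("name ends with a dash")
    if name[:1].islower():
        problems.append("name starts with a lowercase letter")
    if problems:
        # The exact fixme text is an assumption, not the script's wording.
        tags["fixme"] = "; ".join(problems)
    return tags
```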
...
Node having same alternative name as existing node
For example, adding a node with name="Gullåkra by" near an old node with name="Gullåkra".
Probability: low. There should not be many variations of names. Existing conflation script checks for alternative names.
Impact: low. A human will easily be able to recognize the error and dismiss it.
Effort to discover: medium. Map has to be visually scanned for suspicious node pairs.
Effort to fix: low. Delete one node, add "alt_name" to the other. If needed, the conflation script can deal with it by utilizing more advanced fuzzy name comparison.
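
A fuzzier name comparison of the kind mentioned above could be sketched with Python's difflib. This is a stand-in technique rather than what the conflation script currently does, and the 0.8 threshold is an arbitrary assumption that would need tuning against real data.

```python
from difflib import SequenceMatcher

def names_similar(a, b, threshold=0.8):
    """Return True if two place names are equal or nearly equal.

    Hypothetical helper; the threshold is an assumed value.
    """
    a, b = a.casefold().strip(), b.casefold().strip()
    if a == b:
        return True
    return SequenceMatcher(None, a, b).ratio() >= threshold

match = names_similar("Gullåkra by", "Gullåkra")
```

With this threshold the "Gullåkra by" / "Gullåkra" pair counts as a match, while unrelated names fall well below it.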
 
Node having same name as existing closed way
Tag "name=*" can be placed not only on nodes, but also on (multi)polygons encircling settlements, such as landuse=residential, landuse=farmyard etc.
Probability: high. There are regions with hundreds of such (multi)polygons.
Impact: low to medium (currently being debated). Some mappers customarily map a settlement both with a name on its closed way and with a separate "place=*" node inside its border. One reason is that a node can be placed at a "logical", "economic" or political center, such as the main square or the train station. By contrast, the geometric center of a (multi)polygon is hard to control, and it may land somewhere completely unrepresentative of the settlement.
Effort to discover: low. It is automated (since b4973ffe) to treat closed named ways as pseudo-nodes, apply the same conflation strategy and mark matches with import:note = *
Effort to fix: low. If needed, the conflation script can be adjusted to address it.
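
The pseudo-node idea can be sketched as follows. The actual script may derive the representative point differently; this example simply averages the vertices of the closed way, which is not an area-weighted centroid, and both function names are hypothetical.

```python
def pseudo_node(way_id, name, coords):
    """Collapse a closed named way into a node-like record.

    coords is a list of (lat, lon) pairs; for a closed way the first and
    last points coincide, so the repeated point is dropped before averaging.
    """
    ring = coords[:-1] if len(coords) > 1 and coords[0] == coords[-1] else coords
    lat = sum(p[0] for p in ring) / len(ring)
    lon = sum(p[1] for p in ring) / len(ring)
    return {"way_id": way_id, "name": name, "lat": lat, "lon": lon}

def import_note(way_id):
    """Format the note in the same form as the tag value quoted in this message."""
    return f"Cross-check against way {way_id}, similar names"

square = [(0.0, 0.0), (0.0, 2.0), (2.0, 2.0), (2.0, 0.0), (0.0, 0.0)]
pn = pseudo_node(438134513, "Example", square)
```

Once a way is reduced to such a record, it can go through the same position and name checks as an ordinary old node, and a match yields the import:note value on the new node.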

----------------------------------------
I will answer questions and comments in the mail thread later. Thanks!
 
 
With best regards,
Grigory Rechistov
 