[Imports] New module to merge-sort imports over time (osmfetch python)

Bryce Nesbitt bryce2 at obviously.com
Thu Aug 25 20:50:52 UTC 2011


I'm writing to get feedback and peer review on a new Python module to 
manage imports.  And yes, I am aware of the various concerns and issues 
with imports, and the strong feelings they create.

The goal of this module is to help make imports more robust and correct 
over the long haul.   As a side effect the module makes imports dead 
easy to code.

This came from a work-related project to mirror car sharing locations 
into osm.  The exact choice of tags and methods is subject to review and 
I do seek your feedback.  I wrote it a while back but have dusted it off 
recently.

------------------------------------------------------------------------
The module performs a mirroring operation.  The external dataset must be 
a high quality and authoritative source, with proper licensing.

On the first run the import is conventional, except for some added tags:
     source=osmfetch:xxx
     source:pkey=yyy
     source:licence=llll
     source:website=http://zzz
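
As a rough sketch of that tagging step (not the actual osmfetch code: the function name and the example values are made up for illustration), the four provenance tags could be added like this:

```python
def add_provenance_tags(tags, dataset, pkey, licence, website):
    """Return a copy of `tags` with the four provenance tags added.

    Hypothetical helper: osmfetch itself may do this differently.
    """
    tagged = dict(tags)
    tagged['source'] = 'osmfetch:' + dataset
    tagged['source:pkey'] = str(pkey)
    tagged['source:licence'] = licence
    tagged['source:website'] = website
    return tagged

# Placeholder values, for illustration only
node_tags = add_provenance_tags(
    {'amenity': 'car_sharing'},
    dataset='example',
    pkey=102,
    licence='example-licence',
    website='http://example.org/',
)
```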

On subsequent runs the module performs a merge-sort.  Certain keys are 
considered master in the source.  For car sharing 
<http://wiki.openstreetmap.org/wiki/Tag:amenity%3Dcar_sharing> that might be the 
count of cars at the location, the phone number, and of course the 
primary key:
     contact:phone=510-555-1212
     number=3
     vehicles=Toyota Prius,Zastava Yugo,BatMobile
     operator=Hippie Car Cooperative
     source:pkey=102

Other keys are left alone.  Osm mappers are welcome to adjust the 
coordinates for example (moving nodes more than 100 meters triggers a 
warning to the car sharing operator, but the osm coordinates are not 
touched).
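
A minimal sketch of the two rules above, under some assumptions: the helper names are hypothetical, a master key missing from the source is simply left alone, and the 100 meter check uses a plain haversine distance on a spherical earth.  The real module may differ on all three points.

```python
import math

def merge_tags(osm_tags, source_tags, master_keys):
    """Copy only the master keys from the source; leave all other osm tags alone.

    Returns (merged_tags, changed).  Assumption: a master key that is
    absent from the source is not removed from the osm node.
    """
    merged = dict(osm_tags)
    changed = False
    for key in master_keys:
        if key in source_tags and merged.get(key) != source_tags[key]:
            merged[key] = source_tags[key]
            changed = True
    return merged, changed

def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters (haversine, spherical earth)."""
    radius = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))

def moved_too_far(osm_node, source_node, limit_m=100.0):
    """True if the osm node sits more than limit_m from the source position.

    Only a warning flag: the osm coordinates are never rewritten here.
    """
    return distance_m(float(osm_node['lat']), float(osm_node['lon']),
                      float(source_node['lat']), float(source_node['lon'])) > limit_m

# Example: the phone number is master in the source, the name is not
merged, changed = merge_tags(
    {'contact:phone': '510-555-0000', 'number': '3', 'name': 'Corner lot'},
    {'contact:phone': '510-555-1212', 'number': '3'},
    ['contact:phone', 'number'],
)
```

Here `merged` takes the phone number from the source while the mapper-supplied `name` survives untouched.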

Destroying the source:pkey or source=osmfetch tag disconnects the 
conflation process and could result in duplicates.  No record of the 
originally imported osmid is kept, and the entire process is stateless.
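
To illustrate why the process can stay stateless, here is a hedged sketch of the matching step.  It assumes the osm side is fetched fresh on every run (already filtered to this import's source tag) and that nodes are plain dicts with a 'tag' dict, as in the control file further down; `plan_actions` is a hypothetical name, not part of osmfetch.

```python
def plan_actions(source_nodes, osm_nodes):
    """Match source records to osm nodes purely through the source:pkey tag.

    source_nodes: {pkey: node}      osm_nodes: {osmid: node}
    No state from earlier runs is needed, only the two current snapshots.
    """
    by_pkey = {}
    for osmid, node in osm_nodes.items():
        pkey = node['tag'].get('source:pkey')
        if pkey is not None:
            by_pkey[pkey] = osmid

    actions = {'add': [], 'match': [], 'delete': []}
    for pkey in sorted(source_nodes):
        if pkey in by_pkey:
            actions['match'].append((pkey, by_pkey[pkey]))  # candidates for modify
        else:
            actions['add'].append(pkey)                     # missing from osm
    for pkey, osmid in sorted(by_pkey.items()):
        if pkey not in source_nodes:
            actions['delete'].append(osmid)                 # gone from the source
    return actions

source_nodes = {'101': {'tag': {}}, '102': {'tag': {}}}
osm_nodes = {
    5001: {'tag': {'source:pkey': '102'}},
    5002: {'tag': {'source:pkey': '999'}},
}
plan = plan_actions(source_nodes, osm_nodes)

# If a mapper strips source:pkey from node 5001, record 102 can no longer
# be matched and is planned as a fresh 'add', which is the duplicate risk.
osm_nodes_stripped = {5001: {'tag': {}}, 5002: {'tag': {'source:pkey': '999'}}}
plan_stripped = plan_actions(source_nodes, osm_nodes_stripped)
```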


------------------------------------------------------------------------
The import script is meant to be run on a cron job, alerting a human 
when changes are ready to evaluate.  The output format is presently JOSM 
compatible XML, ready for human review prior to merging.  The JOSM file 
lists nodes to add, delete and modify.
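
For readers who have not seen a JOSM change file, the sketch below shows roughly what such output could look like.  It is an illustration under assumptions, not the module's real writer: the function name and generator string are made up, new nodes get negative placeholder ids, and existing nodes carry their id and version with an action attribute.

```python
from xml.etree import ElementTree

def build_josm_xml(changes):
    """Build a JOSM-style .osm document from (action, node) pairs.

    action is 'add', 'modify' or 'delete'.  Hypothetical helper.
    """
    root = ElementTree.Element('osm', version='0.6', generator='osmfetch-sketch')
    next_new_id = -1
    for action, node in changes:
        attrs = {'lat': str(node['lat']), 'lon': str(node['lon'])}
        if action == 'add':
            # New objects get negative placeholder ids
            attrs['id'] = str(next_new_id)
            next_new_id -= 1
            attrs['action'] = 'modify'
        else:
            attrs['id'] = str(node['id'])
            attrs['version'] = str(node['version'])
            attrs['action'] = action
        element = ElementTree.SubElement(root, 'node', attrs)
        for key in sorted(node['tag']):
            ElementTree.SubElement(element, 'tag', k=key, v=node['tag'][key])
    return ElementTree.tostring(root).decode('ascii')

changes = [
    ('add',    {'lat': '37.80', 'lon': '-122.27',
                'tag': {'amenity': 'car_sharing', 'source:pkey': '101'}}),
    ('modify', {'id': 5001, 'version': 2, 'lat': '37.81', 'lon': '-122.26',
                'tag': {'amenity': 'car_sharing', 'source:pkey': '102'}}),
    ('delete', {'id': 5002, 'version': 1, 'lat': '37.82', 'lon': '-122.25',
                'tag': {'amenity': 'car_sharing', 'source:pkey': '999'}}),
]
josm_xml = build_josm_xml(changes)
```

The resulting text can be saved with an .osm extension and opened in JOSM for review before anything is uploaded.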

This style of import is a perfect fit for the Car Sharing application.  
I thought originally there would be many similar sets, but in the end 
suitable sources seem few and far between.


------------------------------------------------------------------------
You can see some mirrored nodes here:
http://taginfo.openstreetmap.org/search?q=osmfetch%3Accs#values
Prior to the import, hand mappers had covered only 3 of the 150 locations.

And what follows is an example control file.  This one has never been 
run live, as it duplicates an existing import.  If run live it would 
effectively update or freshen the import bringing in any missing nodes 
and deleting obsolete ones.

Thus an external dataset (be it corporate or community) can be reflected 
without error or bit-rot in OpenStreetMap.

What do you think of it?  Is the python clear enough for general use?


------------------------------------------------------------------------
#!/usr/bin/python
##
##  Author: Bryce Nesbitt, June 2011
##  Licence: Public Domain, no rights reserved
##
##  DEMONSTRATION osmfetch module to import NOAA NEXRAD radar stations
##
##  See also:
## http://wiki.openstreetmap.org/wiki/Potential_Datasources#Next_Generation_Radar_.28NEXRAD.29_Locations
## http://wiki.openstreetmap.org/wiki/Man_made
##
##  Future work:
##
from osmfetch import osmfetch

import sys, re, urllib, urllib2
import zipfile

from   pprint     import pprint
from   xml.etree  import ElementTree

class osmfetch_noaa_nexrad(osmfetch):

     #  Sample noaa data:
     # <wsr>
     # <name>KABR</name>
     # <description><![CDATA[SITE: KABR<BR>LOCATION: ABERDEEN<BR>...]]></description>
     # <Point>
     # <coordinates>-98.413,45.45600000000001,0</coordinates>
     # </Point>
     # </wsr>
     def fetch_source(self, sourcedata):

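         sourcenodes = {}
         # sourcedata is assumed to be the downloaded .kmz (a path or
         # file-like object); a .kmz is a zip archive holding doc.kml
         plaintext_stream = zipfile.ZipFile(sourcedata,'r')
         tree    = ElementTree.parse(plaintext_stream.open('doc.kml'))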

         kml = '{http://earth.google.com/kml/2.0}'   # KML 2.0 namespace prefix
         for site in tree.iter(kml + 'wsr'):
             pkey            = site.find(kml + 'name').text.strip()
             description     = site.find(kml + 'description').text.strip()
             point           = site.find(kml + 'Point')
             # KML stores coordinates as longitude,latitude,elevation
             lon,lat,ele     = point.find(kml + 'coordinates').text.split(',')

             node            = {}
             node['tag']     = {}
             node['id']      = pkey
             node['lat']     = lat
             node['lon']     = lon
            #node['tag']['ele']               = (sites do not have reliable elevation)
             node['tag']['source:pkey']       = pkey
             node['tag']['man_made']          = 'beacon'
             node['tag']['radar_transponder'] = 'NEXRAD'
             node['tag']['note']              = description
            #node['tag']['operator']          = 'NOAA'
            #node['tag']['website']  = "http://www.ncdc.noaa.gov/nexradinv/chooseday.jsp?id="+pkey
             sourcenodes[pkey] = node

         source_is_master_for=['operator','website','description']
         return(sourcenodes, source_is_master_for)

     #   Map source primary keys to osm primary keys
     #   Override only if your primary key is not stored in tag 'source:pkey'
     def map_primary_keys(self, sourcenodes, osmnodes):
         mapping = {}
         if osmnodes:
             for osmid,node in osmnodes.items():
                 pkey = node['tag'].get('name')
                 mapping[pkey]=osmid
         return mapping

     #   Record the action (e.g. modify, add or delete)
     #   Use your override to add, munge, or delete tags
     #   If the action is 'none' but you want to change tags, alter the
     #   action to read 'modify'
     def record_action(self, osmnode, action):
         osmnode['tag']['source']         = 'osmfetch:noaa:nexrad'
         osmnode['tag']['source:website'] = 'http://www.ncdc.noaa.gov/oa/radar/nexrad.kmz'
         return(osmfetch.record_action(self, osmnode, action))

#############################################################################################
myfetch = osmfetch_noaa_nexrad()
myfetch.run(description='NOAA NEXRAD Weather Radar Import',
             ext_url='http://www.ncdc.noaa.gov/oa/radar/nexrad.kmz',
             osm_url=osmfetch.xapi_url + urllib.quote('node[radar_transponder=NEXRAD]')
             )
myfetch.output()
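
To check the coordinate handling in fetch_source() without the osmfetch module or a network download, a standalone sketch like the following can be run against the sample record from the comments.  The wrapping kml element and the `parse_sites` name are assumptions for illustration; the point is that KML lists longitude first, then latitude, then elevation.

```python
from xml.etree import ElementTree

KML_NS = '{http://earth.google.com/kml/2.0}'

# Sample record from the comments above, wrapped in an assumed root element
SAMPLE = '''<kml xmlns="http://earth.google.com/kml/2.0">
<wsr>
<name>KABR</name>
<description><![CDATA[SITE: KABR<BR>LOCATION: ABERDEEN<BR>...]]></description>
<Point>
<coordinates>-98.413,45.45600000000001,0</coordinates>
</Point>
</wsr>
</kml>'''

def parse_sites(kml_text):
    """Return {pkey: {'id', 'lat', 'lon'}} for every wsr element."""
    root = ElementTree.fromstring(kml_text)
    sites = {}
    for site in root.iter(KML_NS + 'wsr'):
        pkey = site.find(KML_NS + 'name').text.strip()
        coords = site.find(KML_NS + 'Point').find(KML_NS + 'coordinates').text
        # KML order is longitude, latitude, elevation
        lon, lat, ele = coords.strip().split(',')
        sites[pkey] = {'id': pkey, 'lat': lat, 'lon': lon}
    return sites

sites = parse_sites(SAMPLE)
```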

