[Talk-ca] geobase2osm.py work / questions
William Lachance
wrlach at gmail.com
Fri Jun 26 05:13:37 BST 2009
Hi all,
I've been needing to use geobase2osm in a project of mine (not
directly related to the geobase import), and noticed (as other people
have) that it's generating a lot of duplicated nodes. Aside from being
unclean and wasteful, this makes the generated osm file non-routable.
Well, today I decided to do something about it. Looking at the source,
I noticed quite a bit of complicated code to "merge" identical OSM
nodes at junctions (trusting geobase to provide the right hints of
where junctions are). Evidently this wasn't working quite as expected.
I decided to try something simpler: just keep a hashtable of node
"keys" (latitude and longitude) as you go along, and reuse nodes that
have occurred before. :) At first glance, this approach seems to work
fairly well. It seems intuitively "right" to me that if two ways share
a lat/lng point, they should be connected. A quick count after running
both algorithms on Edmonton, Alberta, reveals that where the old script
produced 77242 (!) overlapping nodes (sometimes up to 8 at a single
lat/lng position!), my code produced 0.
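For anyone who wants the idea without reading the fork, here is a minimal
sketch of the hashtable approach. It is not the code from the repository:
the names NodeRegistry, node_id and build_ways are made up for
illustration, and rounding the coordinates to 7 decimal places before
using them as a key is my own assumption, not something geobase2osm is
known to do.

```python
class NodeRegistry:
    """Hands out one node id per distinct (lat, lon) pair (hypothetical helper)."""

    def __init__(self):
        self._ids = {}       # (lat, lon) key -> node id already handed out
        self._next_id = -1   # assumption: count downwards using negative ids

    def node_id(self, lat, lon, precision=7):
        # Assumption: round before hashing so tiny float noise cannot
        # split what is really the same point into two keys.
        key = (round(lat, precision), round(lon, precision))
        if key not in self._ids:
            self._ids[key] = self._next_id
            self._next_id -= 1
        return self._ids[key]

    def __len__(self):
        return len(self._ids)


def build_ways(lines, registry):
    """Turn lists of (lat, lon) points into lists of shared node ids."""
    return [[registry.node_id(lat, lon) for lat, lon in line] for line in lines]


# Two road segments that meet at (53.5470, -113.4938).
registry = NodeRegistry()
roads = [
    [(53.5461, -113.4938), (53.5470, -113.4938)],
    [(53.5470, -113.4938), (53.5470, -113.4900)],
]
ways = build_ways(roads, registry)
```

In this sketch the last node of the first way and the first node of the
second way get the same id, so a router would see them as connected, and
only three nodes are created for the four input points.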
Incidentally, this technique (if it's a good one) could be applied in
automatic fashion on existing portions of the geobase import,
eliminating the need for tedious manual work.
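A rough sketch of what such a cleanup pass over an existing .osm file
might look like follows. Everything here is an assumption for
illustration: the function name merge_duplicate_nodes is hypothetical,
coordinates are compared as exact strings, and the sketch ignores tags on
the dropped nodes and any relation members pointing at them, which a real
tool would have to handle.

```python
import xml.etree.ElementTree as ET


def merge_duplicate_nodes(osm_xml):
    """Keep the first node at each lat/lon and repoint way refs to it."""
    root = ET.fromstring(osm_xml)
    first_seen = {}   # (lat, lon) as strings -> id of the node we keep
    replaced = {}     # id of a dropped duplicate -> id of the kept node

    # findall() returns a list, so removing nodes while looping is safe.
    for node in root.findall("node"):
        key = (node.get("lat"), node.get("lon"))
        if key in first_seen:
            replaced[node.get("id")] = first_seen[key]
            root.remove(node)
        else:
            first_seen[key] = node.get("id")

    # Rewrite every way member that pointed at a dropped node.
    for nd in root.iter("nd"):
        ref = nd.get("ref")
        if ref in replaced:
            nd.set("ref", replaced[ref])

    return root, len(replaced)


# Made-up sample data: nodes 2 and 3 sit on the same position.
SAMPLE = """<osm version="0.6">
  <node id="1" lat="53.5461" lon="-113.4938"/>
  <node id="2" lat="53.5470" lon="-113.4938"/>
  <node id="3" lat="53.5470" lon="-113.4938"/>
  <node id="4" lat="53.5470" lon="-113.4900"/>
  <way id="10"><nd ref="1"/><nd ref="2"/></way>
  <way id="11"><nd ref="3"/><nd ref="4"/></way>
</osm>"""

root, merged = merge_duplicate_nodes(SAMPLE)
```

On the sample, node 3 is dropped and way 11 is rewritten to start at node
2, so the two ways end up sharing a junction node.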
Anyway, I don't have commit access to the OpenStreetMap subversion
repository holding geobase2osm, so I decided to fork the repository
using git and put up the result here:
http://github.com/wlach/geobase2osm
Oh, I also patched it up to not do coordinate transforms between the
NAD83 and WGS84 systems, as apparently
(http://sci.tech-archive.net/Archive/sci.geo.satellite-nav/2006-09/msg00307.html)
there is no difference when it comes to positioning. Empirically, this
seems to be true: the output without this transform seems 100% fine.
Dropping these transforms allows us to drop the osgeo dependency,
which is what lets the whole thing run on OpenSUSE in the first place.
More exciting geobase2osm work coming later. Maybe.
Questions/comments? Let me know!
--
William Lachance
wrlach at gmail.com