[Talk-ca] geobase2osm.py work / questions

Steve Singer ssinger_pg at sympatico.ca
Fri Jun 26 14:00:21 BST 2009


On Fri, 26 Jun 2009, William Lachance wrote:

> Hi all,
>
> I've been needing to use geobase2osm in a project of mine (not
> directly related to the geobase import), and noticed (as other people
> have) that it's generating a lot of duplicated nodes. Aside from being
> unclean and wasteful, this makes the generated osm file non-routable.
> Well, today I decided to do something about it. Looking at the source,
> I noticed quite a bit of complicated code to "merge" identical OSM
> nodes at junctions (trusting geobase to provide the right hints of
> where junctions are). Evidently this wasn't working quite as expected.
>
> I decided to try something simpler: just keep a hashtable of node
> "keys" (latitude and longitude) as you go along, and reuse nodes that
> have occurred before. :) At first glance, this approach seems to work
> fairly well. It seems intuitively "right" to me that if two ways have
> a lat/lng point in common, they should be connected. A quick
> count of running the two algorithms on Edmonton, Alberta, reveals that
> where the old script resulted in 77242 (!) overlapping nodes
> (sometimes up to 8 on a single lat/lng position!), my code resulted in
> 0.

Is this because the geobase data doesn't define Junction objects for those 
positions, or for some other reason? Can you post a few examples of these?
I wasn't aware that recent versions of the script were still doing this. (I 
probably won't get to look into the details for a few days though)
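
For pulling a few examples out of a generated file, here is a minimal sketch 
that lists every position carrying more than one node. It assumes the output 
is ordinary OSM XML with <node id=... lat=... lon=...> elements, and it 
compares lat/lon as the strings written in the file, so "53.5" and "53.50" 
would count as different positions. The function name is made up for this 
example and is not part of geobase2osm.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def find_overlapping_nodes(osm_source):
    """Return {(lat, lon): [node ids]} for positions holding more than one node.

    osm_source is a filename or a binary file object containing OSM XML.
    lat/lon are compared as the literal strings found in the file (assumption).
    """
    by_position = defaultdict(list)
    for _event, elem in ET.iterparse(osm_source, events=("end",)):
        if elem.tag == "node":
            position = (elem.get("lat"), elem.get("lon"))
            by_position[position].append(elem.get("id"))
        # Drop the element's contents once it has been looked at.
        elem.clear()
    return {pos: ids for pos, ids in by_position.items() if len(ids) > 1}
```

Printing the first few entries of the returned dict should be enough to 
compare against the Junction objects in the source data.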


If the junction data turns out to be incomplete then maybe we are better 
off merging all common lat/lon pairs.
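
Merging on common lat/lon pairs could look roughly like the sketch below. 
This is not the code from either version of the script; NodeStore and 
build_ways are hypothetical names, and rounding to 7 decimal places is an 
assumption chosen to match the precision OSM stores coordinates at.

```python
class NodeStore:
    """Hands out one node id per distinct (lat, lon) position."""

    def __init__(self):
        self._ids = {}      # (lat, lon) -> node id
        self._next_id = -1  # negative ids: objects not yet uploaded

    def __len__(self):
        return len(self._ids)

    def node_id(self, lat, lon, precision=7):
        # Round first so tiny floating-point differences map to the same key.
        key = (round(lat, precision), round(lon, precision))
        if key not in self._ids:
            self._ids[key] = self._next_id
            self._next_id -= 1
        return self._ids[key]


def build_ways(lines, store):
    """Turn lists of (lat, lon) points into lists of node ids.

    Ways that pass through the same position end up sharing a node id,
    which is what makes the result routable.
    """
    return [[store.node_id(lat, lon) for lat, lon in line] for line in lines]
```

Two lines that meet at (53.6, -113.5) then reference the same node, whether 
or not the source data marks that point as a junction. The flip side is that 
ways crossing at different levels (a bridge over a road, say) would also be 
joined if they happen to share a vertex.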


>
> Incidentally, this technique (if it's a good one) could be applied in
> automatic fashion on existing portions of the geobase import,
> eliminating the need for tedious manual work.

What does your hash do to the memory usage of the script? On the systems I 
run it on, RAM tends to be more of a limiting factor than anything else.
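
If the table does turn out to be the problem, one possible mitigation is to 
key it on a single packed integer instead of a tuple of two floats. The 
sketch below is only an illustration under assumptions: Python 3, 
coordinates quantised to 1e-7 degrees, and helper names (pack_key, 
unpack_key, key_sizes) invented for this example. key_sizes measures the 
two key layouts on whatever interpreter runs it rather than asserting a 
saving.

```python
import sys

SCALE = 10**7  # quantise to 1e-7 degree steps (assumption)


def pack_key(lat, lon):
    """Pack a position into one integer usable as a dict key."""
    lat_i = round((lat + 90.0) * SCALE)    # 0 .. 1.8e9
    lon_i = round((lon + 180.0) * SCALE)   # 0 .. 3.6e9, fits in 32 bits
    return (lat_i << 32) | lon_i


def unpack_key(key):
    """Recover (lat, lon) from a packed key, to within the quantisation step."""
    lat_i = key >> 32
    lon_i = key & 0xFFFFFFFF
    return lat_i / SCALE - 90.0, lon_i / SCALE - 180.0


def key_sizes(lat, lon):
    """Return (bytes for a tuple-of-floats key, bytes for a packed key)."""
    tuple_key = (lat, lon)
    tuple_bytes = (sys.getsizeof(tuple_key)
                   + sys.getsizeof(lat) + sys.getsizeof(lon))
    return tuple_bytes, sys.getsizeof(pack_key(lat, lon))
```

This only shrinks the keys; the dict's own per-entry overhead stays, so it 
is worth measuring on a real province-sized run before committing to it.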


>
> Anyway, I don't have commit access to the OpenStreetMap subversion
> repository holding geobase2osm, so I decided to fork the repository
> using git and put up the result here:
>
> http://github.com/wlach/geobase2osm
>
> Oh, I also patched it up to not do coordinate transforms between the
> NAD83 and WGS84 systems, as apparently
> (http://sci.tech-archive.net/Archive/sci.geo.satellite-nav/2006-09/msg00307.html)
> there is no difference when it comes to positioning. Empirically, this
> seems to be true: the output without this transform seems 100% fine.
> Dropping these transforms allows us to drop the osgeo dependency,
> which makes the whole thing run on OpenSUSE in the first place.


According to that link there is a 1.5m offset between the two coordinate 
systems.  That doesn't seem insignificant; maybe someone with a stronger GIS 
background can comment further on how important the correction is.
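
To put 1.5m in the units the script works in, here is a back-of-the-envelope 
conversion to degrees. It assumes a spherical Earth with roughly 111 km per 
degree of latitude, which is a rule of thumb and not a datum transform; the 
function name is hypothetical.

```python
import math

METERS_PER_DEGREE_LAT = 111_320.0  # approximate, spherical-Earth rule of thumb


def offset_in_degrees(offset_m, lat_deg):
    """Approximate size of a ground offset, in degrees, at a given latitude."""
    dlat = offset_m / METERS_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    dlon = offset_m / (METERS_PER_DEGREE_LAT * math.cos(math.radians(lat_deg)))
    return dlat, dlon


# Example: a 1.5 m shift near Edmonton (about 53.5 N).
dlat, dlon = offset_in_degrees(1.5, 53.5)
```

At that latitude the shift is on the order of 1e-5 degrees, i.e. it shows up 
around the fifth decimal place of a coordinate.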



>
> More exciting geobase2osm work coming later. Maybe.
>
> Questions/comments? Let me know!
>
> -- 
> William Lachance
> wrlach at gmail.com
>
> _______________________________________________
> Talk-ca mailing list
> Talk-ca at openstreetmap.org
> http://lists.openstreetmap.org/listinfo/talk-ca
>




