[OSM-dev] OSM Wishlist

Paul Norman penorman at mac.com
Sun Oct 14 09:05:31 BST 2012


On 12-Oct-12, at 3:50 PM, Iván Sánchez Ortega wrote:

>
> Also.
>
> ogr3osm.
>
> I would love to have the time and resources (or paid time, nudgenudgewinkwink)
> to redo ogr2osm; adding a backtracking-like algorithm to minimise the amount
> of geometries' shared nodes (and their bounding boxes) in memory, in order to
> be able to convert datasets with gazillions of geometries into .osm format.
> Backtracking-like in the sense that the data processing would be done in a
> tree-like fashion, walking through overlapping geometries, processing only
> geometries which have all their nodes already in the list of generated node
> IDs, writing to file and destroying from memory nodes that won't appear again
> because all the overlapping geometries have been processed.
>
> Yes, it sounds like a mouthful. I have a bunch of napkin notes with the
> algorithm written down, though :-)

I hate to steal your consulting work, but I already redid the node
merging in ogr2osm :)

I have it check whether a node already exists at that location before
creating one - this turns out to be much quicker than creating
duplicates and then finding nodes in common and removing them after
the fact.

I haven't profiled it in a while, but the slowest step is now writing
the XML out with SimpleXMLWriter. I intend to evaluate switching to a
different library for more performance. It didn't appear to be disk
bound; it spent nearly all of its time in SimpleXMLWriter.
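For anyone wanting to reproduce that kind of measurement, the
standard-library cProfile module will show where the write step spends
its time. A hedged sketch - write_osm_file here is a placeholder
stand-in, not a real ogr2osm function:

```python
# Sketch: profile the XML-writing step with cProfile/pstats from the
# standard library. write_osm_file is a hypothetical stand-in for the
# real output routine.
import cProfile
import io
import pstats

def write_osm_file():
    # Stand-in for the real XML-writing step: emit some node elements.
    buf = io.StringIO()
    for i in range(1000):
        buf.write('<node id="%d" lat="0" lon="0"/>\n' % -i)
    return buf.getvalue()

profiler = cProfile.Profile()
profiler.enable()
write_osm_file()
profiler.disable()

stats = pstats.Stats(profiler)
stats.sort_stats('cumulative').print_stats(5)  # top 5 by cumulative time
```

If the top entries are all in the XML writer rather than in file I/O
calls, that confirms the job is CPU bound in serialization, not disk
bound.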

I'm giving a talk tomorrow on ogr2osm and might expand what I'm  
saying about the node merging.

I ran some statistics on my in-progress NHD translation. This is a
fairly complex translation with involved logic, though it does drop a
few smaller layers. For a 600 MB .mdb (400 MB .shp) resulting in a 540
MB .osm, it takes 12 minutes on my home server. It's CPU bound and
single-threaded. I think it uses about 6-7 GB of RAM for that. I may be
able to get that down substantially; I haven't really attacked the RAM
usage yet.

More information about the dev mailing list