[OSM-dev] OSM Wishlist
Paul Norman
penorman at mac.com
Sun Oct 14 09:05:31 BST 2012
On 12-Oct-12, at 3:50 PM, Iván Sánchez Ortega wrote:
>
> Also.
>
> ogr3osm.
>
> I would love to have the time and resources (or paid time, nudgenudgewinkwink)
> to redo ogr2osm; adding a backtracking-like algorithm to minimise the amount
> of geometries' shared nodes (and their bounding boxes) in memory, in order to
> be able to convert datasets with gazillions of geometries into .osm format.
> Backtracking-like in the sense that the data processing would be done in a
> tree-like fashion, walking through overlapping geometries, processing only
> geometries which have all their nodes already into the list of generated node
> IDs, writing to file and destroying from memory nodes that won't appear again
> because all the overlapping geometries have been processed.
>
> Yes, it sounds like a mouthful. I have a bunch of napkin notes with the
> algorithm written down, though :-)
I hate to steal your consulting work but I already redid the node
merging in ogr2osm :)
I have it check whether a matching node is already on its list before
creating one. This turns out to be way quicker than looking for nodes
in common after the fact and removing one of each pair.
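
To make the check-before-create idea concrete, here is a minimal sketch. It is not the actual ogr2osm code: the names NodeStore, get_or_create and way_node_refs are made up for illustration, and rounding coordinates to 7 decimal places before comparing them is an assumption on my part. The point is only that a dictionary keyed on the coordinate makes the "does this node already exist?" lookup cheap, so shared nodes are never created twice.

```python
class NodeStore:
    """Hands out exactly one node ID per distinct coordinate pair.

    Hypothetical helper for illustration; not part of ogr2osm.
    """

    def __init__(self, precision=7):
        # Assumption: coordinates that agree to `precision` decimal
        # places are treated as the same node.
        self.precision = precision
        self._ids = {}       # (lon, lat) -> node ID
        self._next_id = -1   # negative IDs mark objects new to OSM

    def get_or_create(self, lon, lat):
        """Return the existing ID for this location, or allocate one."""
        key = (round(lon, self.precision), round(lat, self.precision))
        node_id = self._ids.get(key)
        if node_id is None:
            node_id = self._next_id
            self._next_id -= 1
            self._ids[key] = node_id
        return node_id

    def __len__(self):
        return len(self._ids)


def way_node_refs(store, coords):
    """Turn a geometry's (lon, lat) pairs into a list of node IDs."""
    return [store.get_or_create(lon, lat) for lon, lat in coords]
```

Two geometries that touch at (1.0, 1.0) end up referencing the same node ID, and the store holds three nodes rather than four. No second pass over the data is needed to find and drop duplicates.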
I haven't profiled it in a while, but the slowest step is now writing
the XML out with SimpleXMLWriter. I intend to evaluate switching to a
different library to get more performance. It didn't appear to be
disk bound; it was spending all of its time in SimpleXMLWriter.
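
One baseline a replacement library could be compared against is hand-formatted output. The sketch below is only that, a baseline under my own assumptions, not what ogr2osm does and not a recommendation of a particular library. The function name write_node and the attribute layout are illustrative. The only library call is quoteattr from the standard xml.sax.saxutils module, which escapes and quotes attribute values.

```python
import io
from xml.sax.saxutils import quoteattr


def write_node(out, node_id, lat, lon, tags):
    """Write one OSM <node> element to the text stream `out`.

    Hypothetical helper for illustration; numeric attributes are
    formatted directly, tag keys and values go through quoteattr so
    characters such as & and < are escaped.
    """
    out.write('<node id="%d" visible="true" lat="%.7f" lon="%.7f">\n'
              % (node_id, lat, lon))
    for key, value in sorted(tags.items()):
        out.write('  <tag k=%s v=%s/>\n' % (quoteattr(key), quoteattr(value)))
    out.write('</node>\n')


if __name__ == "__main__":
    buf = io.StringIO()
    write_node(buf, -1, 49.25, -123.1, {"name": "A & B"})
    print(buf.getvalue(), end="")
```

Timing a loop of calls like this against the same loop using the current writer would show how much of the 'slowest step' is the library itself rather than the formatting work.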
I'm giving a talk tomorrow on ogr2osm and might expand what I'm
saying about the node merging.
I ran some statistics on my in-progress NHD translation. This is a
fairly complex translation with involved logic, but it does drop a
few smaller layers. For a 600 MB .mdb (400 MB .shp) resulting in a 540
MB .osm it takes 12 minutes on my home server. It's CPU bound and
single-threaded. I think it uses about 6-7 GB of RAM for that. I
may be able to get that down substantially; I haven't really attacked
the RAM usage yet.