[OSM-talk] Re: [OSM-dev] MassGIS dataset

Christopher Schmidt crschmidt at crschmidt.net
Fri Jun 9 12:27:25 BST 2006


On Fri, Jun 09, 2006 at 09:46:49AM +0100, Nick Whitelegg wrote:
> On 08/06/06, Christopher Schmidt <crschmidt at crschmidt.net> wrote:
> >> On Thu, Jun 08, 2006 at 08:02:23AM +0100, SteveC wrote:
> >> > There's a slightly quicker route - someone could write a version of
> >> > ox.rb to just print xml rather than create a big DOM (and escape any
> >> > tag data).
> >>
> >> A Python version of this looks like:
> >> http://crschmidt.net/projects/openstreetmap/ox.py

> Chris,
> 
> Looked at your code and (while I have not coded in Python before) it 
> appears to be storing the nodes, segments and ways as data objects. For a 
> script optimised for planet.osm generation, what we really need is 
> something which simply queries the database and writes the XML direct to 
> standard output, no data storage at all as that is what, I believe, was 
> causing the memory issues.

Nope. It does, in some cases, create a string object in memory, but that
goes out of scope at the end of the function call. I understand that
some of the nodes are big, but storing everything in a DOM means that
you end up using about 10 times as much memory as the actual size of
the elements. Until we run into 100MB long single strings of nodes +
tags + segs, this should scale okay. I don't think we're quite there
yet, since our total dump size is 80MB :)

One thing that may be confusing is that I am passing data objects to the
functions -- but that doesn't mean I have to create more than one at a
time.
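
To make that concrete, here is a minimal sketch of the pattern. It is not
the actual ox.py: the names iter_nodes, write_node and dump, the row
layout, and the simplified <osm> wrapper are all made up for illustration.
Each object is built, written and dropped before the next one is created,
and tag data is escaped on the way out.

```python
import io
import sys
from xml.sax.saxutils import quoteattr


def iter_nodes(rows):
    """Yield one node dict at a time (hypothetical stand-in for a DB cursor)."""
    for node_id, lat, lon, tags in rows:
        yield {"id": node_id, "lat": lat, "lon": lon, "tags": tags}


def write_node(out, node):
    """Write one node as XML; the strings built here are dropped on return."""
    out.write("  <node id=%s lat=%s lon=%s>\n" % (
        quoteattr(str(node["id"])),
        quoteattr(str(node["lat"])),
        quoteattr(str(node["lon"])),
    ))
    # quoteattr escapes &, < and > and picks safe quoting for the value.
    for key, value in sorted(node["tags"].items()):
        out.write("    <tag k=%s v=%s/>\n" % (quoteattr(key), quoteattr(value)))
    out.write("  </node>\n")


def dump(rows, out=sys.stdout):
    """Stream all nodes to `out` without ever holding more than one."""
    out.write('<?xml version="1.0" encoding="UTF-8"?>\n<osm>\n')
    for node in iter_nodes(rows):
        write_node(out, node)
    out.write("</osm>\n")


# Usage: write to an in-memory buffer here; pass nothing to use stdout.
buf = io.StringIO()
dump([(1, 51.5, -0.1, {"name": "Fish & Chips"})], buf)
```

Peak memory here is roughly the size of the largest single element,
rather than a multiple of the whole dump as with a DOM.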

If you do have any interest in understanding the Python code, lemme know
and I'll annotate it for you.

> I started a Ruby version late last night but will not be able to return to 
> it until about 2100 BST today so if you want to complete yours in the 
> meantime, fine :-)

Well, I don't really know where to go from here, cause no one's offered
any hints as to what actually calls this code in the planet.osm dump, so
I'm blocking on that. Given that, you'll probably finish first. :)

-- 
Christopher Schmidt
Web Developer



