[OSM-dev] SAX XML parsing recommendations?

Christopher Schmidt crschmidt at crschmidt.net
Mon Jul 10 21:23:05 BST 2006


On Mon, Jul 10, 2006 at 09:14:22PM +0100, Nick Whitelegg wrote:
> Hello everyone,
> 
> Does anyone have any recommendations for reliable and (ideally) fast SAX XML 
> parsers with an intuitive API? I need something to convert planet.osm into 
> SQL but have run into the following problems:
> 
> the PHP SAX parser: seems to be unreliable and runs out of memory easily
> 
> REXML (Ruby): seems to work, but takes a long time (27 hours to get the UK out 
> of the April planet.osm. Oddly for a SAX parser, it took considerably longer 
> to get the whole UK out compared to a small tile of 0.1 by 0.1 degrees - this 
> suggests high usage of memory though my script uses as little as possible)

sxpert's work on a planet.osm parser is what led to my script:
  http://london.freemap.in/osm2gml_simple.py.txt
It uses a SAX parser, and changing the output from GML to SQL shouldn't
be too difficult. However, because it stores points in memory, it takes
up about 600MB of RAM. If you don't want that, a better rewrite could
probably modify the startElement and endElement calls to write out SQL
directly, instead of storing the data in arrays that are printed out by
the endElement call.
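
A rough sketch of that streaming approach, using Python's standard
xml.sax module, might look like the following. It is not the code from
osm2gml_simple.py: the class name OsmToSql, the sql_quote helper, and the
table names nodes and node_tags are made up for illustration. It handles
only nodes and their tags; segments and ways would follow the same
pattern.

```python
import xml.sax


def sql_quote(value):
    """Quote a string as a SQL literal by doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


class OsmToSql(xml.sax.ContentHandler):
    """Write one INSERT per element as it is parsed, keeping no node list."""

    def __init__(self, out):
        super().__init__()
        self.out = out
        self.node_id = None  # id of the <node> currently open, if any

    def startElement(self, name, attrs):
        if name == "node":
            self.node_id = int(attrs["id"])
            # Hypothetical table layout: nodes(id, lat, lon)
            self.out.write(
                "INSERT INTO nodes (id, lat, lon) VALUES (%d, %s, %s);\n"
                % (self.node_id, float(attrs["lat"]), float(attrs["lon"]))
            )
        elif name == "tag" and self.node_id is not None:
            # Hypothetical table layout: node_tags(node_id, k, v)
            self.out.write(
                "INSERT INTO node_tags (node_id, k, v) VALUES (%d, %s, %s);\n"
                % (self.node_id, sql_quote(attrs["k"]), sql_quote(attrs["v"]))
            )

    def endElement(self, name):
        if name == "node":
            # Forget the node as soon as it closes, so memory use stays flat.
            self.node_id = None
```

To run it over a file, something like
xml.sax.parse("planet.osm", OsmToSql(sys.stdout)) should work (with
import sys). The only state kept between elements is the id of the open
node, so memory use should not grow with the size of the input.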

-- 
Christopher Schmidt
Web Developer



