[OSM-dev] Question running osmosis (node-key-value and way-key-value at the same time)

Sun Jun 28 09:37:04 BST 2009

Christoph Eckert wrote:
> Hi,
>
>   
>> Well, I wanted to try to create a kind of worldwide "base map" for Navit to
>> get a clue what file size this would trigger.
>>     
>
> OK, I meanwhile got a bit further thanks to the various hints. I use the 
> following command:
> ./osmosis-0.31/bin/osmosis --read-xml file="planet.osm" --way-key-value
> keyValueList="highway.motorway,highway.motorway_link,highway.motorway_junction,
> highway.trunk,highway.trunk_link,highway.primary,highway.primary_link,route.ferry,
> waterway.river,railway.rail,railway.narrow_gauge,landuse.forest,landuse.wood,
> natural.wood,natural.water,boundary.administrative,boundary.civil"
> --used-node --write-xml file="basemap.osm"
>
> This seems to work well on smaller files (tried 24.8MB), but I tried it twice 
> on a true planet file, and osmosis will crash after a while (link to pastebin 
> log is attached). It creates the temporary files in /temp, but it seems to 
> fail as soon as it tries to stuff the temporary file's content into the 
> destination file. After the crash, the latter one only contains the XML 
> header up to the maximum possible bounding box, but nothing else. The 
> temporary files are removed. The failure seems to depend on the usage 
> of --used-node.
>
> So my question is whether this cannot work on such huge files because of 
> technical limitations (data must fit into RAM), or whether it is intended 
> to work and I can use some workaround.
>   
As you've guessed, you are running into a memory limitation.  A 32-bit 
Java VM can use up to about 2GB if you raise the heap limit with the -Xmx 
option, but most of the time that isn't necessary.

The complete set of nodes does need to be in RAM; however, you can reduce 
the amount of RAM required by changing the method used to hold them.  
Try adding the idTrackerType=BitSet option to the --used-node task.  
That will use approximately 1/32 of the RAM that the default IdList 
implementation uses.  Both methods have their advantages and 
disadvantages, with BitSet being more memory efficient on very large 
data sets, and IdList being more memory efficient on smaller data 
sets with sparsely allocated ids.
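
To put rough numbers on that trade-off, here is a back-of-the-envelope
sketch in shell arithmetic.  It assumes the BitSet tracker costs one bit
for every possible id up to the highest one seen, and that the IdList
tracker costs 32 bits per id actually stored (which is what the 1/32
figure implies).  The id counts are made-up illustrations, not
measurements, and JVM overhead is ignored.

```shell
#!/bin/sh
# Back-of-the-envelope memory estimate; all numbers below are hypothetical.
max_id=500000000       # assumed highest node id in the input
used_dense=400000000   # assumed: most nodes are kept
used_sparse=1000000    # assumed: only a small fraction of nodes is kept

# BitSet: one bit per possible id, no matter how many ids are actually set.
bitset_bytes=$((max_id / 8))

# IdList: assumed 4 bytes (32 bits) per id actually stored.
idlist_dense_bytes=$((used_dense * 4))
idlist_sparse_bytes=$((used_sparse * 4))

echo "BitSet:          $((bitset_bytes / 1000000)) MB"
echo "IdList (dense):  $((idlist_dense_bytes / 1000000)) MB"
echo "IdList (sparse): $((idlist_sparse_bytes / 1000000)) MB"
```

Under these assumptions the BitSet cost is fixed by the highest id, while
the IdList cost grows with the number of ids kept, so IdList only wins
when few ids are tracked.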

I hope the BitSet option solves the problem.
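
Applied to your command, that would look roughly like the sketch below.
The only change is idTrackerType=BitSet on the --used-node task.  The key
list is shortened here for readability, and the paths are the ones from
your mail.  The script only assembles and prints the command so you can
check it before running it.

```shell
#!/bin/sh
# Sketch: the original pipeline with the BitSet id tracker on --used-node.
# KEYS is abbreviated for illustration; substitute the full keyValueList.
OSMOSIS=./osmosis-0.31/bin/osmosis
KEYS="highway.motorway,highway.trunk,highway.primary"

CMD="$OSMOSIS --read-xml file=planet.osm \
--way-key-value keyValueList=$KEYS \
--used-node idTrackerType=BitSet \
--write-xml file=basemap.osm"

# Print the assembled command for inspection; run it with: eval "$CMD"
echo "$CMD"
```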

Brett
