Hi Nic,<br><br>I'm not entirely sure what you mean by intermediate streams. Do you wish to split the execution across multiple Osmosis instances and stream data between the processes? If so, there is no inbuilt support for this other than reading from stdin and writing to stdout. Named pipes on Linux might work, but that's not something I've ever tried. Either way, the main overhead is typically XML processing, which is incurred at the boundary between processes and costs much the same CPU as reading and writing temporary files.<br>
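<br>For what it's worth, the named pipe idea could be sketched roughly as below. This is untested with Osmosis: the function name fifo_demo and the file name stage1.osm are made up for illustration, printf and cat stand in for the two Osmosis instances, and the Osmosis commands in the comments are only a guess at how the FIFO path would be passed as a file argument.<br>

```shell
#!/bin/sh
# Sketch only: stream data between two processes through a named pipe (FIFO).
# fifo_demo and stage1.osm are hypothetical names used for illustration.
fifo_demo() {
    dir=$(mktemp -d) || return 1
    fifo="$dir/stage1.osm"
    mkfifo "$fifo" || return 1

    # Producer: stands in for a first Osmosis instance, which might look like
    #   osmosis --read-xml file=/dev/stdin ... --write-xml file="$fifo" &
    # (assumption: untested). It blocks until a reader opens the FIFO.
    printf '%s\n' "$1" > "$fifo" &

    # Consumer: stands in for a second Osmosis instance, which might look like
    #   osmosis --read-xml file="$fifo" --tee ... (assumption: untested)
    cat "$fifo"

    # Wait for the producer to finish, then clean up the FIFO.
    wait
    rm -rf "$dir"
}
```

The data never touches the disk, but both sides still serialise and parse the XML, so the CPU cost at the boundary remains.<br>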
<br>If the only reason for splitting Osmosis across multiple instances is to allow you to use more than 4GB memory total, then can you switch to a 64-bit JVM? That will let you use as much memory as you have in your system. I assume you're currently setting the -Xmx value to something less than 4GB based on a 32-bit JVM limitation.<br>
<br>As an FYI, the BitSet idTrackerType uses a fixed amount of memory per bounding box task, determined by the maximum ids in the planet file. So if you're extracting 60 bounding boxes, it doesn't matter where in the world each bounding box resides or how large it is: each BitSet will consume much the same memory. You need to find out how many bounding boxes fit within the 4GB memory limit and stay within that number. That number will decrease over time as the maximum ids in the planet increase. The IdList idTrackerType uses memory proportional to the number of entities in the bounding box, but it consumes much more memory per entity (at least 32 times as much, maybe 64; I haven't investigated this in detail) because it holds each id in a sorted list, whereas BitSet stores each id as a single bit in a massive data array. IdList is great for large numbers of small bounding boxes where the total area covers a small portion of the planet. If somebody can come up with a better mechanism for storing these ids, it can be plugged in as an alternative idTrackerType.<br>
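<br>The arithmetic behind this can be sketched as a back-of-envelope estimate. The function names are made up, and the sketch assumes one bit per possible id and one bit array each for nodes, ways and relations per bounding box. It ignores JVM overhead and everything else Osmosis allocates, so treat the results as optimistic upper bounds on the number of boxes.<br>

```shell
#!/bin/sh
# Back-of-envelope estimates only; all function names are hypothetical.

# Bytes needed to hold one bit for every id from 0 to MAX_ID.
# usage: bitset_bytes MAX_ID
bitset_bytes() {
    echo $(( ($1 + 8) / 8 ))
}

# How many bounding boxes fit in a memory budget, assuming each box keeps
# one bit array for nodes, one for ways and one for relations.
# usage: bitset_boxes_that_fit BUDGET_BYTES MAX_NODE_ID MAX_WAY_ID MAX_RELATION_ID
bitset_boxes_that_fit() {
    per_box=$(( $(bitset_bytes "$2") + $(bitset_bytes "$3") + $(bitset_bytes "$4") ))
    echo $(( $1 / per_box ))
}

# Lower bound for a list-based tracker: 32 bits (4 bytes) per stored id,
# following the "at least 32 times as much" figure above.
# usage: idlist_min_bytes ENTITY_COUNT
idlist_min_bytes() {
    echo $(( $1 * 4 ))
}
```

Calling bitset_boxes_that_fit with the -Xmx value in bytes and the current maximum node, way and relation ids gives a ceiling on how many BitSet bounding boxes one run can hold.<br>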
<br>Brett<br><br><div class="gmail_quote">On Mon, Feb 8, 2010 at 7:54 PM, Nic Roets <span dir="ltr">&lt;<a href="mailto:nroets@gmail.com">nroets@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
I'm trying to split the planet into 170 overlapping bboxes like this:<br><br><a href="http://dev.openstreetmap.de/gosmore/test/" target="_blank">http://dev.openstreetmap.de/gosmore/test/</a><br><br>But osmosis keeps running into the 4GB java limit, even after I made a split down the Atlantic.<br>
1. Split the planet into 3 bboxes: Americas, Europe / Africa /Asia /Australia and a bbox that is just large enough to cover all the bboxes that cross the dividing line.<br>2. Running osmosis for the 95 bboxes in the Americas fails.<br>
3. Running osmosis for the 12 Atlantic bboxes succeeds.<br>4. Running osmosis for the 60 bboxes in Europe / Africa /Asia /Australia fails:<br><br>gunzip &lt;middle.osm.gz | ionice -c 3 nice -n 19 osmosis --read-xml enableDateParsing=no file=/dev/stdin --tee 60 \<br>
--bb idTrackerType="BitSet" left=73.12500 right=180.00000 top=9.44906 bottom=-85.05113 --wx 0720048510241024.osm.gz \<br> --bb idTrackerType="BitSet" left=120.58594 right=180.00000 top=72.91964 bottom=-25.48295 --wx 0855020310240587.osm.gz \<br>
--bb idTrackerType="BitSet" left=98.43750 right=172.61719 top=13.23995 bottom=-85.05113 --wx 0792047410031024.osm.gz \<br> --bb idTrackerType="BitSet" left=100.19531 right=150.82031 top=30.14513 bottom=-75.84517 --wx 0797042209410852.osm.gz \<br>
...<br><br>The obvious solution is just to repeat this algorithm until I find something that will work. And the number of candidate splitting latitudes and longitudes is small (4 times the number of bboxes), so evaluating them all in software is feasible (esp. with dynamic programming).<br>
<br>Now my question is: Can I tell osmosis to work with intermediate streams ? That would remove the need to gzip / gunzip and write / read from disk.<br></blockquote></div><br>