[OSM-dev] Osmosis multi-task questions

Mon Nov 26 22:57:08 GMT 2007

Hi,

   I'm thinking about using Osmosis to produce daily "mini planets"
for a number of areas, e.g. one each for Germany and the neighbouring
countries, but then also one each for the 16 German "Länder" (states).

There are probably no 100% answers for the following but maybe someone
has experimented with these things and has some ideas/insights on the
best procedures.

I am thinking about this:

1. Each week, get the full planet file and do a bounding box extract
for Europe.

2. Daily, apply the diff to this file. This will add a little non-Europe data
each day but I can ignore that.

3. Daily, split out desired polygons from patched planet file.
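
As a rough sketch of steps 1 and 2, the two command lines could be
generated like this. The task and argument names (--bounding-box with
left/bottom/right/top, --read-xml-change, --apply-change) are as I
understand them from the Osmosis usage notes and should be checked
against the installed version; all file names and the Europe box are
placeholders made up for illustration.

```python
def weekly_extract(planet, extract, bbox):
    """Step 1: cut a bounding box out of the full planet file."""
    left, bottom, right, top = bbox
    return ["osmosis", "--read-xml", f"file={planet}",
            "--bounding-box", f"left={left}", f"bottom={bottom}",
            f"right={right}", f"top={top}",
            "--write-xml", f"file={extract}"]


def daily_patch(diff, extract, patched):
    """Step 2: apply one daily diff to the extract."""
    return ["osmosis", "--read-xml-change", f"file={diff}",
            "--read-xml", f"file={extract}",
            "--apply-change",
            "--write-xml", f"file={patched}"]


# Rough placeholder box (left, bottom, right, top), not a vetted boundary.
EUROPE = (-25.0, 34.0, 45.0, 72.0)

print(" ".join(weekly_extract("planet.osm", "europe.osm", EUROPE)))
print(" ".join(daily_patch("daily.osc", "europe.osm", "europe-patched.osm")))
```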

Assuming for a moment that I'd not use a database: would it be
possible (and sensible) to use the "tee" task in Osmosis to branch
from one --read-xml into 20 --bounding-polygon/--write-xml tasks, so
that all areas I am interested in get cut out in one go? I guess I
would have to create a very long command line naming all the output
pipes of the tee and assigning them to each of the --bounding-polygon
tasks, right?

Can I make use of the default pipe connection between the
border-polygon and write-xml if I pair them, e.g.

osmosis 
   --rx ... (use default pipe) --tee (name 20 output pipes) 
   --bp (name 1 input pipe but use default pipe on output) --wx 
   --bp (name another input pipe, again use default pipe on out) --wx
   ...
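
Rather than typing such a command by hand, it could be generated. The
sketch below assumes the pairing works as hoped, i.e. that each --bp
takes a named input pipe and hands its output to the following --wx
over the default pipe. The argument names (outputCount, outPipe.N,
inPipe.0, file) are as I understand them from the Osmosis usage notes;
the region names, polygon files and the "name.osm" output convention
are hypothetical.

```python
def build_tee_command(source, regions):
    """Return an Osmosis argument list cutting every region from one read.

    regions maps a region name to its polygon file. Pipe names and
    output file names are derived from the region name (a choice made
    here for illustration only).
    """
    args = ["osmosis", "--read-xml", f"file={source}"]
    # --tee reads from the default pipe and names each of its outputs.
    args += ["--tee", f"outputCount={len(regions)}"]
    args += [f"outPipe.{i}={name}" for i, name in enumerate(regions)]
    for name, poly in regions.items():
        # Named input pipe; the output is left on the default pipe so
        # the --write-xml that follows picks it up.
        args += ["--bounding-polygon", f"file={poly}", f"inPipe.0={name}"]
        args += ["--write-xml", f"file={name}.osm"]
    return args


regions = {"germany": "germany.poly", "france": "france.poly"}
print(" ".join(build_tee_command("europe.osm", regions)))
```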

In a case like the one sketched initially - select 5 neighbouring
regions plus 16 regions inside one of those selected - could I even
construct a command line that would do something like this:

read xml -> tee into 5 pipes, each with its own bounding polygon; four
of them are written directly to file, the fifth tees into 16 pipes, one
of which is written directly to file while the others are again used as
bounding polygon inputs and then written
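
The nested layout can be described as a small tree and turned into a
command line recursively. This is only a sketch under the same
assumptions as any hand-written version: that a second --tee may sit
behind a --bp, and that outputCount, outPipe.N and inPipe.0 behave as
in the Osmosis usage notes. The tuple layout, pipe names and file
names are hypothetical.

```python
def emit(region, in_pipe, args):
    """Append the tasks for one region, recursing into its sub-regions.

    region is a (name, polygon_file, children) tuple.
    """
    name, poly, children = region
    args += ["--bounding-polygon", f"file={poly}", f"inPipe.0={in_pipe}"]
    if children:
        # One extra output so the parent region itself is also written.
        pipes = [f"{name}.self"] + [f"{name}.{c[0]}" for c in children]
        args += ["--tee", f"outputCount={len(pipes)}"]
        args += [f"outPipe.{i}={p}" for i, p in enumerate(pipes)]
        args += ["--write-xml", f"file={name}.osm", f"inPipe.0={pipes[0]}"]
        for child, pipe in zip(children, pipes[1:]):
            emit(child, pipe, args)
    else:
        # Leaf region: write the polygon output via the default pipe.
        args += ["--write-xml", f"file={name}.osm"]


def build_nested_command(source, regions):
    """Return an Osmosis argument list for a tree of regions."""
    args = ["osmosis", "--read-xml", f"file={source}"]
    pipes = [f"top.{r[0]}" for r in regions]
    args += ["--tee", f"outputCount={len(pipes)}"]
    args += [f"outPipe.{i}={p}" for i, p in enumerate(pipes)]
    for region, pipe in zip(regions, pipes):
        emit(region, pipe, args)
    return args


tree = [
    ("france", "france.poly", []),
    ("germany", "germany.poly", [
        ("bayern", "bayern.poly", []),
        ("hessen", "hessen.poly", []),
    ]),
]
print(" ".join(build_nested_command("europe.osm", tree)))
```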

Could I expect better performance by using a MySQL database which is
reset every week and has the diffs applied to it? When extracting
polygons from MySQL instead of from an XML file, would the same
"tee" strategy make sense or would MySQL reading be fast enough to
just extract every polygon sequentially? I see that the read-mysql
task has no option to make use of MySQL indexes for selecting bounding
boxes, so I assume it would always feed the full data set into the
polygon task, which might be less than ideal. Maybe I could trick
Osmosis into operating on a "view" of a node table that contains only
nodes in a certain lat/lon range?

The whole thing is to be run on a quad-core machine with 8 GB and
reasonably fast disk arrays.

Cheers
Frederik

-- 
Frederik Ramm  ##  eMail frederik at remote.org  ##  N49°00.09' E008°23.33'