[OSM-talk] Osmosis running forever with completeWays=yes?

Brett Henderson brett at bretth.com
Mon Feb 2 04:18:19 GMT 2009

Karl Newman wrote:
> On Sun, Feb 1, 2009 at 1:55 PM, Frederik Ramm <frederik at remote.org> wrote:
>     Hi,
>       (f'up set to osmosis-dev)
>     Karl Newman wrote:
>         Anyway, the tee can choke things up with all the temporary
>         files. It would be nice to be able to share the stored node
>         and ways files between tee tasks, but I haven't created that
>         infrastructure yet.
>     It would be even better to have an extended --bp task that somehow
>     takes a list of disjoint polygons and uses some kind of point
>     location algorithm to determine which node belongs to which
>     polygon. The rationale being of course that with the classic
>     --bp/--tee approach, each node is duplicated n times and tested
>     against each of the polygons which is a waste of time, especially
>     with a large input file and many polygons (e.g. split up the US
>     into counties or so).
>     Does the task and stream model that osmosis uses theoretically
>     support tasks where the number of output streams they create is
>     not fixed, but dependent on their parameters? So that e.g. a "bp
>     file=a.poly file=b.poly" (or "bp files=a.poly,b.poly") creates two
>     entity streams and so on?
>     Bye
>     Frederik
> What you're asking is possible. The number of input and output pipes 
> has to be known at invocation because the pipes are connected before 
> any tasks are run, but if it's a parameter passed to the task, then 
> the task can report to the pipeline manager how many output pipes it 
> has. The tricky part might be connecting the downstream tasks. It 
> might be confusing because of the stack-based pipeline ordering.
If you want to see how this works, check out the SinkMultiSource 
interface, which defines a task with a single input and multiple 
outputs.  It is implemented by the EntityTee class, which is the --tee 
task.  It is integrated into the pipeline by the SinkMultiSourceManager.

The SinkMultiSource interface defines a method called getSourceCount 
which allows tasks to tell the manager how many pipe outputs they have.  
It is called by SinkMultiSourceManager during pipeline startup.
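
As a rough illustration of that handshake (not the actual Osmosis source), 
here is a minimal self-contained sketch.  Only the names SinkMultiSource 
and getSourceCount come from the description above; the Sink stand-in, the 
setSink method, the Tee class, the connect helper and the use of plain 
strings as entities are simplifications assumed for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class MultiSourceSketch {

    /** Simplified stand-in for a downstream consumer (assumption, not the Osmosis type). */
    interface Sink {
        void process(String entity);
    }

    /** Simplified stand-in for the idea: a task with one input and several outputs. */
    interface SinkMultiSource extends Sink {
        /** Tells the manager how many output pipes this task has. */
        int getSourceCount();

        /** Hypothetical hook used here to attach a consumer to one output. */
        void setSink(int index, Sink sink);
    }

    /** Tee-like task whose number of outputs comes from a parameter. */
    static class Tee implements SinkMultiSource {
        private final Sink[] sinks;

        Tee(int outputCount) {
            if (outputCount < 1) {
                throw new IllegalArgumentException("need at least one output");
            }
            sinks = new Sink[outputCount];
        }

        @Override
        public int getSourceCount() {
            return sinks.length;
        }

        @Override
        public void setSink(int index, Sink sink) {
            sinks[index] = sink;
        }

        @Override
        public void process(String entity) {
            // Every output receives a copy of every entity.
            for (Sink sink : sinks) {
                sink.process(entity);
            }
        }
    }

    /**
     * Hypothetical manager step: ask the task for its output count and
     * connect every output before any data flows. Each output is wired
     * to a list that simply collects what it receives.
     */
    static List<List<String>> connect(SinkMultiSource task) {
        List<List<String>> outputs = new ArrayList<>();
        for (int i = 0; i < task.getSourceCount(); i++) {
            List<String> collected = new ArrayList<>();
            outputs.add(collected);
            task.setSink(i, collected::add);
        }
        return outputs;
    }

    public static void main(String[] args) {
        Tee tee = new Tee(2);                       // count fixed at construction
        List<List<String>> outputs = connect(tee);  // pipes connected up front
        tee.process("node 1");                      // only now does data flow
        System.out.println(outputs);
    }
}
```

The point of the sketch is the ordering: the output count is fixed when 
the task is constructed from its parameters, so the manager can query it 
and connect all pipes before the first entity is processed.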
