[osmosis-dev] Improving completeWays/completeRelations performance

Brett Henderson brett at bretth.com
Fri Mar 4 10:30:57 GMT 2011


Hi Frederik,

On Fri, Feb 18, 2011 at 7:57 PM, Frederik Ramm <frederik at remote.org> wrote:

> But I've been thinking: With the high performance of PBF reading, a
> two-pass operation should become possible. Simply read the input file twice,
> determining which objects to copy in pass 1, and actually copying them in
> pass 2. I'm just not sure how that could be made to fit in Osmosis. One way
> could be creating a special type of file, a "selection list", from a given
> entity stream. A new task "--write-selection-list" would dump the IDs of all
> nodes, ways, and relations that were either present or referenced in the
> entity stream:
>

Your suggestion of creating a selection list file to support a two-pass
approach seems quite reasonable, and would avoid making any changes to the
pipeline design.
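
To make the two-pass idea concrete, here is a rough sketch in Python rather
than Osmosis code. Everything in it is an assumption for illustration: the
(kind, id, refs) tuples, read_entities(), build_selection_list() and
copy_selected() are hypothetical names, and the "selection list" is just an
in-memory set rather than a file. It also only follows direct references, so
the nodes of a way that is pulled in by a relation would not be picked up.

```python
# Hypothetical in-memory entities: (kind, id, refs), where refs lists the
# (kind, id) pairs the entity references. This is not the Osmosis data model.
ENTITIES = [
    ("node", 1, []),
    ("node", 2, []),
    ("node", 3, []),
    ("way", 10, [("node", 1), ("node", 2)]),
    ("way", 11, [("node", 3)]),
]


def read_entities():
    """Stand-in for reading the input file; each call is one full pass."""
    return iter(ENTITIES)


def build_selection_list(entities, wanted):
    """Pass 1: collect the IDs of every entity that is wanted, plus the IDs
    of everything a wanted entity references directly."""
    selected = set()
    for kind, ident, refs in entities:
        if wanted(kind, ident, refs):
            selected.add((kind, ident))
            selected.update(refs)
    return selected


def copy_selected(entities, selected):
    """Pass 2: copy only the entities named in the selection list."""
    return [e for e in entities if (e[0], e[1]) in selected]


def wants_way_10(kind, ident, refs):
    """Example filter: select a single way."""
    return kind == "way" and ident == 10


selection = build_selection_list(read_entities(), wants_way_10)
result = copy_selected(read_entities(), selection)
```

The point of the split is that pass 1 only needs to keep IDs in memory, and
pass 2 is a plain filter over a second read of the same input.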

I've often thought about ways of allowing tasks to receive their input data
stream twice, but can't think of a way to do it without greatly complicating
the entire pipeline.  My latest idea was for Sink tasks (e.g. your --multi-bb
task) to throw an exception (e.g. ResendDataException) that could be caught
by Source tasks (e.g. --read-xml), triggering them to restart processing.  It
wouldn't be terribly difficult to add support for it to tasks like
--read-xml, but it starts to get messy in the pipeline control tasks like
--tee and --buffer.  I'd have to re-write the DataPostbox class which forms
the basis of all thread interaction to fix --buffer, and --tee would have to
track which downstream pipes had requested a restart then trigger a single
restart when all downstream pipes had finished with the first pass.

One thing I like about the exception approach to restarts is that it could
be rolled into existing tasks incrementally.  Any tasks not supporting it
would abort cleanly when they received the exception.
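
The exception idea can be sketched roughly as follows, again in Python and
not as Osmosis code. Only the name ResendDataException comes from the
description above; TwoPassSink, RestartableSource and LegacySource are
hypothetical, and raising the exception from complete() is an assumption
about where a sink would ask for the data again. Threading, --tee and
--buffer are left out entirely.

```python
class ResendDataException(Exception):
    """Raised by a sink that wants the whole stream delivered again."""


class TwoPassSink:
    """Hypothetical sink: pass 1 only counts entities, pass 2 stores them."""

    def __init__(self):
        self.pass_number = 1
        self.count = 0
        self.output = []

    def process(self, entity):
        if self.pass_number == 1:
            self.count += 1
        else:
            self.output.append(entity)

    def complete(self):
        # At the end of the first pass, ask the source to start over.
        if self.pass_number == 1:
            self.pass_number = 2
            raise ResendDataException()


class RestartableSource:
    """Source that supports restarts by catching the exception."""

    def __init__(self, data, sink):
        self.data = data
        self.sink = sink
        self.passes = 0

    def run(self):
        while True:
            self.passes += 1
            try:
                for entity in self.data:
                    self.sink.process(entity)
                self.sink.complete()
                return
            except ResendDataException:
                continue  # re-read the input from the beginning


class LegacySource:
    """Source without restart support: the exception simply propagates."""

    def __init__(self, data, sink):
        self.data = data
        self.sink = sink

    def run(self):
        for entity in self.data:
            self.sink.process(entity)
        self.sink.complete()


data = ["n1", "n2", "w1"]

sink = TwoPassSink()
source = RestartableSource(data, sink)
source.run()

aborted = False
try:
    LegacySource(data, TwoPassSink()).run()
except ResendDataException:
    aborted = True  # the run stops instead of silently doing one pass
```

Because the sink is called directly by the source here, the exception travels
back up the call stack to whichever source can handle it, and a source that
does not handle it just lets the run fail.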

I started experimenting with this idea this evening, but I've had to abandon
it because I don't have the time to see it through.

Brett