[OSM-dev] Question running osmosis (node-key-value and way-key-value at the same time)
Brett Henderson
brett at bretth.com
Wed Jun 24 13:25:13 BST 2009
Karl Newman wrote:
> On Tue, Jun 23, 2009 at 6:11 PM, Brett Henderson <brett at bretth.com
> <mailto:brett at bretth.com>> wrote:
>
> Karl Newman wrote:
>
>
> What's happening there is that the node-key-value and
> way-key-value are ANDed together (which would leave you with
> only ways which match your tags and are composed of nodes
> tagged place=city), and you want an OR instead. You were sort
> of on the right track with the pipes, but what you need to do
> is use the "tee" function and apply the node-key-value filter
> to one leg of the tee, and apply the way-key-value filter to
> the other leg of the tee, then use the "merge" function to
> join the results. It would look something like this:
>
> ./osmosis-0.31/bin/osmosis --read-xml file="planet.bz2" --tee
> outputCount=2 outPipe.0="nodes" outPipe.1="ways"
> --node-key-value keyValueList="place.city" inPipe.0="nodes"
> --way-key-value
> keyValueList="highway.motorway,highway.motorway_link,highway.motorway_junction,highway.trunk,highway.trunk_link"
> inPipe.0="ways" --merge --write-xml file="basemap.osm"
>
> I'm not sure if that will work exactly as written. You may
> need to add outPipe arguments to the node-key-value and
> way-key-value filters and then reference them as inPipe.0 and
> inPipe.1 arguments to the merge task.
>
> You may trigger a deadlock in this situation ... I've been waiting
> for somebody to try this out for a long time :-)
>
> While it's possible to construct a pipeline that tees a single
> dataset into multiple streams before merging them back together
> again, it is problematic from a thread synchronisation point of view
> because the same input thread is feeding two inputs of another
> thread. Using a --buffer task within both paths of the branch may
> help because it de-couples the threads somewhat with a buffer.
>
> The --read-xml task creates a thread which passes data into the
> --tee task. The --tee task doesn't create a thread, it just uses
> the existing thread to pass incoming data to all consumers. The
> --node-key-value and --way-key-value also use the existing thread
> to write to their destination which in both cases is the --merge
> task. The --merge task creates a new thread which reads the
> incoming data from both of its inputs, but both inputs are coming
> from a single thread (ie. the original --read-xml thread). The
> --merge thread may read from one input, then start waiting for a
> specific value on the other input and never receive it.
>
> But if it works let me know. I'm curious :-)
>
> Brett
>
>
> Hmm... I didn't look at the code too closely. I thought the tee
> created separate threads. I'm trying to see what might cause a
> deadlock--it looks like it would happen in DataPostBox if
> anywhere--but it's not obvious what might trigger it. I guess what
> could happen is if the merge task is trying to get entities from both
> pipelines to compare them, and there isn't anything available in one
> of the pipelines, that might deadlock it. I guess someone will have to
> try it and see!
Yep, DataPostbox is the point at which threads will deadlock. I think
it is the only class in the whole of osmosis that performs any thread
coordination ...
Thinking further, it is almost guaranteed to deadlock in this scenario.
The merge task requires data from both inputs to perform a comparison.
The --node-key-value task will produce data while nodes are
available on input, and the --way-key-value task will produce data while
ways are available on input. So once the node data has filled the
DataPostbox for one input of the merge task, there will still be no way
data available on the second input of the merge task and therefore the
merge task will block. The original --read-xml task thread will then
block because it is waiting for the full buffer of the first merge task
input DataPostbox to clear.
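
The blocking sequence described above can be sketched as a small simulation.
This is not osmosis code: Python's queue.Queue stands in for each merge input
buffer, the function name run_pipeline, the capacities and the entity counts
are all hypothetical, and the merge is reduced to "nodes sort before ways".
Timeouts are used only so that the sketch reports a stall instead of hanging.

```python
import queue
import threading


def run_pipeline(entities, capacity, timeout=0.5):
    """Simulate tee -> two filters -> merge; return (completed, merged)."""
    # Bounded queues stand in for the two merge inputs (an assumption,
    # not the real DataPostbox implementation).
    node_box = queue.Queue(maxsize=capacity)
    way_box = queue.Queue(maxsize=capacity)
    stalled = threading.Event()
    merged = []

    def reader():
        # A single thread plays --read-xml, --tee and both filters,
        # so a blocked put() on one input starves the other input too.
        try:
            for kind, ident in entities:
                box = node_box if kind == "node" else way_box
                box.put((kind, ident), timeout=timeout)
            node_box.put(None, timeout=timeout)  # end-of-stream markers
            way_box.put(None, timeout=timeout)
        except queue.Full:
            stalled.set()  # would block forever without the timeout

    def merger():
        # The merge needs the head of both inputs before emitting anything.
        try:
            node = node_box.get(timeout=timeout)
            way = way_box.get(timeout=timeout)
            while node is not None or way is not None:
                if node is not None:  # nodes sort before ways
                    merged.append(node)
                    node = node_box.get(timeout=timeout)
                else:
                    merged.append(way)
                    way = way_box.get(timeout=timeout)
        except queue.Empty:
            stalled.set()  # would block forever without the timeout

    threads = [threading.Thread(target=reader), threading.Thread(target=merger)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return (not stalled.is_set(), merged)


# Ten nodes followed by three ways, in the order a reader would emit them.
ENTITIES = [("node", i) for i in range(10)] + [("way", i) for i in range(3)]

if __name__ == "__main__":
    print("capacity 5 completes:", run_pipeline(ENTITIES, capacity=5)[0])
    print("capacity 100 completes:", run_pipeline(ENTITIES, capacity=100)[0])
```

In this sketch the run only completes when the node buffer can hold every
node that arrives before the first way. That is a property of the simulation
and has not been checked against osmosis itself.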
Brett