[osmosis-dev] calling DataPostbox.release() multiple times

Wed Jan 18 23:17:48 GMT 2012

On 2012-01-18 07:23, Brett Henderson wrote:
>
> On 16 January 2012 07:22, Richard Hansen <rhansen at bbn.com> wrote:
>
>> On 2012-01-15 07:34, Brett Henderson wrote:
>>
>>> You're correct in saying that Releasable classes can be released
>>> multiple times, however that is not typically how they're used
>>> in the Osmosis pipeline.
>>
>> Does it make sense to change the Releasable API definition to say
>> that release can only be called once?  Or might that break
>> multiple-sink stages like merge?
>
> I've updated the Releasable API definition to say that implementations
> should support release being called multiple times but that it isn't
> mandatory and cannot be relied on by clients.

Cool.  With that change, no other changes should be necessary.

> release() and outputRelease() have been designed to support re-use.  I
> don't think I've tested it, but I believe it will work.  The two
> variables you mention are initialized near the start of those methods
> and only read after the xxxxReleased flags have been set so their
> initial state isn't important.  However I can see that is confusing so
> I've modified their initial state to match their final state of true
> (I'll check it in shortly).

I agree with that change.  The only other thing I would suggest is a 
comment indicating that the class is designed to support reuse.  (Or are 
all tasks supposed to be reusable?)

>
> Your patch would probably work but I'd prefer to continue to support
> re-use rather than to use that variable for a second purpose.

Good point.  With your Releasable contract change, the patch is 
unnecessary anyway.

>
>> An alternative is to increase the burden on the users of
>> DataPostbox. They could be required to either ensure that the output
>> thread restarts or avoid calling DataPostbox.release() once the
>> output thread exits. For example, EntityBuffer.run() could set an
>> 'outputThreadIsRunning' boolean at the top and clear it at the end.
>> Then EntityBuffer.release() would be modified to only call
>> buffer.release() if outputThreadIsRunning is true.
>
> Race conditions between threads are tough to avoid, and I believe the
> above change would introduce one.  If the output thread hasn't entered
> EntityBuffer.run yet and the outputThreadIsRunning flag hasn't been set,
> but the calling thread encounters an early exception and calls
> EntityBuffer.release, the calling thread will think that the output
> thread isn't running and there is no need to perform release
> functionality.  This would then lead to the output thread blocking when
> it finally does start.

Yes, you are correct -- there is a race condition in that scheme. 
Concurrency is fun, isn't it?  :)

>
> The way the class is constructed now, each thread will synchronise at
> the initialize, complete and release methods.  If either thread skips
> straight to release

Oh, I didn't realize that was possible.  That makes it hard if not 
impossible to support both reuse and multiple calls to release().

> then the other thread detects that as an error
> condition and aborts appropriately, but in all cases both threads will
> synchronise around release before exiting.  The only reason to
> synchronise around release is to support re-use which may never be used
> in practice but it should work well.  We can't even rely on the
> initialize methods being called because it is possible for a thread to
> fail before that is called.
>
> I'd rather leave the class as it is designed now.  It's a tradeoff
> between supporting re-use and incurring the limitation that every call
> to release must be matched by a call to outputRelease in another
> thread.  I don't think it's possible to have it both ways.

I agree with your analysis.

Thanks for making the changes!

-Richard