[OSM-dev] [Tile-serving] mod_tile causes segfault on debian 7.0
Kai Krueger
kakrueger at gmail.com
Wed May 22 17:54:34 UTC 2013
On 05/22/2013 11:27 AM, Tom Hughes wrote:
> On 22/05/13 18:20, Kai Krueger wrote:
>
>> On 05/22/2013 11:04 AM, Tom Hughes wrote:
>>
>>> So if a new child is started, and multiple requests arrive more or
>>> less simultaneously to different threads in that process, then they
>>> will both try and allocate the stores array which means they will both
>>> be trying to manipulate the memory pool at the same time.
>>
>> The apache routines to manipulate memory pools should be thread safe, so
>> that part should be fine.
>
> That's not what the interwebs are telling me - do you have some
> documentation for that claim? Only I'm finding quotes like "Pools are
> explicitly thread unsafe".
Ouch. Looks like you are right. It says functions like apr_pool_create
are thread-safe, but that covers only the functions that create new
pools, not the general pool functions.
So that needs fixing, and the rest of mod_tile should probably be
checked to see whether those functions are used incorrectly anywhere
else.
I guess the upside is that we have possibly found the cause, can fix it
relatively easily, and don't have to go on a long debugging hunt.
Thanks.
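To make the pool problem concrete, here is a minimal sketch of a toy
bump allocator whose allocation call is serialised with a mutex. It is
not APR's implementation and not mod_tile code: toy_pool, toy_palloc,
run_allocs and the sizes are all hypothetical names and values chosen
for illustration. Without the lock, two threads could read the same
"used" offset and hand out overlapping memory.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define POOL_SIZE         65536  /* hypothetical pool capacity in bytes */
#define CHUNK             16     /* hypothetical allocation size */
#define ALLOCS_PER_THREAD 100
#define MAX_THREADS       64

/* Toy stand-in for a memory pool shared by several threads. */
struct toy_pool {
    unsigned char buf[POOL_SIZE];
    size_t used;
    pthread_mutex_t lock;
};

static struct toy_pool pool = { .used = 0, .lock = PTHREAD_MUTEX_INITIALIZER };

/* Hand out n bytes from the pool; the mutex keeps the read-modify-write
 * of p->used atomic with respect to other threads. */
void *toy_palloc(struct toy_pool *p, size_t n)
{
    void *mem = NULL;
    pthread_mutex_lock(&p->lock);
    if (n <= POOL_SIZE - p->used) {
        mem = p->buf + p->used;
        p->used += n;
    }
    pthread_mutex_unlock(&p->lock);
    return mem;
}

static void *alloc_worker(void *arg)
{
    struct toy_pool *p = arg;
    for (int i = 0; i < ALLOCS_PER_THREAD; i++) {
        void *mem = toy_palloc(p, CHUNK);
        assert(mem != NULL);
    }
    return NULL;
}

/* Reset the pool, let nthreads threads allocate concurrently, and
 * return how many bytes the pool thinks it handed out. */
size_t run_allocs(int nthreads)
{
    pthread_t tids[MAX_THREADS];
    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    pool.used = 0;
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tids[i], NULL, alloc_worker, &pool);
        assert(rc == 0);
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    return pool.used;
}
```

With the lock in place, run_allocs(8) accounts for every chunk exactly
once. Per-request pools avoid the lock altogether because only one
thread ever touches them.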
>
> That's why you have things like per-request pools - so that you can do
> allocations in request context without locking overheads as well as so
> you can clean up easily.
>
>> It does look like it is possible that multiple processes can allocate a
>> new storage array simultaneously, but that should "only" lead to memory
>> leak, rather than crashes. In that race, simply one of the threads wins
>> and gets to set the stores array and the other allocated arrays go
>> unused. As all allocations are equivalent, it shouldn't matter which
>> wins.
>>
>> That race should be fixable, by simply adding an explicit lock after the
>> stores==null check. As this only happens at process / thread
>> initialisation and all operations are fast, the performance impact of
>> that should be negligible.
>
> Can the stores array not just be allocated in the child_init hook?
I can't remember the details, but I believe the child_init hook did not
do what I wanted at all. I think it might again have been run only per
process and not per thread.
>
> That way it is only the apr_pool_userdata_get call that you are
> relying on to be thread safe - no idea if it is, but as it is reading
> things it is more likely to be.
With an explicit mutex after the stores == null check (and the
appropriate recheck inside the lock), apr_pool_userdata_get would also
be the only function that needs to be thread safe. However, if it is
not, then you would have to put a mutex around that call as well. As it
is called on every request, that would be less nice. On the other hand,
given the load we see on typical mod_tile installations, that shouldn't
be an issue either. In the benchmarks I did of the locking in the stats
collection, even at 10k tiles/s the per-request locking didn't seem to
have a significant effect.
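As a rough sketch of the "check, lock, recheck" idea, written with plain
pthreads and C11 atomics rather than the APR calls mod_tile would
actually use: stores, get_stores, run_race and NUM_STORES below are
hypothetical stand-ins, and the array of ints is only a placeholder for
the real storage backends. The unlocked first check uses an atomic load
so that it is not itself a data race.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

#define NUM_STORES  4    /* hypothetical size of the stores array */
#define MAX_THREADS 64

/* Hypothetical stand-in for the per-process stores array. */
static _Atomic(int *) stores = NULL;
static pthread_mutex_t stores_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int alloc_count = 0;  /* number of real allocations */

/* Return the shared array, allocating it at most once. */
int *get_stores(void)
{
    int *s = atomic_load_explicit(&stores, memory_order_acquire);
    if (s == NULL) {                        /* first check, no lock held */
        pthread_mutex_lock(&stores_lock);
        s = atomic_load_explicit(&stores, memory_order_relaxed);
        if (s == NULL) {                    /* recheck under the lock */
            s = calloc(NUM_STORES, sizeof *s);
            assert(s != NULL);
            atomic_fetch_add(&alloc_count, 1);
            atomic_store_explicit(&stores, s, memory_order_release);
        }
        pthread_mutex_unlock(&stores_lock);
    }
    return s;
}

static void *worker(void *arg)
{
    (void)arg;
    return get_stores();
}

/* Start nthreads threads that all ask for the array at about the same
 * time; return the total number of allocations performed so far. */
int run_race(int nthreads)
{
    pthread_t tids[MAX_THREADS];
    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tids[i], NULL, worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < nthreads; i++) {
        void *ret = NULL;
        pthread_join(tids[i], &ret);
        assert(ret == (void *)get_stores()); /* every thread saw the same array */
    }
    return atomic_load(&alloc_count);
}
```

Only the threads that arrive before the array exists ever take the
mutex, so the common per-request path stays lock-free. If the lookup
itself turns out not to be thread safe, the lock would have to move in
front of the first check instead.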
I'll try and fix this tonight and hopefully that will then indeed solve
the instability issues Sven and Andy have seen.
Kai
>
>
> Tom
>