Re: [PATCH 35/40] fscache: convert object to use workqueue instead of slow-work
From: Tejun Heo
Date: Tue Feb 16 2010 - 18:44:41 EST
Hello,
On 02/17/2010 03:05 AM, David Howells wrote:
> Tejun Heo <tj@xxxxxxxxxx> wrote:
>> So, you're saying...
>>
>> * There can be a lot of concurrent shallow dependency chains, so
>> deadlocks can't realistically be avoided by allowing a larger number of
>> threads in the pool.
>
> Yes. As long as you can queue one more op than you can have threads, you can
> get deadlock between the queue and the threads.
>
>> * Such occurrences would be common enough that the 'yield' path would
>> be essential in keeping the operation running smoothly.
>
> I've seen them a few times, usually under high pressure. I've got some evil
> test cases that try to read a few thousand sequences of files simultaneously.
Alright, thanks for clarifying. Yield it is then.
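To make the condition above concrete, here is a rough user-space model of it. This is only a sketch and nothing in it comes from the slow-work code: `struct op`, `enum op_state` and `pool_is_stuck()` are hypothetical names, and the model assumes each running op blocks on at most one other op.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical states for an operation in a fixed-size thread pool. */
enum op_state { OP_QUEUED, OP_RUNNING, OP_DONE };

struct op {
	enum op_state state;
	int waits_for;	/* index of the op this one blocks on, or -1 */
};

/*
 * Report a stall: every pool thread is occupied by an op that is
 * blocked on an op still sitting in the queue, so no thread is left
 * to run the queued op.  Simplification: an op waiting on another
 * running op is treated as able to make progress.
 */
static bool pool_is_stuck(const struct op *ops, int n_ops, int max_threads)
{
	int running = 0;

	assert(n_ops >= 0 && max_threads > 0);

	for (int i = 0; i < n_ops; i++) {
		int dep;

		if (ops[i].state != OP_RUNNING)
			continue;
		running++;
		dep = ops[i].waits_for;
		if (dep < 0 || ops[dep].state != OP_QUEUED)
			return false;	/* this thread can still progress */
	}
	return running == max_threads;
}
```

With two threads and three ops, where both running ops wait on the third, queued one, the model reports a stall. Raising the limit to three threads clears it, but a fourth op queued behind three blocked ones recreates the same situation, which is why a larger pool alone doesn't help.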
>> One problem I have with the slow work yield-on-queue mechanism is that
>> it may fit fscache well but generally doesn't make much sense. What
>> would make more sense would be yield-under-pressure (ie. thread limit
>> reached or about to be reached and new work queued). Would that work
>> for fscache?
>
> I'm not sure what you mean. Slow-work does do yield-under-pressure.
> slow_work_sleep_till_thread_needed() adds the waiting object to a waitqueue by
> which it can be interrupted by slow-work when slow-work wants its thread back.
>
> If the object execution is busy doing something rather than waiting around,
> there's no reason to yield the thread back.
The waiting workers are woken up in slow_work_enqueue() whenever new
work is enqueued, regardless of how many workers are currently in use,
right? So, it ends up yielding on any queue activity rather than only
under resource pressure.
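The distinction between the two policies can be sketched as a pair of predicates. This is a simplified user-space illustration, not the slow-work implementation: `struct pool_state` and both function names are made up for the example, and the real wake-up logic may track more state than this.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of a thread pool; not the slow-work structures. */
struct pool_state {
	int max_threads;	/* configured thread limit */
	int busy_threads;	/* threads occupied by a work item, sleeping or not */
	int queued;		/* work items waiting for a thread */
};

/* Yield-on-queue: ask sleepers to yield whenever anything is queued. */
static bool wake_on_queue(const struct pool_state *p)
{
	assert(p->busy_threads <= p->max_threads);
	return p->queued > 0;
}

/*
 * Yield-under-pressure: ask sleepers to yield only when work is queued
 * and no free thread is left to run it.
 */
static bool wake_under_pressure(const struct pool_state *p)
{
	assert(p->busy_threads <= p->max_threads);
	return p->queued > 0 && p->busy_threads >= p->max_threads;
}
```

For a pool with a limit of four threads, one busy thread and one queued item, the first predicate is true and the second is false. The two only agree once all four threads are occupied.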
>> It might but I wasn't sure whether this could actually be a problem
>> for what fscache is doing. Again, I just don't know what kind of
>> workload the code is expecting. The reason why I thought it might not
>> was because the default concurrency level was low.
>
> You can end up serialising together all the I/O being done by NFS, AFS and
> anything else using FS-Cache.
Yeap, if the workload can be highly parallel, SINGLE_CPU won't fare
very well. Will take care of that.
>> Alright, so it can be very high. This is slightly off topic but isn't
>> the know a bit too low level to export? It will adjust concurrency
>> level of the whole slow-work facility which can be used by any number
>> of users.
>
> 'The know'?
Ah... sorry about that, the knob.
> One thing I was trying to do was avoid the workqueue problem of
> having a static pool of threads per workqueue. As CPU counts go up,
> that starts eating some serious resources. What I was trying for
> was one pool that was dynamically sized.
Yeah, sure. I was just thinking that it would be nice if the pool
could be dynamically sized without the user specifying the concurrency
level directly. I have no idea whether that would be possible or not.
> Tuning such a pool is tricky, however; you have a set of conflicting usage
> patterns - hence the two thread priorities (slow and very slow).
...
>> BTW, if we solve the yielding problem (I think we can retain the
>> original behavior by implementing it inside fscache) and the
>> reentrance issue, do you see any other obstacles in switching to cmwq?
>
> I don't think so. I'm not sure how you retain the original yield
> behaviour by doing it inside FS-Cache - slow-work knows about the
> congestion, not FS-Cache.
I'm thinking about how to solve it but it will be solved one way or
the other. If it can't be done in fscache, I'll add some general hook
in wq code to tell fscache about congestion conditions. I was curious
whether you see any issues in converting to cmwq other than the two
discussed so far. If not, I'll go ahead and prep the next round.
Thanks.
--
tejun