Re: [PATCH 1/1] kasan: fix livelock in qlist_move_cache

From: Dmitry Vyukov
Date: Tue Nov 28 2017 - 04:27:57 EST


On Tue, Nov 28, 2017 at 10:17 AM, Zhouyi Zhou <zhouzhouyi@xxxxxxxxx> wrote:
> Hi,
> Imagine all QUARANTINE_BATCHES elements of the global_quarantine
> array are already of size 4MB + 1MB. Now a new call to quarantine_put
> is invoked: since quarantine_tail cannot advance past quarantine_head,
> one of the elements grows to 4MB + 1MB + 1MB, and so on.


I see what you mean. Does it really happen in your case? What's the
maximum batch size that you get during your workload?

I have always wondered why we don't drain the quarantine right in
quarantine_put when we overflow it. We already take quarantine_lock
there, and calling cache_free should be fine in that context, since
user code already does that.
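
Something along these lines, inside quarantine_put() right after the
tail-advance logic (an untested sketch just to illustrate the idea; it
reuses the existing qlist_move_all/qlist_free_all helpers and assumes a
local "struct qlist_head to_free = QLIST_INIT;", and it glosses over
the exact size accounting):

	if (unlikely(READ_ONCE(quarantine_size) >
		     READ_ONCE(quarantine_max_size))) {
		/* Steal the oldest batch while still holding quarantine_lock. */
		qlist_move_all(&global_quarantine[quarantine_head], &to_free);
		WRITE_ONCE(quarantine_size, quarantine_size - to_free.bytes);
		quarantine_head++;
		if (quarantine_head == QUARANTINE_BATCHES)
			quarantine_head = 0;
	}
	spin_unlock(&quarantine_lock);

	/* Free the stolen batch after dropping the lock, still in the
	 * caller's (user free) context. */
	if (!qlist_empty(&to_free))
		qlist_free_all(&to_free, NULL);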



> On Tue, Nov 28, 2017 at 4:58 PM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>> On Tue, Nov 28, 2017 at 9:33 AM, Zhouyi Zhou <zhouzhouyi@xxxxxxxxx> wrote:
>>> Hi,
>>> Please take a look at the quarantine_put function: I don't think the
>>> following code limits the batch size to quarantine_batch_size. It only
>>> advances quarantine_tail after qlist_move_all.
>>>
>>>         qlist_move_all(q, &temp);
>>>
>>>         spin_lock(&quarantine_lock);
>>>         WRITE_ONCE(quarantine_size, quarantine_size + temp.bytes);
>>>         qlist_move_all(&temp, &global_quarantine[quarantine_tail]);
>>>         if (global_quarantine[quarantine_tail].bytes >=
>>>                         READ_ONCE(quarantine_batch_size)) {
>>>                 int new_tail;
>>>
>>>                 new_tail = quarantine_tail + 1;
>>>                 if (new_tail == QUARANTINE_BATCHES)
>>>                         new_tail = 0;
>>>                 if (new_tail != quarantine_head)
>>>                         quarantine_tail = new_tail;
>>
>>
>> As far as I can see, this code can exceed the global quarantine batch
>> size by at most one per-cpu batch. A per-cpu batch is capped at 1MB,
>> so the max global batch size will be 4MB + 1MB, which does not
>> radically change the situation.
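>>
>> (A quick back-of-the-envelope check, using the numbers further down in
>> this thread: 128GB of RAM gives a quarantine of roughly 128GB / 32 = 4GB,
>> split into 1024 batches of ~4MB each, and a per-cpu batch is capped at
>> 1MB, hence the 4MB + 1MB figure above.)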
>>
>>
>>> On Tue, Nov 28, 2017 at 4:12 PM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>>>> On Tue, Nov 28, 2017 at 9:00 AM, Zhouyi Zhou <zhouzhouyi@xxxxxxxxx> wrote:
>>>>> Thanks for reviewing.
>>>>> My machine has 128G of RAM and runs many KVM virtual machines.
>>>>> libvirtd always reports "internal error: received hangup / error
>>>>> event on socket" under heavy memory load.
>>>>> Then I ran perf top -g, and qlist_move_cache consumed 100% CPU for
>>>>> several minutes.
>>>>
>>>> For 128GB of RAM, the batch size is 4MB. Processing such a batch should
>>>> not take more than a few ms. So I am still struggling to understand
>>>> how/why your change helps and why there are issues in the first place...
>>>>
>>>>
>>>>
>>>>> On Tue, Nov 28, 2017 at 3:45 PM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>>>>>> On Tue, Nov 28, 2017 at 5:05 AM, Zhouyi Zhou <zhouzhouyi@xxxxxxxxx> wrote:
>>>>>>> When there is a huge amount of quarantined cache allocations in the
>>>>>>> system, the number of entries in global_quarantine[i] will be large.
>>>>>>> Meanwhile, there is no resched point in the while loop of
>>>>>>> qlist_move_cache, which runs with quarantine_lock held. As a result,
>>>>>>> some userspace programs, for example libvirt, will complain.
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> The QUARANTINE_BATCHES thing was supposed to fix this problem, see
>>>>>> quarantine_remove_cache() function.
>>>>>> What is the amount of RAM and number of CPUs in your system?
>>>>>> If the system has 4GB of RAM, the quarantine size is 128MB, and that's
>>>>>> split into 1024 batches, so the batch size is 128KB. Even if a batch is
>>>>>> filled with the smallest objects of size 32, that's only 4K objects.
>>>>>> And there is a cond_resched() between processing of every batch.
>>>>>> I don't understand why it causes problems in your setup. We use KASAN
>>>>>> extremely heavily on hundreds of machines 24x7 and we have not seen a
>>>>>> single report from this code...
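>>>>>>
>>>>>> For reference, the batch loop in quarantine_remove_cache() looks
>>>>>> roughly like this (paraphrased from mm/kasan/quarantine.c, details
>>>>>> may differ slightly):
>>>>>>
>>>>>>         spin_lock_irqsave(&quarantine_lock, flags);
>>>>>>         for (i = 0; i < QUARANTINE_BATCHES; i++) {
>>>>>>                 if (qlist_empty(&global_quarantine[i]))
>>>>>>                         continue;
>>>>>>                 qlist_move_cache(&global_quarantine[i], &to_free, cache);
>>>>>>                 /* Scanning the whole quarantine can take a while. */
>>>>>>                 spin_unlock_irqrestore(&quarantine_lock, flags);
>>>>>>                 cond_resched();
>>>>>>                 spin_lock_irqsave(&quarantine_lock, flags);
>>>>>>         }
>>>>>>         spin_unlock_irqrestore(&quarantine_lock, flags);
>>>>>>
>>>>>>         qlist_free_all(&to_free, cache);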
>>>>>>
>>>>>>
>>>>>>> On Tue, Nov 28, 2017 at 12:04 PM, <zhouzhouyi@xxxxxxxxx> wrote:
>>>>>>>> From: Zhouyi Zhou <zhouzhouyi@xxxxxxxxx>
>>>>>>>>
>>>>>>>> This patch fixes a livelock by conditionally releasing the CPU to
>>>>>>>> give others a chance to run.
>>>>>>>>
>>>>>>>> Tested on x86_64.
>>>>>>>> Signed-off-by: Zhouyi Zhou <zhouzhouyi@xxxxxxxxx>
>>>>>>>> ---
>>>>>>>> mm/kasan/quarantine.c | 12 +++++++++++-
>>>>>>>> 1 file changed, 11 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
>>>>>>>> index 3a8ddf8..33eeff4 100644
>>>>>>>> --- a/mm/kasan/quarantine.c
>>>>>>>> +++ b/mm/kasan/quarantine.c
>>>>>>>> @@ -265,10 +265,13 @@ static void qlist_move_cache(struct qlist_head *from,
>>>>>>>>                                    struct kmem_cache *cache)
>>>>>>>>  {
>>>>>>>>         struct qlist_node *curr;
>>>>>>>> +       struct qlist_head tmp_head;
>>>>>>>> +       unsigned long flags;
>>>>>>>>
>>>>>>>>         if (unlikely(qlist_empty(from)))
>>>>>>>>                 return;
>>>>>>>>
>>>>>>>> +       qlist_init(&tmp_head);
>>>>>>>>         curr = from->head;
>>>>>>>>         qlist_init(from);
>>>>>>>>         while (curr) {
>>>>>>>> @@ -278,10 +281,17 @@ static void qlist_move_cache(struct qlist_head *from,
>>>>>>>>                 if (obj_cache == cache)
>>>>>>>>                         qlist_put(to, curr, obj_cache->size);
>>>>>>>>                 else
>>>>>>>> -                       qlist_put(from, curr, obj_cache->size);
>>>>>>>> +                       qlist_put(&tmp_head, curr, obj_cache->size);
>>>>>>>>
>>>>>>>>                 curr = next;
>>>>>>>> +
>>>>>>>> +               if (need_resched()) {
>>>>>>>> +                       spin_unlock_irqrestore(&quarantine_lock, flags);
>>>>>>>> +                       cond_resched();
>>>>>>>> +                       spin_lock_irqsave(&quarantine_lock, flags);
>>>>>>>> +               }
>>>>>>>>         }
>>>>>>>> +       qlist_move_all(&tmp_head, from);
>>>>>>>>  }
>>>>>>>>
>>>>>>>>  static void per_cpu_remove_cache(void *arg)
>>>>>>>> --
>>>>>>>> 2.1.4
>>>>>>>>
>>>>>>>