Re: [PATCH 1/1] mm/page_alloc: add scheduling point to free_unref_page_list

From: wangjianxing
Date: Wed Mar 09 2022 - 21:51:20 EST


spin_lock() calls preempt_disable(), and interrupt context goes through __irq_enter()/local_bh_disable(), which likewise add to the preempt count with an offset.
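
For reference, this is roughly how the preempt count is biased in each case, as I read include/linux/preempt.h and kernel/softirq.c from this era (shift values from memory, worth double-checking):

	/* preempt_count() is a bitfield with a per-context offset: */
	#define PREEMPT_OFFSET	(1UL << 0)	/* preempt_disable(), spin_lock() */
	#define SOFTIRQ_OFFSET	(1UL << 8)	/* softirq processing */
	#define HARDIRQ_OFFSET	(1UL << 16)	/* hardirq processing */

	/* spin_lock()        -> preempt_disable() -> preempt_count_add(1)            */
	/* __irq_enter()      -> preempt_count_add(HARDIRQ_OFFSET)                    */
	/* local_bh_disable() -> adds SOFTIRQ_DISABLE_OFFSET (2 * SOFTIRQ_OFFSET)     */

So in any of these contexts preempt_count() is non-zero.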

cond_resched() first checks whether preempt_count == 0, so it won't schedule in either of those contexts.

Is this right?
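
For what it's worth, cond_resched() on a non-preemptible kernel expands to roughly the following (condensed from include/linux/sched.h; the helper names are approximate for this kernel version, so treat this as a sketch rather than verbatim source):

	#define cond_resched() ({				\
		__might_resched(__FILE__, __LINE__, 0);		\
		_cond_resched();				\
	})

	/* _cond_resched() ends up in __cond_resched(), which only calls
	 * the scheduler when should_resched(0) is true, i.e. when
	 * preempt_count() is exactly zero and a reschedule is pending;
	 * under spin_lock() or in IRQ context it never schedules. */

(With CONFIG_DEBUG_ATOMIC_SLEEP, the might-sleep check would also warn in those contexts rather than silently doing nothing.)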


Alternatively, could we add a condition so that cond_resched() is not called from interrupt context or under spin_lock()?

+	if (preemptible())
+		cond_resched();
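
For context, preemptible() is defined in include/linux/preempt.h as:

	#ifdef CONFIG_PREEMPT_COUNT
	#define preemptible()	(preempt_count() == 0 && !irqs_disabled())
	#else
	#define preemptible()	0	/* preempt count not tracked */
	#endif

One caveat with the guard above: on !CONFIG_PREEMPT_COUNT kernels preemptible() is hard-coded to 0, so the cond_resched() would never run there, even in process context.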

On 03/10/2022 09:05 AM, Andrew Morton wrote:
> On Tue, 8 Mar 2022 16:19:33 +0000 Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
>> On Tue, Mar 01, 2022 at 08:38:25PM -0500, wangjianxing wrote:
>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>> index 3589febc6..1b96421c8 100644
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -3479,6 +3479,9 @@ void free_unref_page_list(struct list_head *list)
>>>  		 */
>>>  		if (++batch_count == SWAP_CLUSTER_MAX) {
>>>  			local_unlock_irqrestore(&pagesets.lock, flags);
>>> +
>>> +			cond_resched();
>> This isn't safe.  This path can be called from interrupt context
>> (otherwise we'd be using local_unlock_irq() instead of irqrestore()).
>> What a shame it is that we don't document our interfaces :(
>
> I can't immediately find such callers, but I could imagine
> put_pages_list() (which didn't document its interface this way either)
> being called from IRQ.
>
> And drivers/iommu/dma-iommu.c:fq_ring_free() calls put_pages_list()
> from inside spin_lock().
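
For anyone following along, the call site Andrew mentions looks roughly like this (condensed from memory of drivers/iommu/dma-iommu.c around v5.17, details elided):

	static void fq_flush_timeout(struct timer_list *t)	/* timer callback: softirq context */
	{
		...
		for_each_possible_cpu(cpu) {
			struct iova_fq *fq = per_cpu_ptr(cookie->fq, cpu);

			spin_lock_irqsave(&fq->lock, flags);
			fq_ring_free(cookie, fq);	/* -> put_pages_list() -> free_unref_page_list() */
			spin_unlock_irqrestore(&fq->lock, flags);
		}
	}

So that path both holds a spinlock with interrupts disabled and runs from a timer callback, which illustrates both concerns above.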