Re: [PATCH v3 3/5] mm/hugetlb: fix getting refcount 0 page in hugetlb_fault()

From: Hugh Dickins
Date: Tue Sep 30 2014 - 00:54:11 EST


On Mon, 15 Sep 2014, Naoya Horiguchi wrote:
> When running the test which causes the race as shown in the previous patch,
> we can hit the BUG "get_page() on refcount 0 page" in hugetlb_fault().

Two minor comments...

> @@ -3192,22 +3208,19 @@ int hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> * Note that locking order is always pagecache_page -> page,
> * so no worry about deadlock.

That sentence of the comment is stale and should be deleted,
now that you're only doing a trylock_page(page) here.

>  out_mutex:
>  	mutex_unlock(&htlb_fault_mutex_table[hash]);
> +	if (need_wait_lock)
> +		wait_on_page_locked(page);
>  	return ret;
>  }

It will be hard to trigger any problem from this (I guess it would
need memory hotremove), but you ought really to hold a reference to
page while doing a wait_on_page_locked(page).
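
To illustrate the second comment, here is a minimal user-space sketch of
the suggested ordering: take a reference, wait for the page lock to be
released, and only then drop the reference. It is not the patch's code.
struct page and the helpers below are toy stand-ins that only borrow the
kernel names, fault_tail() is a hypothetical function, and the model is
single-threaded, so the "wait" simply pretends the lock holder finished.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the kernel's struct page (assumption, not the real layout). */
struct page {
	int refcount;	/* models the page reference count */
	bool locked;	/* models the PG_locked bit */
};

/* Models get_page(): taking a reference on a refcount 0 page is a bug. */
static void get_page(struct page *page)
{
	assert(page->refcount > 0);
	page->refcount++;
}

/* Models put_page(): drop one reference. */
static void put_page(struct page *page)
{
	assert(page->refcount > 0);
	page->refcount--;
}

/* Models trylock_page(): never blocks, returns false if already locked. */
static bool trylock_page(struct page *page)
{
	if (page->locked)
		return false;
	page->locked = true;
	return true;
}

static void unlock_page(struct page *page)
{
	page->locked = false;
}

/*
 * Models wait_on_page_locked(). The assert encodes the review comment:
 * the caller must hold a reference so the page cannot go away while we
 * wait. Clearing the flag here stands in for the lock holder unlocking.
 */
static void wait_on_page_locked(struct page *page)
{
	assert(page->refcount > 0);
	page->locked = false;
}

/* Hypothetical tail of a fault path; returns 1 if it had to wait. */
int fault_tail(struct page *page)
{
	int need_wait_lock = 0;

	get_page(page);		/* pin the page before any lock is dropped */
	if (!trylock_page(page))
		need_wait_lock = 1;
	else
		unlock_page(page);	/* the real work would happen before this */

	/* The real code would drop the page table lock and fault mutex here. */

	if (need_wait_lock)
		wait_on_page_locked(page);
	put_page(page);		/* release the pin only after waiting */
	return need_wait_lock;
}
```

The point of the ordering is that put_page() comes after
wait_on_page_locked(), so the waiter never touches a page whose last
reference may already have been dropped.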