Re: [PATCH V2] mm/gup: Clear the LRU flag of a page before adding to LRU batch

From: Ge Yang
Date: Tue Jul 30 2024 - 06:03:13 EST




On 2024/7/30 17:58, David Hildenbrand wrote:
On 30.07.24 11:56, Ge Yang wrote:


On 2024/7/30 17:41, David Hildenbrand wrote:
On 30.07.24 11:36, Ge Yang wrote:


On 2024/7/30 15:45, David Hildenbrand wrote:
Looking at this in more detail, I wonder if we can turn that to

if (!folio_test_clear_lru(folio))
        return;
folio_get(folio);

In all cases? The caller must hold a reference, so this should be
fine.
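
As a rough userspace illustration of the pattern proposed above (not kernel
code): the sketch below models the LRU flag as one bit in an atomic word and
takes a reference only after winning the test-and-clear. The names struct obj,
FLAG_LRU, obj_test_clear_lru() and obj_claim_for_batch() are hypothetical
stand-ins, and the caller is assumed to already hold a reference, as stated
above.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bit standing in for the LRU flag. */
#define FLAG_LRU 0x1u

/* Hypothetical userspace stand-in for a folio; not the kernel structure. */
struct obj {
	atomic_uint flags;
	atomic_int refcount;
};

/* Atomically clear the flag and report whether it was set beforehand. */
static bool obj_test_clear_lru(struct obj *o)
{
	unsigned int old = atomic_fetch_and(&o->flags, ~FLAG_LRU);

	return (old & FLAG_LRU) != 0;
}

/*
 * Claim the object for a batch: bail out if the flag was already clear,
 * otherwise take an extra reference. Assumes the caller already holds a
 * reference, so the unconditional increment cannot start from zero.
 */
static bool obj_claim_for_batch(struct obj *o)
{
	if (!obj_test_clear_lru(o))
		return false;
	atomic_fetch_add(&o->refcount, 1);
	return true;
}
```

Only the first caller to clear the flag gets the extra reference; a second
call on the same object returns false and leaves the count unchanged.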


It seems that the caller madvise_free_pte_range(), which calls
folio_mark_lazyfree(), doesn't hold a reference on the folio.


If that were the case and the folio could get freed concurrently,
the folio_get(folio) would be completely broken.

In madvise_free_pte_range() we hold the PTL, so the folio cannot get
freed concurrently.


Right.

folio_get() is only allowed when we are sure the folio cannot get freed
concurrently, because we know there is a reference that cannot go away.
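
To make the distinction concrete, here is a minimal userspace sketch using C11
atomics. It is only an analogy for the rule above, not the kernel
implementation: struct obj, obj_get(), obj_try_get() and obj_put_testzero()
are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a refcounted object; not struct folio. */
struct obj {
	atomic_int refcount;
};

/*
 * Unconditional get: only valid when the caller knows that some reference
 * (or a lock that pins the object) keeps the count above zero.
 */
static void obj_get(struct obj *o)
{
	int old = atomic_fetch_add(&o->refcount, 1);

	/* A 0 -> 1 transition would revive an object that is being freed. */
	assert(old > 0);
}

/* Conditional get: take a reference only if the count is still non-zero. */
static bool obj_try_get(struct obj *o)
{
	int old = atomic_load(&o->refcount);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop one reference; returns true if it was the last one. */
static bool obj_put_testzero(struct obj *o)
{
	return atomic_fetch_sub(&o->refcount, 1) == 1;
}
```

In this model, obj_get() is the cheaper operation but relies entirely on the
caller's guarantee, while obj_try_get() is what one would need if the count
could really drop to zero underneath the caller.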



When cpu0 runs folio_activate() and cpu1 runs folio_put() concurrently,
a possible bad scenario would look like:

cpu0                                           cpu1

                                               folio_put_testzero(folio)
if (!folio_test_clear_lru(folio)) // Seems folio shouldn't be accessed
        return;
folio_get(folio);
                                               __folio_put(folio)
                                               __folio_clear_lru(folio)


Seems we should use folio_try_get(folio) instead of folio_get(folio).

In which case is folio_activate() called without the PTL on a mapped
page or without a raised refcount?


No such case has been found. But folio_put() can be run at any time, so
folio_activate() may access a folio with a reference count of 0.

If you can't find such a case then nothing is broken and no switch to folio_try_get() is required.


Ok, thanks.