Re: [PATCH 08/17] mm: page_alloc: Use word-based accesses for get/set pageblock bitmaps

From: Vlastimil Babka
Date: Tue May 06 2014 - 10:42:28 EST


On 05/06/2014 11:13 AM, Mel Gorman wrote:
On Mon, May 05, 2014 at 02:40:38PM +0200, Vlastimil Babka wrote:
@@ -62,11 +65,35 @@ extern int pageblock_order;
/* Forward declaration */
struct page;

+unsigned long get_pageblock_flags_mask(struct page *page,
+ unsigned long end_bitidx,
+ unsigned long nr_flag_bits,
+ unsigned long mask);
+void set_pageblock_flags_mask(struct page *page,
+ unsigned long flags,
+ unsigned long end_bitidx,
+ unsigned long nr_flag_bits,
+ unsigned long mask);
+

The nr_flag_bits parameter is not used anymore and can be dropped.


Fixed

/* Declarations for getting and setting flags. See mm/page_alloc.c */
-unsigned long get_pageblock_flags_group(struct page *page,
- int start_bitidx, int end_bitidx);
-void set_pageblock_flags_group(struct page *page, unsigned long flags,
- int start_bitidx, int end_bitidx);
+static inline unsigned long get_pageblock_flags_group(struct page *page,
+ int start_bitidx, int end_bitidx)
+{
+ unsigned long nr_flag_bits = end_bitidx - start_bitidx + 1;
+ unsigned long mask = (1 << nr_flag_bits) - 1;
+
+ return get_pageblock_flags_mask(page, end_bitidx, nr_flag_bits, mask);
+}
+
+static inline void set_pageblock_flags_group(struct page *page,
+ unsigned long flags,
+ int start_bitidx, int end_bitidx)
+{
+ unsigned long nr_flag_bits = end_bitidx - start_bitidx + 1;
+ unsigned long mask = (1 << nr_flag_bits) - 1;
+
+ set_pageblock_flags_mask(page, flags, end_bitidx, nr_flag_bits, mask);
+}

#ifdef CONFIG_COMPACTION
#define get_pageblock_skip(page) \
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index dc123ff..f393b0e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6032,53 +6032,64 @@ static inline int pfn_to_bitidx(struct zone *zone, unsigned long pfn)
* @end_bitidx: The last bit of interest
* returns pageblock_bits flags
*/
-unsigned long get_pageblock_flags_group(struct page *page,
- int start_bitidx, int end_bitidx)
+unsigned long get_pageblock_flags_mask(struct page *page,
+ unsigned long end_bitidx,
+ unsigned long nr_flag_bits,
+ unsigned long mask)
{
struct zone *zone;
unsigned long *bitmap;
- unsigned long pfn, bitidx;
- unsigned long flags = 0;
- unsigned long value = 1;
+ unsigned long pfn, bitidx, word_bitidx;
+ unsigned long word;

zone = page_zone(page);
pfn = page_to_pfn(page);
bitmap = get_pageblock_bitmap(zone, pfn);
bitidx = pfn_to_bitidx(zone, pfn);
+ word_bitidx = bitidx / BITS_PER_LONG;
+ bitidx &= (BITS_PER_LONG-1);

- for (; start_bitidx <= end_bitidx; start_bitidx++, value <<= 1)
- if (test_bit(bitidx + start_bitidx, bitmap))
- flags |= value;
-
- return flags;
+ word = bitmap[word_bitidx];

I wonder if on some architecture this may result in inconsistent
word when racing with set(), i.e. cmpxchg? We need consistency at
least on the granularity of byte to prevent the problem with bogus
migratetype values being read.
fix:

The number of bits align on the byte boundary so I do not think there is
a problem there. There is a BUILD_BUG_ON check in set_pageblock_flags_mask
in case this changes so it can be revisited if necessary.

I was wondering about hardware guarantees in that case (e.g. consistency at least on the granularity of byte when a simple memory read races with write) but after some discussion in the office I understand that hardware without such guarantees wouldn't be able to run Linux anyway :)

Still I wonder if ACCESS_ONCE would be safer in the 'word' variable assignment to protect against compiler trying to be too smart?

Anyway with the nr_flag_bits removed:

Acked-by: Vlastimil Babka <vbabka@xxxxxxx>

+ bitidx += end_bitidx;
+ return (word >> (BITS_PER_LONG - bitidx - 1)) & mask;

Yes that looks correct to me, bits don't seem to overlap anymore.


Thanks.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/