Re: [PATCH] mm/vmscan: fix highidx argument type

From: Vlastimil Babka
Date: Fri Jan 16 2015 - 13:54:00 EST


On 01/16/2015 08:07 AM, Michael S. Tsirkin wrote:
> On Thu, Jan 15, 2015 at 02:49:20PM -0800, Andrew Morton wrote:
>> On Fri, 16 Jan 2015 00:18:12 +0200 "Michael S. Tsirkin" <mst@xxxxxxxxxx> wrote:
>>
>> > for_each_zone_zonelist_nodemask wants an enum zone_type
>> > argument, but is passed gfp_t:
>> >
>> > mm/vmscan.c:2658:9: expected int enum zone_type [signed] highest_zoneidx
>> > mm/vmscan.c:2658:9: got restricted gfp_t [usertype] gfp_mask
>> > mm/vmscan.c:2658:9: warning: incorrect type in argument 2 (different base types)
>> > mm/vmscan.c:2658:9: expected int enum zone_type [signed] highest_zoneidx
>> > mm/vmscan.c:2658:9: got restricted gfp_t [usertype] gfp_mask
>>
>> Which tool emitted these warnings?
>
> Oh, sorry.
> It's sparse.
>
>> > convert argument to the correct type.
>> >
>> > ...
>> >
>> > --- a/mm/vmscan.c
>> > +++ b/mm/vmscan.c
>> > @@ -2656,7 +2656,7 @@ static bool throttle_direct_reclaim(gfp_t gfp_mask, struct zonelist *zonelist,
>> > * should make reasonable progress.
>> > */
>> > for_each_zone_zonelist_nodemask(zone, z, zonelist,
>> > - gfp_mask, nodemask) {
>> > + gfp_zone(gfp_mask), nodemask) {
>> > if (zone_idx(zone) > ZONE_NORMAL)
>> > continue;
>>
>> hm, I wonder what the runtime effects are.

So this was introduced by 675becce15f "mm: vmscan: do not throttle based on
pfmemalloc reserves if node has no ZONE_NORMAL" in 3.15. AFAICS gfp_mask >=
gfp_zone(gfp_mask), so the high_zoneidx will be higher than it should, and
next_zones_zonelist() won't filter the higher-than-wanted zones as it should.
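
To make that concrete, here is a toy userspace model, not kernel code. The
toy_* names and the flag bit values are made up for illustration (the real
gfp_t layout and gfp_zone() differ in detail); only the idea of "visit zones
with index <= highest_zoneidx" is taken from the discussion above.

```c
#include <stddef.h>

/* Toy zone ordering, assuming a config that has all of these zones. */
enum toy_zone { TOY_ZONE_DMA, TOY_ZONE_DMA32, TOY_ZONE_NORMAL, TOY_ZONE_MOVABLE };

/* Hypothetical flag bits, chosen only so that a mask is a "large" number. */
#define TOY_GFP_DMA	0x01u
#define TOY_GFP_DMA32	0x04u
#define TOY_GFP_WAIT	0x10u
#define TOY_GFP_IO	0x40u
#define TOY_GFP_FS	0x80u

/* Simplified stand-in for gfp_zone(): map zone modifier bits to a zone index. */
static enum toy_zone toy_gfp_zone(unsigned int mask)
{
	if (mask & TOY_GFP_DMA)
		return TOY_ZONE_DMA;
	if (mask & TOY_GFP_DMA32)
		return TOY_ZONE_DMA32;
	return TOY_ZONE_NORMAL;
}

/* Count the zones a walk would visit: those with index <= highest_zoneidx. */
static size_t toy_zones_visited(const enum toy_zone *zones, size_t n,
				unsigned int highest_zoneidx)
{
	size_t i, visited = 0;

	for (i = 0; i < n; i++)
		if ((unsigned int)zones[i] <= highest_zoneidx)
			visited++;
	return visited;
}
```

With a DMA32 request, passing toy_gfp_zone(mask) limits the walk to the DMA32
and DMA zones, while passing the raw mask is a number far above any zone index,
so no zone is filtered out.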

I guess the runtime effect is that allocations for zone_type < NORMAL, i.e.
DMA32 or DMA, can now wrongly choose a NUMA node without such zones, for
checking pfmemalloc reserves and throttling. Which means the throttling can be
ineffective, or it could also throttle without actually needing to, if the wrong
zone has lower reserves? Mel?

>> The throttle_direct_reclaim() comment isn't really accurate, is it?
>> "Throttle direct reclaimers if backing storage is backed by the
>> network". The code is applicable to all types of backing, but was
>> added to address problems which are mainly observed with network
>> backing?

I guess. I also don't see any code restricting this just for network.

>
>
> As far as I can tell, yes. It would seem that it can cause
> deadlocks in theory. Cc stable on the grounds that it's obvious?

I don't think this mistake can introduce deadlocks on its own, but it also won't
prevent any problems that the throttling was supposed to prevent.
I agree it should go stable.

BTW, I wonder if the whole code couldn't be much simpler by capping high_zoneidx
by ZONE_NORMAL before traversing the zonelist, like this:

int high_zoneidx = min(gfp_zone(gfp_mask), ZONE_NORMAL);

first_zones_zonelist(zonelist, high_zoneidx, NULL, &zone);
pgdat = zone->zone_pgdat;

if (!pgdat || pfmemalloc_watermark_ok(pgdat))
	goto out;

