Re: [RFC v2 PATCH 0/8] mm: mirrored memory support for page buddy allocations

From: Mel Gorman
Date: Tue Jun 30 2015 - 07:54:12 EST

In reply to: Ingo Molnar: "Re: [RFC v2 PATCH 0/8] mm: mirrored memory support for page buddy allocations"

On Tue, Jun 30, 2015 at 12:46:54PM +0200, Ingo Molnar wrote:
>
> * Mel Gorman <mgorman@xxxxxxx> wrote:
>
> > [...]
> >
> > Basically, overall I feel this series is the wrong approach but not knowing who
> > the users are makes it much harder to judge. I strongly suspect that if
> > mirrored memory is to be properly used then it needs to be available before the
> > page allocator is even active. Once active, there needs to be controlled access
> > for allocation requests that are really critical to mirror and not just all
> > kernel allocations. None of that would use a MIGRATE_TYPE approach. It would be
> > alterations to the bootmem allocator and access to an explicit reserve that is
> > not accounted for as "free memory" and accessed via an explicit GFP flag.
>
> So I think the main goal is to avoid kernel crashes when a #MC memory fault
> arrives on a piece of memory that is owned by the kernel.
>

Sounds logical. In that case, bootmem awareness would be crucial.
Enabling support in just the page allocator is too late.
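
As a rough illustration of what bootmem awareness could look like, here is
a minimal userspace sketch of an early allocator that prefers mirrored
ranges and falls back to ordinary ones. It is a toy model, not kernel code:
struct early_region, early_alloc_from() and early_alloc() are hypothetical
names made up for this example, and it assumes no region starts at
address 0 so that 0 can signal failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one physical range known at boot. */
struct early_region {
	unsigned long base;	/* start address, assumed non-zero */
	unsigned long size;	/* length in bytes */
	unsigned long used;	/* bytes already handed out */
	bool mirrored;		/* true if firmware reports it mirrored */
};

/* Bump-allocate from the first region of the wanted kind with room. */
static unsigned long early_alloc_from(struct early_region *r, size_t nr,
				      unsigned long size, bool mirrored)
{
	for (size_t i = 0; i < nr; i++) {
		if (r[i].mirrored != mirrored)
			continue;
		if (r[i].size - r[i].used < size)
			continue;
		unsigned long addr = r[i].base + r[i].used;
		r[i].used += size;
		return addr;
	}
	return 0;	/* nothing of that kind can satisfy the request */
}

/* Prefer mirrored memory; fall back to non-mirrored rather than fail. */
static unsigned long early_alloc(struct early_region *r, size_t nr,
				 unsigned long size)
{
	unsigned long addr;

	assert(size > 0);
	addr = early_alloc_from(r, nr, size, true);
	if (!addr)
		addr = early_alloc_from(r, nr, size, false);
	return addr;
}
```

Whether the fallback should be silent, warn or fail is a policy decision
the sketch does not try to answer.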

> In that sense 'protecting' all kernel allocations is natural: we don't know how to
> recover from faults that affect kernel memory.
>

It potentially spends all mirrored memory on allocations that do not need
that sort of guarantee. For example, if there was an MC on memory backing
the inode cache then potentially that is recoverable as long as the inodes
were not dirty. That's a minor detail as the kernel could later protect
only MIGRATE_UNMOVABLE requests instead of all kernel allocations if fatal
MCs in kernel space could be distinguished from non-fatal ones.

Bootmem awareness is much more important either way. If that was addressed
then potentially a MIGRATE_UNMOVABLE_MIRROR type could be created that
is only used for MIGRATE_UNMOVABLE allocations and never for user-space.
That misses MIGRATE_RECLAIMABLE so if that is required then we need
something else that both preserves fragmentation avoidance and avoids
introducing loads of new migratetypes.

Reclaim-related issues could be partially avoided by forbidding use from
userspace and accounting for the size of MIGRATE_UNMOVABLE_MIRROR during
watermark checks.
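
The watermark accounting could be sketched roughly as follows, similar in
spirit to how free CMA pages are discounted for callers that cannot use
them. This is a simplified userspace model under my own assumptions:
struct toy_zone and toy_watermark_ok() are hypothetical, and the real
watermark check also deals with lowmem reserves and per-order free counts
that are ignored here.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy zone state; the fields are illustrative, not struct zone. */
struct toy_zone {
	unsigned long free_pages;	/* all free pages, mirror included */
	unsigned long free_mirror;	/* free pages on the mirror free list */
	unsigned long watermark;	/* pages that must stay free */
};

/*
 * Return true if a request of 2^order pages leaves the zone at or above
 * its watermark. Callers that may not use the mirror free list (for
 * example userspace allocations) do not get to count it as free memory.
 */
static bool toy_watermark_ok(const struct toy_zone *z, unsigned int order,
			     bool may_use_mirror)
{
	unsigned long usable = z->free_pages;
	unsigned long request = 1UL << order;

	assert(z->free_mirror <= z->free_pages);

	if (!may_use_mirror)
		usable -= z->free_mirror;

	return usable >= request && usable - request >= z->watermark;
}
```

With 1000 free pages of which 600 are mirrored and a watermark of 300, an
order-7 request passes for a caller allowed to use the mirror and fails
for one that is not, so the latter enters reclaim earlier instead of
believing the zone has plenty of free memory.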

> We do know how to recover from faults that affect user-space memory alone.
>
> So if a mechanism is in place that prioritizes 3 groups of allocators:
>
> - non-recoverable memory (kernel allocations mostly)
>

So bootmem at the very least, followed by MIGRATE_UNMOVABLE requests,
whether they are accounted for by zones or MIGRATE_TYPES.

> - high priority user memory (critical apps that must never fail)
>

This one is problematic with a MIGRATE_TYPE-based approach such as the one in
this series. If a high priority application requires memory and
MIGRATE_MIRROR is full then some of it must be reclaimed. With a
MIGRATE_TYPE approach, the kernel may reclaim a lot of unnecessary memory
trying to free some MIGRATE_MIRROR memory with no guarantee of success.
It'll look like unnecessary thrashing from userspace and be difficult to
diagnose as reclaim stats are per-zone based.
Dealing with this needs either a zone-based approach or a lot of surgery
to reclaim (similar to what the node-based LRU series does actually when
it skips pages when the caller requires lowmem pages).
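
To make the reclaim problem concrete, here is a toy userspace model of a
single LRU walk that needs to free a given number of mirrored pages. It is
only a sketch under my own assumptions: struct toy_page, struct
toy_reclaim_stats and toy_reclaim() are hypothetical, and real reclaim is
far more involved. The skip_other flag models the "surgery" variant where
pages of the wrong type are skipped rather than reclaimed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_page {
	bool mirror;		/* page sits in mirrored memory */
	bool reclaimable;	/* page can be freed right now */
};

struct toy_reclaim_stats {
	size_t scanned;		/* pages looked at */
	size_t freed_mirror;	/* mirrored pages freed */
	size_t freed_other;	/* non-mirrored pages freed as a side effect */
};

/*
 * Walk the LRU in order until `want` mirrored pages are freed or the
 * list is exhausted. Without skip_other every reclaimable page met on
 * the way is freed, whether or not it helps the caller.
 */
static struct toy_reclaim_stats toy_reclaim(const struct toy_page *lru,
					    size_t nr, size_t want,
					    bool skip_other)
{
	struct toy_reclaim_stats st = { 0, 0, 0 };

	for (size_t i = 0; i < nr && st.freed_mirror < want; i++) {
		st.scanned++;
		if (!lru[i].reclaimable)
			continue;
		if (lru[i].mirror)
			st.freed_mirror++;
		else if (!skip_other)
			st.freed_other++;
	}
	return st;
}
```

On a list where only every fourth page is mirrored, freeing two mirrored
pages without skipping also frees six unrelated pages. With skipping the
collateral reclaim goes away but the scan cost stays the same, and in both
cases the walk can end without freeing as many mirrored pages as asked.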

--
Mel Gorman
SUSE Labs
