Re: [RFC][PATCH 0/8] Critical Page Pool

From: Avi Kivity
Date: Fri Nov 18 2005 - 14:44:24 EST

Next message: Linus Torvalds: "Re: [patch 1/3] Add SCM info to MAINTAINERS"
Previous message: Matthew Dobson: "[RFC][PATCH 6/8] slab_destruct"
In reply to: Matthew Dobson: "[RFC][PATCH 6/8] slab_destruct"
Next in thread: Matthew Dobson: "Re: [RFC][PATCH 0/8] Critical Page Pool"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Matthew Dobson wrote:

We have a clustering product that needs to be able to guarantee that the
networking system won't stop functioning in the case of OOM/low memory
condition. The current mempool system is inadequate because to keep the
whole networking stack functioning, we need more than 1 or 2 slab caches to
be guaranteed. We need to guarantee that any request made with a specific
flag will succeed, assuming of course that you've made your "critical page
pool" big enough.

The following patch series implements such a critical page pool. It
creates 2 userspace triggers:

/proc/sys/vm/critical_pages: write the number of pages you want to reserve
for the critical pool into this file

/proc/sys/vm/in_emergency: write a non-zero value to tell the kernel that
the system is in an emergency state and authorize the kernel to dip into
the critical pool to satisfy critical allocations.

We mark critical allocations with the __GFP_CRITICAL flag, and when the
system is in an emergency state, we are allowed to delve into this pool to
satisfy __GFP_CRITICAL allocations that cannot be satisfied through the
normal means.

1. If you have two subsystems which allocate critical pages, how do you protect against the condition where one subsystem allocates all the critical memory, causing the second to oom?

2. There already exists a critical pool: ordinary allocations fail if free memory is below some limit, but special processes (kswapd) can allocate that memory by setting PF_MEMALLOC. Perhaps this should be extended, possibly with a per-process threshold.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Linus Torvalds: "Re: [patch 1/3] Add SCM info to MAINTAINERS"
Previous message: Matthew Dobson: "[RFC][PATCH 6/8] slab_destruct"
In reply to: Matthew Dobson: "[RFC][PATCH 6/8] slab_destruct"
Next in thread: Matthew Dobson: "Re: [RFC][PATCH 0/8] Critical Page Pool"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]