Re: The performance and behaviour of the anti-fragmentation related patches

From: Andrew Morton
Date: Fri Mar 02 2007 - 19:26:04 EST


On Fri, 02 Mar 2007 15:28:43 -0800
"Martin J. Bligh" <mbligh@xxxxxxxxxx> wrote:

> >>> 32GB is pretty much the minimum size to reproduce some of these
> >>> problems. Some workloads may need larger systems to easily trigger
> >>> them.
> >>
> >> We can find a 32GB system here pretty easily to test things on if
> >> need be. Setting up large commercial databases is much harder.
> >
> > That's my problem, too.
> >
> > There does not seem to exist any single set of test cases that
> > accurately predicts how the VM will behave with customer
> > workloads.
>
> Tracing might help? Showing Andrew traces of what happened in
> production for the prev_priority change made it much easier to
> demonstrate and explain the real problem ...
>

Tracing is one way.

The other way is the old scientific method:

- develop a theory
- add sufficient instrumentation to prove or disprove that theory
- run workload, crunch on numbers
- repeat

Of course, multiple theories can be proven/disproven in a single pass.

Practically, this means adding one new /proc/vmstat entry for each `goto
keep*' in shrink_page_list(). And more instrumentation in
shrink_active_list() to determine the behaviour of swap_tendency.
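
The "run workload, crunch on numbers" step can be sketched in userspace
roughly as below. This is only a minimal illustration, assuming the new
counters show up in /proc/vmstat in the usual "name value" per-line format.
The helper names (parse_vmstat, diff_counters, read_vmstat) are made up for
this sketch, and any counter names fed to it for the `goto keep*' paths
would be hypothetical until the instrumentation patch defines them.

```python
def parse_vmstat(text):
    """Parse /proc/vmstat-style text ("name value" per line) into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        name, value = parts
        try:
            counters[name] = int(value)
        except ValueError:
            continue  # skip lines whose value is not an integer
    return counters


def diff_counters(before, after):
    """Return {name: delta} for every counter that changed between snapshots."""
    deltas = {}
    for name, end in after.items():
        delta = end - before.get(name, 0)  # counters absent before count from 0
        if delta != 0:
            deltas[name] = delta
    return deltas


def read_vmstat(path="/proc/vmstat"):
    """Take one snapshot of the counters (Linux only)."""
    with open(path) as f:
        return parse_vmstat(f.read())


# Intended use (not executed here):
#   before = read_vmstat()
#   ... run the workload ...
#   after = read_vmstat()
#   for name, delta in sorted(diff_counters(before, after).items()):
#       print(name, delta)
```

Taking a snapshot before and after each run and looking only at the deltas
keeps the numbers tied to the workload rather than to whatever happened
since boot.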

Once that process is finished, we should have a thorough understanding of
what the problem is. We can then construct a testcase (it'll be a couple
hundred lines only) and use that testcase to determine what implementation
changes are needed, and whether it actually worked.

Then go back to the real workload, verify that it's still fixed.

Then do whitebox testing of other workloads to check that they haven't
regressed.
