Re: [PATCH] Avoiding fragmentation through different allocator

From: Mel Gorman
Date: Tue Jan 25 2005 - 11:18:23 EST


On Tue, 25 Jan 2005, Andi Kleen wrote:

> On Tue, Jan 25, 2005 at 09:02:34AM -0500, Mukker, Atul wrote:
> >
> > > e.g. performance on megaraid controllers (very popular
> > > because a big PC vendor ships them) was always quite bad on
> > > Linux. Up to the point that specific IO workloads run half as
> > > fast on a megaraid compared to other controllers. I heard
> > > they do work better on Windows.
> > >
> > <snip>
> > > Ideally the Linux IO patterns would look similar to the
> > > Windows IO patterns, then we could reuse all the
> > > optimizations the controller vendors did for Windows :)
> >
> > LSI would leave no stone unturned to make the performance better for
> > megaraid controllers under Linux. If you have some hard data in relation to
> > comparison of performance for adapters from other vendors, please share with
> > us. We would definitely strive to better it.
>
> Sorry for being vague on this. I don't have much hard data on this,
> just telling an annecdote. The issue we saw was over a year ago
> and on a machine running an IO intensive multi process stress test
> (I believe it was an AIM7 variant with some tweaked workfile). When the test
> was moved to a machine with megaraid controller it ran significantly
> lower, compared to the old setup with a non RAID SCSI controller from
> a different vendor. I unfortunately don't know anymore the exact
> type/firmware revision etc. of the megaraid that showed the problem.
>

Ok, for me here, the bottom line is that decent hardware will not benefit
from help from the allocator. Worse, if the work required to provide
adjacent pages is high, it will even adversly affect throughput. I know as
well that to have physically contiguous pages in userspace would involve a
fair amount of overhead so even if we devise a system for providing them,
it would need to be a configurable option.

I will keep an eye out for a means of granting physically contiguous pages
for userspace in a lightweight manner but I'm going to focus on general
availability of large pages for TLBs, extend the system for a pool of
zero'd pages and how it can be adapted to help out the hotplug folks.

The system I have in mind for contiguous pages for userspace right now is
to extend the allocator API so that prefaulting and readahead will request
blocks of pages for userspace rather than a series of order-0 pages. So,
if we prefault 32 pages ahead, the allocator would have a new API that
would return 32 pages that are physically contiguous. That, in combination
with forced IOMMU may show if Contiguous Pages For IO is worth it or not.

This will take a while as I'll have to develop some mechanism for
measuring it while I'm at it and I only do this 2 days a week so it'll
take a while.

--
Mel Gorman
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/