Re: [PATCH 10 of 20] ipath - support for userspace apps using coredriver

From: Hugh Dickins
Date: Fri Mar 17 2006 - 10:24:36 EST


On Fri, 17 Mar 2006, Nick Piggin wrote:
> Hugh Dickins wrote:
> >
> > Once __GFP_COMP is passed to the dma_alloc_coherent, as it needs to be
> > (unless going VM_PFNMAP), get_user_pages will be safe: no need for VM_IO.
>
> But it doesn't look like dma_alloc_coherent is guaranteed to return
> memory allocated from the regular page allocator, nor even memory
> backed by a struct page.

Hmm, that's bad news.

If it's not backed by a struct page, then I guess Bryan can't be
interested in mapping it into userspace with a .nopage; so perhaps that
case is already ruled out somehow, and we needn't worry about it. Or
should he just be using remap_pfn_range - is there a portable way to
use that with dma_alloc_coherent/pci_alloc_consistent?

I'm not a driver writer, and have no idea of these things: I think,
having got Bryan back on track with his page counts, I'd better step
aside, and let those who understand these things take him forward.

> For example, I see one that returns kmalloc()ed memory. If the pages
> for the slab are already allocated then __GFP_COMP will not do anything
> there. i386 looks like it has a path that uses ioremap...

I don't remember offhand whether passing __GFP_COMP to kmalloc does
something, nothing, errors out, or behaves erratically according to
whether the slab is already allocated.

I'd feel so much more confident if no __GFP_COMP flag were ever needed.
It seems that's how PageCompound started out, every >0-order was compound;
then some nasty code found down in ARM and some drivers that were screwed
by that, so __GFP_COMP introduced (and other trees had __GFP_NOCOMP).

Is there any chance that your split_page() work in -mm, actually addresses
precisely those places that were screwed up by universal compound pages?
So that with your split_page(), we could go back to every >0-order page
being PageCompound, without any need for __GFP_COMP.

There'd probably be a few blips to sort out, but if it seems a plausible
way forward, that's the way I'd like to go. But I wasn't in on the
early days of PageCompound, maybe Andrew remembers the issues. I've
appended a couple of akpm entries from ChangeLog-2.6.6 below, to help
jog memories (I'm amused to see how the compound page logic started
off using page->lru, where we've just now moved it back to).

> Now I haven't looked through all these closely like you will have, but
> I'd like to know how __GFP_COMP solves all the potential problems I
> see.

It seems I can't have looked as closely as I thought. I was advising
Bryan on the basis of the __GFP_COMP (I added) in snd_malloc_dev_pages,
which appears to have been working. But now I fear perhaps that was
just a rare case to support a single driver only needed on a few
architectures (not everyone exports snd_malloc_dev_pages memory into
userspace); or we have a nasty surprise in store for us there too.

Aside from the dark alleys of dma_alloc_coherent that you've mentioned,
there's the architectures which #define it to pci_alloc_consistent,
which takes no gfp_mask, so __GFP_COMP would be ignored (hence,
in part, my desire to make all >0-order pages compound).

None of what I've said above is much help to Bryan (nor is Linus'
suggestion that he allocates one page at a time, if dma_alloc_coherent
won't even give him the right kind of struct-page-backed memory):
but as I said, I'll have to step aside from pretending to advise
on what his driver should be doing.

Hugh

<akpm@xxxxxxxx>
[PATCH] stop using page->lru in compound pages

The compound page logic is using page->lru, and these get will scribbled on
in various places so switch the Compound page logic over to using ->mapping
and ->private.

<akpm@xxxxxxxx>
[PATCH] use compound pages for hugetlb pages only

The compound page logic is a little fragile - it relies on additional
metadata in the pageframes which some other kernel code likes to stomp on
(xfs was doing this).

Also, because we're treating all higher-order pages as compound pages it is
no longer possible to free individual lower-order pages from the middle of
higher-order pages. At least one ARM driver insists on doing this.

We only really need the compound page logic for higher-order pages which can
be mapped into user pagetables and placed under direct-io. This covers
hugetlb pages and, conceivably, soundcard DMA buffers which were allcoated
with a higher-order allocation but which weren't marked PageReserved.

The patch arranges for the hugetlb implications to allocate their pages with
compound page metadata, and all other higher-order allocations go back to the
old way.

(Andrea supplied the GFP_LEVEL_MASK fix)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/