Re: [PATCH v2 6/9] mm: Make hugetlb mappings go through mm_get_unmapped_area_vmflags

From: Lorenzo Stoakes
Date: Wed Jul 31 2024 - 12:16:27 EST


On Wed, Jul 31, 2024 at 06:04:04PM GMT, Oscar Salvador wrote:
> On Wed, Jul 31, 2024 at 04:19:09PM +0100, Lorenzo Stoakes wrote:
> > Yeah this is at commit aee8efc95fc2 ("mm: make hugetlb mappings go through
> > mm_get_unmapped_area_vmflags").
> >
> > If you:
> >
> > git checkout aee8efc95fc2
> > git grep hugetlb_get_unmapped_area
> >
> > You'll see it.
> >
> > I'm guessing you remove this in future commits, but the kernel must be able
> > to build at every revision so we can bisect (I found this issue through a
> > bisect and had to fix this up to check).
> >
> > A trivial fix is just to provide the prototype immediately prior to the
> > function decl, however the more correct solution is probably to do the
> > removals at the same time.
>
> Yeah, I just squashed the removal commit and this one.
>
> > This bit is just a bit of a slightly nitty cleanup to make sure things
> > build at every commit, the first issue is the really key one, just needs
> > some tweaking to deal with the frankly bloody horrible SHM stuff... Do not
> > blame you for missing that one!
>
> I did not check closely yet, but are blowing up in:
>
> if (shmem_huge != SHMEM_HUGE_FORCE) {
> ...
> if (file) {
> VM_BUG_ON(file->f_op != &shmem_file_operations)
>
> ?

I've not got the vm debug on in my build, so it's blowing up here for me:

static unsigned long shm_get_unmapped_area(struct file *file,
unsigned long addr, unsigned long len, unsigned long pgoff,
unsigned long flags)
{
struct shm_file_data *sfd = shm_file_data(file);

return sfd->file->f_op->get_unmapped_area(sfd->file, addr, len,
pgoff, flags);
}

Notice that that doesn't check whether sfd->file->f_op->get_unmapped_area
is NULL.

So since you remove this from the f_ops, it causes a NULL pointer deref.

In __get_unmapped_area() you have:

if (file) {
if (file->f_op->get_unmapped_area)
get_area = file->f_op->get_unmapped_area;

...

if (get_area) {
addr = get_area(file, addr, len, pgoff, flags);

Now since you are dealing with a shm file that has shm_file_operations

static const struct file_operations shm_file_operations = {
..
.get_unmapped_area = shm_get_unmapped_area,
...
};

Then this get_area() is invoked, which calls shm_get_unmapped_area(), which
calls f_op->get_unmapped_area() on your hugetlbfs_file_operations object
which you just deleted and it's NULL.

This is why you have to be super careful here, there's clearly stuff out
there that assumes that this can't happen, which you need to track down.

A quick grep however _suggests_ this might be the one landmine place. But
you need to find a smart way to deal with this.

>
>
> --
> Oscar Salvador
> SUSE Labs