Re: m(un)map kmalloc buffers to userspace

From: Michal Hocko
Date: Fri Dec 11 2015 - 04:42:15 EST


On Thu 10-12-15 17:48:31, Sebastian Frias wrote:
> On 12/10/2015 03:06 PM, Michal Hocko wrote:
> >On Thu 10-12-15 14:37:38, Sebastian Frias wrote:
> >>On 12/10/2015 12:40 PM, Michal Hocko wrote:
> >>>On Wed 09-12-15 16:35:53, Sebastian Frias wrote:
> >>>[...]
> >>>>We've seen that drivers/media/pci/zoran/zoran_driver.c for example seems to
> >>>>be doing the same as us (kmalloc+remap_pfn_range),
> >>>
> >>>This driver is broken - I will post a patch.
> >>
> >>Ok, we'll be glad to see a good example, please keep us posted.
> >>
> >>>
> >>>>is there any guarantee (or at least an advised heuristic) to determine
> >>>>if a driver is "current" (ie: uses the latest APIs and works)?
> >>>
> >>>OK, it seems I was overly optimistic when directing you to existing
> >>>drivers. Sorry about that, I wasn't aware you could find such terrible
> >>>code there. Please refer to the Linux Device Drivers book, which should give
> >>>you a much better lead (e.g. http://www.makelinux.net/ldd3/chp-15-sect-2)
> >>>
> >>
> >>Thank you for the link.
> >>The current code of our driver has portions written following LDD3;
> >>however, it seems that LDD3's advice is not relevant anymore.
> >>Indeed, it talks about VM_RESERVED, it talks about using "nopage" and it
> >>says that remap_pfn_range cannot be used for pages from get_user_page (or
> >>kmalloc).
> >
> >Heh, it seems that we are indeed outdated there as well. The memory
> >management code doesn't really require pages to be reserved and it
> >allows get_user_page(s) memory to be mapped to user ptes.
> >remap_pfn_range will set all the appropriate flags to make sure MM code
> >will not stumble over those pages and lets the driver take care of
> >the memory deallocation.
>
> Ok, just for information, do you know since when it is possible to use
> remap_pfn_range on kmalloc/get_user_page memory?

Not off the top of my head. But at least since 6aab341e0a28a (2.6.15)
remap_pfn_range sets VM_PFNMAP, which makes vm_normal_page ignore those pages
in MM code.

> >>It seems such assertions are valid on older kernels, because the code stops
> >>working on 3.4+ if we use remap_pfn_range the same way as
> >>drivers/media/pci/zoran/zoran_driver.c
> >>However, kmalloc+remap_pfn_range does work on 4.1.13+
> >
> >As I've said, nothing guarantees that the address returned by kmalloc
> >is page aligned, so you might corrupt slab internal data structures.
> >You might allocate a larger buffer via kmalloc and make sure it is
> >aligned properly, but I fail to see why kmalloc should be used in the
> >first place as you need memory in page-size units anyway.
> >
>
> Ok, so let's say we stop using kmalloc in favor of __get_user_pages, do you
> see other things that would need to be done to be compliant with current
> practices?

I think this should just work.
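
To make the alignment point quoted above concrete, here is a minimal
userspace sketch of the arithmetic involved. It is not driver code: the
SKETCH_PAGE_SIZE constant and all sketch_* helper names are made up for
illustration, and a 4096-byte page is an assumption (kernel code would use
PAGE_SIZE and PAGE_ALIGN() instead). It shows how to test an address for
page alignment, how to round it up, and how much padding an over-allocated
buffer needs so that an aligned region of the wanted length always fits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for illustration only; kernel code uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/* Nonzero if addr is a multiple of the (assumed) page size. */
static inline int sketch_page_aligned(uintptr_t addr)
{
	return (addr & (SKETCH_PAGE_SIZE - 1)) == 0;
}

/* Round addr up to the next page boundary (same idea as PAGE_ALIGN()). */
static inline uintptr_t sketch_page_align_up(uintptr_t addr)
{
	return (addr + SKETCH_PAGE_SIZE - 1) & ~(uintptr_t)(SKETCH_PAGE_SIZE - 1);
}

/*
 * Bytes to request from a byte-granular allocator so that a page-aligned
 * region of len bytes fits no matter where the returned buffer starts.
 */
static inline size_t sketch_padded_size(size_t len)
{
	assert(len <= SIZE_MAX - (SKETCH_PAGE_SIZE - 1)); /* no overflow */
	return len + (SKETCH_PAGE_SIZE - 1);
}

/* Number of whole pages needed to cover len bytes. */
static inline size_t sketch_pages_for(size_t len)
{
	return (len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

With a page-granular allocator none of the padding is needed, since the
returned memory already starts on a page boundary and comes in whole pages,
which is why using kmalloc here buys nothing.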

> For instance, drivers/media/pci/zoran/zoran_driver.c is doing:
>
> for (off = 0; off < fh->buffers.buffer_size; off += PAGE_SIZE)
> 	SetPageReserved(virt_to_page(mem + off));
>
> on the memory allocated with kmalloc, but we are not doing any of that, yet
> it was working. Would the switch to __get_user_pages require the calls to
> SetPageReserved?

I do not see much point in setting pages reserved. MM should ignore them
based on the vma flags AFAICS via vm_normal_page. A quick check of
PageReserved usage in the mm code shows that we use it very rarely. It
would really be a bug if mm touched such a page even without
PageReserved. So this seems like a historical leftover.

--
Michal Hocko
SUSE Labs