RE: [syzbot] [mm?] general protection fault in dequeue_hugetlb_folio_nodemask (2)

From: Kasireddy, Vivek
Date: Wed Jun 12 2024 - 01:55:41 EST


Hi Oscar,

>
> On Tue, Jun 11, 2024 at 07:52:06PM +0200, Oscar Salvador wrote:
> > On Tue, Jun 11, 2024 at 07:46:33PM +0200, Oscar Salvador wrote:
> > > On Tue, Jun 11, 2024 at 10:30:05AM -0700, Andrew Morton wrote:
> > > > On Tue, 11 Jun 2024 03:34:25 -0700 syzbot
> <syzbot+569ed13f4054f271087b@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > syzbot found the following issue on:
> > > >
> > > > Thanks.
> > > >
> > > > > Call Trace:
> > > > > <TASK>
> > > > > alloc_hugetlb_folio_nodemask+0xae/0x3f0 mm/hugetlb.c:2603
> > > > > memfd_alloc_folio+0x15e/0x390 mm/memfd.c:75
> > > > > memfd_pin_folios+0x1066/0x1720 mm/gup.c:3864
> > > > > udmabuf_create+0x658/0x11c0 drivers/dma-buf/udmabuf.c:353
> > > > > udmabuf_ioctl_create drivers/dma-buf/udmabuf.c:420 [inline]
> > > > > udmabuf_ioctl+0x304/0x4f0 drivers/dma-buf/udmabuf.c:451
> > > > > vfs_ioctl fs/ioctl.c:51 [inline]
> > > > > __do_sys_ioctl fs/ioctl.c:907 [inline]
> > > > > __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:893
> > > > > do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> > > > > do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
> > > > > entry_SYSCALL_64_after_hwframe+0x77/0x7f
> > > >
> > > > I think we can pretty confidently point at the series "mm/gup:
> > > > Introduce memfd_pin_folios() for pinning memfd folios". I'll drop the
> > > > v14 series.
> > >
> > > jfyi: I am trying to reproduce this locally.
> >
> > Actually, should not memfd_alloc_folio() pass htlb_alloc_mask() instead
> > of GFP_USER to alloc_hugetlb_folio_nodemask? Or at least do
> > GFP_HIGHUSER.
>
> Ok, I spot the issue.
> memfd_alloc_folio() was calling alloc_hugetlb_folio_nodemask with
> preferred_nid being NUMA_NO_NODE, but that is bad as
> dequeue_hugetlb_folio_nodemask will do:
>
> zonelist = node_zonelist(nid, gfp_mask)
>
> which will try to get node_zonelists from nid, but since nid is -1, heh.
>
> The below patch fixes the issue for me, but I think that the right place
> to fix this up would be alloc_hugetlb_folio_nodemask(), so we can place
> the numa_node_id() if preferred_nid = NUMA_NO_NODE in there as a safety
> net.
> This way we catch this before exploding in case the user was not careful
> enough.
>
> I will cook up a patch shortly.
Thank you for fixing this issue!

>
> Another thing is why memfd_alloc_folio uses GFP_USER instead of
> GFP_HIGHUSER, but that maybe because I see that memfd_pin_folios() is
> used by some DMA driver which might not have access to HIGH_MEMORY.
Right, memfd_pin_folios() is used by udmabuf driver for DMA but the reason
why GFP_USER is chosen is because I was following this code in gup.c:
struct migration_target_control mtc = {
.nid = NUMA_NO_NODE,
.gfp_mask = GFP_USER | __GFP_NOWARN,
.reason = MR_LONGTERM_PIN,
};

if (migrate_pages(movable_folio_list, alloc_migration_target,
NULL, (unsigned long)&mtc, MIGRATE_SYNC,
MR_LONGTERM_PIN, NULL)) {

where, alloc_migration_target() does the following to allocate a hugetlb folio:
if (folio_test_hugetlb(src)) {
struct hstate *h = folio_hstate(src);

gfp_mask = htlb_modify_alloc_mask(h, gfp_mask);
return alloc_hugetlb_folio_nodemask(h, nid,
mtc->nmask, gfp_mask,
htlb_allow_alloc_fallback(mtc->reason));

but I somehow missed the early check in alloc_migration_target() where it does:
if (nid == NUMA_NO_NODE)
nid = folio_nid(src);

Thanks,
Vivek

>
> diff --git a/mm/memfd.c b/mm/memfd.c
> index 8035c6325e3c..2692f0298adc 100644
> --- a/mm/memfd.c
> +++ b/mm/memfd.c
> @@ -68,12 +68,13 @@ static void memfd_tag_pins(struct xa_state *xas)
> struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx)
> {
> #ifdef CONFIG_HUGETLB_PAGE
> + int nid = numa_node_id();
> struct folio *folio;
> int err;
>
> if (is_file_hugepages(memfd)) {
> folio = alloc_hugetlb_folio_nodemask(hstate_file(memfd),
> - NUMA_NO_NODE,
> + nid,
> NULL,
> GFP_USER,
> false);
>
> >
> >
> > --
> > Oscar Salvador
> > SUSE Labs
> >
>
> --
> Oscar Salvador
> SUSE Labs