Re: [PATCH] buildid: validate page-backed file before parsing build ID

From: Andrii Nakryiko

Date: Mon Jan 05 2026 - 17:54:46 EST


On Tue, Dec 30, 2025 at 2:11 PM David Hildenbrand (Red Hat)
<david@xxxxxxxxxx> wrote:
>
> On 12/23/25 18:29, Andrew Morton wrote:
> > On Tue, 23 Dec 2025 18:32:07 +0800 Jinchao Wang <wangjinchao600@xxxxxxxxx> wrote:
> >
> >> __build_id_parse() only works on page-backed storage. Its helper paths
> >> eventually call mapping->a_ops->read_folio(), so explicitly reject VMAs
> >> that do not map a regular file or lack valid address_space operations.
> >>
> >> Reported-by: syzbot+e008db2ac01e282550ee@xxxxxxxxxxxxxxxxxxxxxxxxx
> >> Signed-off-by: Jinchao Wang <wangjinchao600@xxxxxxxxx>
> >>
> >> ...
> >>
> >> --- a/lib/buildid.c
> >> +++ b/lib/buildid.c
> >> @@ -280,7 +280,10 @@ static int __build_id_parse(struct vm_area_struct *vma, unsigned char *build_id,
> >> int ret;
> >>
> >> /* only works for page backed storage */
> >> - if (!vma->vm_file)
> >> + if (!vma->vm_file ||
> >> + !S_ISREG(file_inode(vma->vm_file)->i_mode) ||
> >> + !vma->vm_file->f_mapping->a_ops ||
> >> + !vma->vm_file->f_mapping->a_ops->read_folio)
> >> return -EINVAL;
>
> Just wondering, we are fine with MAP_PRIVATE files, right? I guess it's
> not about the actual content in the VMA (which might be different for a
> MAP_PRIVATE VMA), but only about the content of the mapped file.

Yep, this code is fetching contents of a file that backs given VMA.

>
>
> LGTM, although I wonder whether some of these these checks should be
> exposed as part of the read_cache_folio()/do_read_cache_folio() API.
>
> Like, having a helper function that tells us whether we can use
> do_read_cache_folio() against a given mapping+file.

I agree, this seems to be leaking a lot of internal mm details into
higher-level caller (__build_id_parse). Right now we try to fetch
folio with filemap_get_folio() and if that succeeds, then we do
read_cache_folio. Would it be possible for filemap_get_folio() to
return error if the folio cannot be read using read_cache_folio()? Or
maybe have a variant of filemap_get_folio() that would have this
semantic?

>
> --
> Cheers
>
> David