Re: [PATCH bpf] bpf: verifier: prevent userspace memory access
From: Alexei Starovoitov
Date: Mon Mar 25 2024 - 00:44:32 EST
On Sun, Mar 24, 2024 at 3:30 PM David Laight <David.Laight@xxxxxxxxxx> wrote:
>
> From: Alexei Starovoitov
> > Sent: 24 March 2024 20:43
> >
> > On Sun, Mar 24, 2024 at 1:05 PM David Laight <David.Laight@xxxxxxxxxx> wrote:
> > >
> > > From: Alexei Starovoitov
> > > > Sent: 21 March 2024 06:08
> > > >
> > > > On Wed, Mar 20, 2024 at 3:55 AM Puranjay Mohan <puranjay12@xxxxxxxxx> wrote:
> > > > >
> > > > > The JITs need to implement bpf_arch_uaddress_limit() to define where
> > > > > the userspace addresses end for that architecture or TASK_SIZE is taken
> > > > > as default.
> > > > >
> > > > > The implementation is as follows:
> > > > >
> > > > > REG_AX = SRC_REG;
> > > > > if (offset)
> > > > >         REG_AX += offset;
> > > > > REG_AX >>= 32;
> > > > > if (REG_AX <= (uaddress_limit >> 32))
> > > > >         DST_REG = 0;
> > > > > else
> > > > >         DST_REG = *(size *)(SRC_REG + offset);
> > > >
> > > > The patch looks good, but it seems to be causing s390 CI failures.
> > >
> > > I'm confused by the need for this check (and, IIRC, some other bpf
> > > code that does kernel copies that can fault - and return an error).
> > >
> > > I thought that the entire point of bpf was that it sanitised and
> > > verified everything to limit what the 'program' could do in order
> > > to stop it overwriting (or even reading) kernel structures that
> > > it wasn't supposed to access.
> > >
> > > So it just shouldn't have an address that might be (in any way)
> > > invalid.
> >
> > bpf tracing progs can call bpf_probe_read_kernel() which
> > can read any kernel memory.
> > This is nothing but an inlined version of it.
>
> It was the getsockopt() code where I saw the copy_nocheck() calls.
> Those have to be broken.

No. If you mean csum_partial_copy_nocheck() then they're fine.

> Although the way some of the options use the ptr:len supplied by
> the application you stand no chance of doing an in-kernel call
> without a proper buffer descriptor argument (with separate optlen
> and bufferlen fields.)
>
> >
> > > The only possible address check is access_ok(), to ensure that
> > > a user address really is a user address.
> >
> > access_ok() considerations don't apply.
> > We're not dealing with user memory access.
>
> If you do need a check for 'not a user address' don't you want to just
> require access_ok() fail?
> That would be architecture independent.

No. access_ok() can only be used on a user addr.
access_ok() == false for some addr doesn't mean that
it's a kernel addr.
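
To make the two checks concrete, here is a rough userspace model, not the
kernel code. Every model_* name is hypothetical, the limit value assumes
x86-64 with 4-level paging, and the real access_ok() is arch-specific and
more involved. It mirrors the pseudocode from the patch description and
shows one address that fails an access_ok()-style test without being a
kernel address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: user addresses end below 2^47 (x86-64, 4-level paging). */
#define MODEL_UADDRESS_LIMIT 0x0000800000000000ULL

/*
 * Model of the guard from the patch description: only the upper
 * 32 bits of (src + off) are compared against the limit. Returns
 * true if the load would be attempted, false if DST_REG would be
 * set to 0 instead.
 */
bool model_guard_allows_load(uint64_t src, int16_t off, uint64_t limit)
{
	uint64_t ax = src;

	if (off)
		ax += (uint64_t)(int64_t)off;	/* sign-extend the offset */
	ax >>= 32;
	return ax > (limit >> 32);
}

/*
 * Very rough stand-in for access_ok(): the range [addr, addr + size)
 * must fit below the limit. It answers "is this a valid user range?",
 * nothing more.
 */
bool model_access_ok(uint64_t addr, uint64_t size, uint64_t limit)
{
	return size <= limit && addr <= limit - size;
}

/* Small self-check walking through the three interesting cases. */
void model_demo(void)
{
	uint64_t user   = 0x00007f0000001000ULL;	/* user range */
	uint64_t kernel = 0xffff888000000000ULL;	/* kernel range */
	uint64_t hole   = 0x0000900000000000ULL;	/* non-canonical */

	/* User address: access_ok-style test passes, guard refuses. */
	assert(model_access_ok(user, 8, MODEL_UADDRESS_LIMIT));
	assert(!model_guard_allows_load(user, 0, MODEL_UADDRESS_LIMIT));

	/* Kernel address: access_ok-style test fails, guard allows. */
	assert(!model_access_ok(kernel, 8, MODEL_UADDRESS_LIMIT));
	assert(model_guard_allows_load(kernel, 0, MODEL_UADDRESS_LIMIT));

	/* Non-canonical address: test fails, yet it is not a kernel addr. */
	assert(!model_access_ok(hole, 8, MODEL_UADDRESS_LIMIT));
}
```

In this model the third case is the point: a failing access_ok()-style
test only says "not a valid user range". It says nothing about whether
the address is a valid kernel one.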