Re: [Regression v4.2 ?] 32-bit seccomp-BPF returned errno values wrong in VM?

From: David Drysdale
Date: Thu Aug 13 2015 - 13:40:02 EST


On Thu, Aug 13, 2015 at 6:15 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> On Thu, Aug 13, 2015 at 9:28 AM, David Drysdale <drysdale@xxxxxxxxxx> wrote:
>> On Thu, Aug 13, 2015 at 4:17 PM, Denys Vlasenko <dvlasenk@xxxxxxxxxx> wrote:
>>> On 08/13/2015 10:30 AM, David Drysdale wrote:
>>>> Hi folks,
>>>>
>>>> I've got an odd regression with the v4.2 rc kernel, and I wondered if anyone
>>>> else could reproduce it.
>>>>
>>>> The problem occurs with a seccomp-bpf filter program that's set up to return
>>>> an errno value -- an errno of 1 is always returned instead of what's in the
>>>> filter, plus other oddities (selftest output below).
>>>>
>>>> The problem seems to need a combination of circumstances to occur:
>>>>
>>>> - The seccomp-bpf userspace program needs to be 32-bit, running against a
>>>> 64-bit kernel -- I'm testing with seccomp_bpf from
>>>> tools/testing/selftests/seccomp/, built via 'CFLAGS=-m32 make'.
>>>
>>> Does it work correctly when built as 64-bit program?
>>
>> Yep, 64-bit works fine (both at v4.2-rc6 and at commit 3f5159).
>>
>>>>
>>>> - The kernel needs to be running as a VM guest -- it occurs inside my
>>>> VMware Fusion host, but not if I run on bare metal. Kees tells me he
>>>> cannot repro with a kvm guest though.
>>>>
>>>> Bisecting indicates that the commit that induces the problem is
>>>> 3f5159a9221f19b0, "x86/asm/entry/32: Update -ENOSYS handling to match the
>>>> 64-bit logic", included in all the v4.2-rc* candidates.
>>>>
>>>> Apologies if I've just got something odd with my local setup, but the
>>>> bisection was unequivocal enough that I thought it worth reporting...
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>
>>>> seccomp_bpf failure outputs:
>>
>> [snip]
>>
>>> End result should be:
>>> pt_regs->ax = -E2BIG (via syscall_set_return_value())
>>> pt_regs->orig_ax = -1 ("skip syscall")
>>> and syscall_trace_enter_phase1() usually returns with 0,
>>> meaning "re-execute syscall at once, no phase2 needed".
>>>
>>> This, in turn, is called from .S files, and when it returns there,
>>> execution loops back to syscall dispatch.
>>>
>>> Because of orig_ax = -1, syscall dispatch should skip calling syscall.
>>> So -E2BIG should survive and be returned...
>>
>> So I was just about to send:
>>
>> That makes sense, and given that exactly the same 32-bit binary
>> runs fine on a different machine, there's presumably something up
>> with my local setup. The failing machine is a VMware guest, but
>> maybe that's not the relevant interaction -- particularly if no-one
>> else can repro.
>>
>> But then I noticed some odd audit entries in the main log:
>>
>> Aug 13 16:52:56 ubuntu kernel: [ 20.687249] audit: type=1326
>> audit(1439481176.034:62): auid=4294967295 uid=1000 gid=1000
>> ses=4294967295 pid=2621 comm="secccomp_bpf.ke"
>> exe="/home/dmd/secccomp_bpf.kees.m32" sig=9 arch=40000003 syscall=172
>> compat=1 ip=0xf773cc90 code=0x0
>> Aug 13 16:52:56 ubuntu kernel: [ 20.691157] audit: type=1326
>> audit(1439481176.038:63): auid=4294967295 uid=1000 gid=1000
>> ses=4294967295 pid=2631 comm="secccomp_bpf.ke"
>> exe="/home/dmd/secccomp_bpf.kees.m32" sig=31 arch=40000003 syscall=20
>> compat=1 ip=0xf773cc90 code=0x10000000
>> ...
>>
>> I didn't think I had any audit stuff turned on, and indeed:
>> # auditctl -l
>> No rules
>>
>> But as soon as I'd run that auditctl command, the 32-bit
>> seccomp_bpf binary started running fine!
>>
>> So now I'm confused, and I can no longer reproduce the
>> problem. Which probably means this was a false alarm, in
>> which case, my apologies.
>
> You might have triggered TIF_AUDIT or whatever it's called, which
> causes a whole different path through the asm tangle, so you might
> really have a problem.
>
> Try auditctl -a task,never. If that doesn't change anything, try
> rebooting the guest.

Aha, that seems to reinstate the problem -- with that auditctl setup
I get the 32-bit seccomp failures on two different machines (one VM,
one bare metal). So can anyone else repro?

I guess the relevant steps to reproduce are:
- sudo auditctl -a task,never
- cd tools/testing/selftests/seccomp
- CFLAGS=-m32 make clean run_tests
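
For anyone who wants to check the errno behaviour without the full selftest
suite, here is a minimal sketch of the same kind of check. It is not taken from
seccomp_bpf.c: the helper name filtered_syscall_errno() and the choice of
getppid as the filtered syscall are assumptions made for illustration, and the
filter skips the architecture check that a real filter should do. A forked
child installs a filter that returns SECCOMP_RET_ERRNO | E2BIG for getppid and
reports the errno it actually sees through its exit status.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/*
 * Hypothetical helper, not part of the kernel selftests.
 * Returns the errno seen by a child whose getppid() is filtered with
 * SECCOMP_RET_ERRNO | E2BIG, 0 if the syscall was not failed at all,
 * or -1 if the filter could not be set up.
 */
int filtered_syscall_errno(void)
{
	struct sock_filter filter[] = {
		/* Load the syscall number (no arch check: sketch only). */
		BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
			 offsetof(struct seccomp_data, nr)),
		/* getppid: fall through to ERRNO; anything else: ALLOW. */
		BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_getppid, 0, 1),
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | E2BIG),
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
	};
	struct sock_fprog prog = {
		.len = (unsigned short)(sizeof(filter) / sizeof(filter[0])),
		.filter = filter,
	};
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Child: install the filter, then make the filtered call. */
		if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) ||
		    prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog))
			_exit(255);	/* setup failed */
		errno = 0;
		_exit(syscall(__NR_getppid) == -1 ? errno : 0);
	}
	/* Parent: the child's exit status carries the errno it observed. */
	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status) ||
	    WEXITSTATUS(status) == 255)
		return -1;
	return WEXITSTATUS(status);
}
```

Calling this from a small main() and printing the result, with the program
built with -m32 and run after 'auditctl -a task,never', should show whether
the value matches E2BIG or comes back as 1 as described above.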