Re: WARNING: bad unlock balance in xfs_iunlock

From: Eric Sandeen
Date: Mon Apr 30 2018 - 11:14:37 EST


On 4/30/18 9:02 AM, Dmitry Vyukov wrote:
> On Mon, Apr 30, 2018 at 3:49 PM, Eric Sandeen <sandeen@xxxxxxxxxxx> wrote:

...

>>> It just extracted kernel source file name that looked relevant
>>> to this crash and run get_maintainers.pl on it.
>>> Also the image can contain dynamically generated data, which makes it
>>> impossible to have as a file at all.
>>
>> I guess I'm not sure what this means, can you explain?
>
> Say, a value that we generally pass to close system call is not static
> and can't be dumped to a static file. It's whatever a previous open
> system call has returned. Inside of the program we memorize the return
> value of open in a variable and then pass it to close. This generally
> stands for all system calls. Say, an image can contain an uid, and
> that uid can be obtained from a system call too.

Ok, but that's the syscall side. You are operating on a static xfs image,
correct? We're only asking for the actual filesystem you're operating
against.

(When I say "image" I am talking only about the filesystem itself, not any
other syzkaller state)

...

>> That was not at all clear to me. I thought when syzkaller was telling us
>> "on upstream commit XYZ," it meant that it had identified commit XYZ as bad.
>> I'm not sure if anyone else made that mistake, but perhaps you could also clarify
>> the bug report text in this regard?
>
> Suggestions are welcome. Currently it says "syzbot hit the following
> crash on upstream commit SHA1", which was supposed to mean just the
> state of the source tree when the crash happened. But I am not a
> native speaker, so perhaps I am saying not what I intend to say.
>
> There are also suggestions on report format improvement from +Ted
> currently in works:
> https://github.com/google/syzkaller/issues/565#issuecomment-380792942
> Not sure if they make this distinction 100% clear, though.

Maybe I was the only one who misunderstood, but perhaps something like

git tree: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
HEAD: f5c754d63d06 mm/swap_state.c: make bool enable_vma_readahead and swap_vma_readahead()

to make it clear that it has not identified that commit as the culprit, it's
just the head of the tree you were testing? (I think I have the correct git
nomenclature ...)

...

>> If the base image only has one allocation group, it makes it more difficult for
>> some tools to work with the image, because there is no redundancy. 1 AG is
>> not a supported or recommended geometry for any real-life use of xfs.
>>
>> If I am correct that you start with a base image w/ a certain geometry or
>> set of mkfs options, starting with >= 2 AGs would improve the usefulness of the
>> filesystem image.
>
> syzkaller can generate/mutate images based on structured format
> templates, but for now we don't have any templates and these are just
> opaque blobs.

Ok, backing up more: When you are testing against an xfs filesystem image, where
does that image come from? How is it generated? A quick look at the syzkaller
tree didn't make that clear to me.

The xfs.repro file you provided at
https://drive.google.com/file/d/1jzhGGe5SBJcqfsjxCLHoh4Kazke1oTfC/view

is strange: it doesn't even contain valid AGF blocks. They aren't fuzzed or
corrupted; they are completely zeroed out. I don't know if that's part of the
fuzzing or not. What steps led to that image?
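
To make the "zeroed vs. corrupted" distinction concrete, here is a minimal
sketch of how one might classify the AG 0 header sectors of an image held in
memory. It assumes the usual on-disk layout as I understand it (AGF in sector 1
with magic "XAGF", AGI in sector 2 with magic "XAGI"); the names hdr_state,
classify_sector, agf_state and agi_state are made up for illustration and are
not from xfsprogs or syzkaller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of one header sector. */
enum hdr_state {
    HDR_OK,       /* expected magic is present */
    HDR_ZEROED,   /* every byte in the sector is zero */
    HDR_CORRUPT,  /* non-zero contents, but wrong magic */
    HDR_MISSING   /* image is too short to contain the sector */
};

/* Inspect the sector at byte offset `off` and compare its first 4 bytes
 * against `magic`. */
enum hdr_state classify_sector(const unsigned char *img, size_t len,
                               size_t off, size_t sectsize,
                               const char *magic)
{
    assert(img != NULL && magic != NULL);

    if (sectsize < 4 || off > len || sectsize > len - off)
        return HDR_MISSING;
    if (memcmp(img + off, magic, 4) == 0)
        return HDR_OK;
    for (size_t i = 0; i < sectsize; i++) {
        if (img[off + i] != 0)
            return HDR_CORRUPT;
    }
    return HDR_ZEROED;
}

/* Assumption: AG 0 starts at byte 0, AGF is sector 1, AGI is sector 2. */
enum hdr_state agf_state(const unsigned char *img, size_t len, size_t sectsize)
{
    return classify_sector(img, len, 1 * sectsize, sectsize, "XAGF");
}

enum hdr_state agi_state(const unsigned char *img, size_t len, size_t sectsize)
{
    return classify_sector(img, len, 2 * sectsize, sectsize, "XAGI");
}
```

With a 512-byte sector size, an image whose second sector is all zeros would
come back as HDR_ZEROED rather than HDR_CORRUPT, which is the case I am asking
about here.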

Or put another way, how did you arrive at the fs image values in the reproducer,
i.e.:

void loop()
{
memcpy((void*)0x20000000, "xfs", 4);
memcpy((void*)0x20000100, "./file0", 8);
*(uint64_t*)0x20000200 = 0x20010000;
memcpy((void*)0x20010000,
"\x58\x46\x53\x42\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00"
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x9f\x98"
"\x99\xff\xcb\xa1\x4e\xe6\xad\x52\x08\x20\x67\x09\xed\x75\x00\x00\x00"
"\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x35\xe0\x00\x00\x00\x00"
"\x00\x00\x35\xe1\x00\x00\x00\x00\x00\x00\x35\xe2\x00\x00\x00\x01\x00"
"\x00\x10\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x03\x55\xb4\xa4"
"\x02\x00\x01\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"
"\x00\x0c\x09\x08\x04\x0c\x00\x00\x19\x00\x00\x00\x00\x00\x00\x00\x40"
"\x00\x00\x00\x00\x00\x00\x00\x3d\x00\x00\x00\x00\x00\x00\x0c\xa3\x00"
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00"
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x02\x02",
204);

...

The in-memory xfs filesystem it constructs is damaged; is that an intentional
part of the fuzzing during the test?
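
For reference, the start of that blob can be decoded as an XFS superblock. The
sketch below copies only the first 102 bytes of the memcpy above and reads a
few geometry fields. The field offsets are my reading of the on-disk superblock
layout (all fields big-endian), and sb_summary, sb_decode and repro_sb are
illustrative names, not anything from xfsprogs or the reproducer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* First 102 bytes of the reproducer's superblock blob (17 bytes per line). */
static const unsigned char repro_sb[] =
    "\x58\x46\x53\x42\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00"
    "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x9f\x98"
    "\x99\xff\xcb\xa1\x4e\xe6\xad\x52\x08\x20\x67\x09\xed\x75\x00\x00\x00"
    "\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x35\xe0\x00\x00\x00\x00"
    "\x00\x00\x35\xe1\x00\x00\x00\x00\x00\x00\x35\xe2\x00\x00\x00\x01\x00"
    "\x00\x10\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x03\x55\xb4\xa4";
/* The string literal adds a trailing NUL that is not part of the data. */
#define REPRO_SB_LEN (sizeof(repro_sb) - 1)

/* Big-endian readers. */
static uint16_t be16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static uint64_t be64(const unsigned char *p)
{
    return ((uint64_t)be32(p) << 32) | be32(p + 4);
}

/* Hypothetical summary of the geometry fields discussed in this thread. */
struct sb_summary {
    uint32_t blocksize;   /* offset 4:   sb_blocksize  */
    uint64_t dblocks;     /* offset 8:   sb_dblocks    */
    uint32_t agblocks;    /* offset 84:  sb_agblocks   */
    uint32_t agcount;     /* offset 88:  sb_agcount    */
    uint16_t versionnum;  /* offset 100: sb_versionnum */
};

/* Returns 0 on success, -1 if the buffer is too short or lacks "XFSB". */
int sb_decode(const unsigned char *buf, size_t len, struct sb_summary *out)
{
    assert(buf != NULL && out != NULL);

    if (len < 102 || memcmp(buf, "XFSB", 4) != 0)
        return -1;
    out->blocksize  = be32(buf + 4);
    out->dblocks    = be64(buf + 8);
    out->agblocks   = be32(buf + 84);
    out->agcount    = be32(buf + 88);
    out->versionnum = be16(buf + 100);
    return 0;
}
```

If those offsets are right, calling sb_decode(repro_sb, REPRO_SB_LEN, &s) on
these bytes gives a 4096-byte block size, 4096 data blocks, and an agcount of
1, i.e. the single-AG geometry mentioned earlier.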

-Eric