Re: [RFC PATCH] keys: flush work when accessing /proc/key-users

From: Luis Henriques
Date: Wed Dec 06 2023 - 12:56:08 EST


David Howells <dhowells@xxxxxxxxxx> writes:

> Luis Henriques <lhenriques@xxxxxxx> wrote:
>
>> This patch is mostly for getting some feedback on how to fix an fstest
>> failing for ext4/fscrypt (generic/581). Basically, the test relies on the
>> data read from /proc/key-users to be up-to-date regarding the number of
>> keys a given user currently has. However, this file can't be trusted
>> because it races against the keys GC.
>
> Unfortunately, I don't think your patch helps. If the GC hasn't started yet,
> it won't achieve anything and the GC can still be triggered at any time after
> the flush and thus race.
>
> What is it you're actually trying to determine?
>
> And is it only for doing the test?

OK, let me try to describe what the generic/581 fstest does.

After doing a few fscrypt related things, which involve adding and
removing keys, the test will:

1. Get the number of keys for user 'fsgqa' from '/proc/key-users'
2. Set the maxkeys to 5 + <keys the user had in 1.>
3. In a loop, try to add 6 new keys, to confirm the last one will fail

Most of the time the test passes, i.e., the 6th key fails to be added.
However, if, for example, the test is executed in a loop, it is possible
to have it fail because the 6th key was successfully added. The reason
is, obviously, because the test is racy: the GC can kick-in too late,
after the maxkeys is set in step 2.

So, this is mostly an issue with the test itself, but I couldn't figure
out a way to work around it.

Another solution I thought but I didn't look too deep into was to try to
move the

atomic_dec(&key->user->nkeys);

out of the GC, in function key_gc_unused_keys(). Decrementing it
synchronously in key_put() (or whatever other functions could schedule GC)
should solve the problem with this test. But as I said I didn't went too
far looking into that, so I don't really know if that's feasible.

Finally, the test itself could be hacked so that the loop in step 3. would
update the maxkeys value if needed, i.e. if the current number of keys for
the user isn't what was expected in each loop iteration. But even that
would still be racy.

Cheers,
--
Luís