Re: [PATCH v3 2/5] configfs-tsm: Introduce a shared ABI for attestation reports

From: Thomas Gleixner
Date: Fri Sep 01 2023 - 14:06:37 EST


On Thu, Aug 31 2023 at 15:13, Dan Williams wrote:
> Dionna Amalie Glaze wrote:
>> This is clean and approachable. Thanks for your implementation.
>>
>> > +static int try_advance_write_generation(struct tsm_report *report)
>> > +{
>> > + lockdep_assert_held_write(&tsm_rwsem);
>> > +
>> > + /*
>> > + * malicious or broken userspace is attempting to wrap read_generation,
>> > + * stop accepting updates until current report configuration is read.
>> > + */
>> > + if (report->write_generation == report->read_generation - 1)
>> > + return -EBUSY;
>> > + report->write_generation++;
>> > + return 0;
>> > +}
>> > +
>>
>> This took me a while to wrap my head around.
>> The property we want is that when we read outblob, it is the result of
>> the attribute changes since the last read. If we write to an attribute
>> 2^64 times, we could get write_generation to wrap around to equal
>> read_generation without actually reading outblob. So we could end up
>> given a badly cached result when we read outblob? Is that what this is
>> preventing?
>
> Correct. The criticism of kernfs based interfaces like sysfs and
> configfs is that there is no facility to atomically modify a set of
> attributes at once. The expectated mitigation is simply that userspace
> is well behaved. For example, 2 invocations of fdisk can confuse each
> other, so userspace is expected to run them serially and the kernel
> otherwise lets userspace do silly things.
>
> If a tool has any concern that it has exclusive ownership of the
> interface this 'generation' attribute allows for a flow like:
>
> gen=$(cat $report/generation)
> dd if=userdata > $report/inblob
> cat $report/outblob > report
> gen2=$(cat $report/generation)
>
> ...and if $gen2 is not $((gen + 1)) then tooling can detect that the
> "userspace is well behaved" expectation was violated.
>
> Now configs is slightly better in this regard because objects can be
> atomically created. So if 2 threads happen to pick the same name for the
> report object then 'mkdir' will only succeed for one while the other
> gets an EEXIST error. So for containers each can get their own
> submission interface without worrying about other containers.
>
>> I think I would word this to say,
>>
>> "Malicious or broken userspace has written enough times for
>> read_generation == write_generation by modular arithmetic without an
>> interim read. Stop accepting updates until the current report
>> configuration is read."
>
> That works for me.

That's a pretty theoretical problem. Under the assumption that one
syscall takes 50ns the wraparound on a 64bit variable will take ~29247
years to complete.

I think the important point is that the generation check there ensures
that the expected sequence takes place and can't be overwritten by
accident or malice, no?

Thanks,

tglx