Re: [PATCH v2 2/3] fanotify: define struct members to hold response decision context
From: Jan Kara
Date: Thu May 05 2022 - 10:45:06 EST
On Tue 03-05-22 21:33:35, Richard Guy Briggs wrote:
> On 2022-05-02 20:16, Paul Moore wrote:
> > On Thu, Apr 28, 2022 at 8:45 PM Richard Guy Briggs <rgb@xxxxxxxxxx> wrote:
> > > This patch adds 2 structure members to the response returned from user
> > > space on a permission event. The first field is 16 bits for the context
> > > type. The context type will describe what the meaning is of the second
> > > field. The default is none. The patch defines one additional context
> > > type which means that the second field is a 32-bit rule number. This
> > > will allow for the creation of other context types in the future if
> > > other users of the API identify different needs. The second field size
> > > is defined by the context type and can be used to pass along the data
> > > described by the context.
> > >
> > > To support this, there is a macro for user space to check that the data
> > > being sent is valid. Of course, without this check, anything that
> > > overflows the bit field will trigger an EINVAL based on the use of
> > > FAN_INVALID_RESPONSE_MASK in process_access_response().
> > >
...
> > > static ssize_t fanotify_write(struct file *file, const char __user *buf, size_t count, loff_t *pos)
> > > {
> > > - struct fanotify_response response = { .fd = -1, .response = -1 };
> > > + struct fanotify_response response;
> > > struct fsnotify_group *group;
> > > int ret;
> > > + size_t size = min(count, sizeof(struct fanotify_response));
> > >
> > > if (!IS_ENABLED(CONFIG_FANOTIFY_ACCESS_PERMISSIONS))
> > > return -EINVAL;
> > >
> > > group = file->private_data;
> > >
> > > - if (count < sizeof(response))
> > > + if (count < offsetof(struct fanotify_response, extra_info_buf))
> > > return -EINVAL;
> >
> > Is this why you decided to shrink the fanotify_response:response field
> > from 32-bits to 16-bits? I hope not. I would suggest both keeping
> > the existing response field as 32-bits and explicitly checking for
> > writes that are either the existing/compat length as well as the
> > newer, longer length.
>
> No. I shrank it at Jan's suggestion. I think I agree with you that
> the response field should be kept at u32 as it is defined in userspace
> and purge the doubt about what would happen with a new userspace with
> an old kernel.
Hum, for the life of me I cannot find my response you mention here. Can you
send a link so that I can refresh my memory? It has been a long time...
> > > +
> > > +#define FANOTIFY_RESPONSE_EXTRA_LEN_MAX \
> > > + (sizeof(union { \
> > > + struct fanotify_response_audit_rule r; \
> > > + /* add other extra info structures here */ \
> > > + }))
> > > +
> > > struct fanotify_response {
> > > __s32 fd;
> > > - __u32 response;
> > > + __u16 response;
> > > + __u16 extra_info_type;
> > > + char extra_info_buf[FANOTIFY_RESPONSE_EXTRA_LEN_MAX];
> > > };
> >
> > Since both the kernel and userspace are going to need to agree on the
> > content and formatting of the fanotify_response:extra_info_buf field,
> > why is it hidden behind a char array? You might as well get rid of
> > that abstraction and put the union directly in the fanotify_response
> > struct. It is possible you could also get rid of the
> > fanotify_response_audit_rule struct this way too and just access the
> > rule scalar directly.
>
> This does make sense and my only concern would be a variable-length
> type. There isn't any reason to hide it. If userspace chooses to use
> the old interface and omit the type field then it defaults to NONE.
>
> If future types with variable data are defined, the first field could be
> a u32 that unions with the rule number that won't change the struct
> size.
Struct fanotify_response size must not change, it is part of the kernel
ABI. In particular your above change would break userspace code that is
currently working just fine (e.g. allocating 8 bytes and expecting struct
fanotify_response fits there, or just writing sizeof(struct
fanotify_response) as a response while initializing only first 8 bytes).
How I'd suggest doing it now (and I'd like to refresh my memory from my
past emails you mention because in the past I might have thought something
else ;)) is that you add another flag to 'response' field similar to
FAN_AUDIT - like FAN_EXTRA_INFO. If that is present, it means extra info is
to be expected after struct fanotify_response. The extra info would always
start with a header like:
struct fanotify_response_info_header {
__u8 info_type;
__u8 pad;
__u16 len; /* This is including the header itself */
}
And after such header there would be the 'blob' of data 'len - header size'
long. We use this same scheme when passing fanotify events to userspace
and it has proven to be lightweight and extensible. It covers the situation
when in the future audit would decide it wants other data (just change data
type), it would also cover the situation when some other subsystem wants
its information passed as well - there can be more structures like this
attached at the end, we can process the response up to the length of the
write.
Now these are just possible future extensions making sure we can extend the
ABI without too much pain. In the current implementation I'd just return
EINVAL whenever more than FANOTIFY_RESPONSE_MAX_LEN (16 bytes) is written
and do very strict checks on what gets passed in. It is also trivially
backwards compatible (old userspace on new kernel works just fine).
If you want to achieve compatibility of running new userspace on old kernel
(I guess that's desirable), we have group flags for that - like we
introduced FAN_ENABLE_AUDIT to allow for FAN_AUDIT flag in response we now
need to add a flag like FAN_EXTENDED_PERMISSION_INFO telling the kernel it
should expect an allow more info returning for permission events. At the
same time this is the way for userspace to be able to tell whether the
kernel supports this. I know this sounds tedious but that's the cost of
extending the ABI in the compatible way. We've made various API mistakes in
the past having to add weird workarounds to fanotify and we don't want to
repeat those mistakes :).
One open question I have is what should the kernel do with 'info_type' in
response it does not understand (in the future when there are possibly more
different info types). It could just skip it because this should be just
additional info for introspection (the only mandatory part is in
fanotify_response, however it could surprise userspace that passed info is
just getting ignored. To solve this we would have to somewhere report
supported info types (maybe in fanotify fdinfo in proc). I guess we'll
cross that bridge when we get to it.
Amir, what do you think?
Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR