Re: [PATCH net-next v2 1/4] vm_sockets: Include flags field in the vsock address data structure

From: Stefano Garzarella
Date: Wed Dec 09 2020 - 05:50:01 EST


On Tue, Dec 08, 2020 at 10:42:22AM -0800, Jakub Kicinski wrote:
> On Tue, 8 Dec 2020 20:23:24 +0200 Paraschiv, Andra-Irina wrote:
>>>> --- a/include/uapi/linux/vm_sockets.h
>>>> +++ b/include/uapi/linux/vm_sockets.h
>>>> @@ -145,7 +145,7 @@
>>>>
>>>> struct sockaddr_vm {
>>>> __kernel_sa_family_t svm_family;
>>>> - unsigned short svm_reserved1;
>>>> + unsigned short svm_flags;
>>>> unsigned int svm_port;
>>>> unsigned int svm_cid;
>>>> unsigned char svm_zero[sizeof(struct sockaddr) -
>>> Since this is a uAPI header I gotta ask - are you 100% sure that it's
>>> okay to rename this field?
>>>
>>> I didn't grasp from just reading the patches whether this is a uAPI or
>>> just internal kernel flag, seems like the former from the reading of
>>> the comment in patch 2. In which case what guarantees that existing
>>> users don't pass in garbage since the kernel doesn't check it was 0?
>>
>> That's always good to double-check the uapi changes don't break / assume
>> something, thanks for bringing this up. :)
>>
>> Sure, let's go through the possible options step by step. Let me know if
>> I get anything wrong and if I can help with clarifications.
>>
>> There is the "svm_reserved1" field that is not used in the kernel
>> codebase. It is set to 0 on the receive (listen) path as part of the
>> vsock address initialization [1][2]. The "svm_family" and "svm_zero"
>> fields are checked as part of the address validation [3].
>>
>> Now, with the current change to "svm_flags", the flow is the following:
>>
>> * On the receive (listen) path, the remote address structure is
>> initialized as part of the vsock address init logic [2]. Then patch 3/4
>> of this series sets the "VMADDR_FLAG_TO_HOST" flag given a set of
>> conditions (local and remote CID > VMADDR_CID_HOST).
>>
>> * On the connect path, the userspace logic can set the "svm_flags"
>> field. It can be set to 0 or 1 (VMADDR_FLAG_TO_HOST); or any other value
>> greater than 1. If the "VMADDR_FLAG_TO_HOST" flag is set, all the vsock
>> packets are then forwarded to the host.
>>
>> * When the vsock transport is assigned, the "svm_flags" field is
>> checked, and if it has the "VMADDR_FLAG_TO_HOST" flag set, it goes on
>> with a guest->host transport (patch 4/4 of this series). Otherwise,
>> other specific flag value is not currently used.
>>
>> Given all these points, the question remains what happens if the
>> "svm_flags" field is set on the connect path to a value higher than 1
>> (maybe a bogus one, not intended so). And it includes the
>> "VMADDR_FLAG_TO_HOST" value (the single flag set and specifically used
>> for now, but we should also account for any further possible flags). In
>> this case, all the vsock packets would be forwarded to the host and
>> maybe not intended so, having a bogus value for the flags field. Is this
>> possible case what you are referring to?
>
> Correct. What if user basically declared the structure on the stack,
> and only initialized the fields the kernel used to check?
>
> This problem needs to be at the very least discussed in the commit
> message.


I agree that could be a problem, but here are some considerations:
- I checked some applications (qemu-guest-agent, ncat, iperf-vsock) and they all use the same pattern: allocate memory, initialize the whole sockaddr_vm to zero (to make sure svm_zero is initialized), then set the cid and port fields.
So we should be safe, but of course this may not always be true.

- For now the issue could only affect nested VMs. We introduced this support one year ago, so it's fairly new and hopefully we wouldn't cause too many problems.
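
To make the concern concrete, here is a minimal userspace sketch (not kernel code) contrasting the pattern I found in those applications with the one Jakub describes. Everything named *_sketch / SKETCH_* is hypothetical: the struct only mirrors the layout from the diff, the padding size and the AF_VSOCK value (40 on Linux) are assumptions, and would_route_to_host() merely stands in for the flag check described for patch 4/4.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: AF_VSOCK is 40 on Linux; hard-coded so the sketch has no header dependency. */
#define SKETCH_AF_VSOCK     40
/* The thread gives VMADDR_FLAG_TO_HOST the value 1. */
#define SKETCH_FLAG_TO_HOST 0x0001

/* Illustrative mirror of struct sockaddr_vm after the rename; padding size is assumed. */
struct sockaddr_vm_sketch {
	unsigned short svm_family;
	unsigned short svm_flags;	/* previously svm_reserved1 */
	unsigned int   svm_port;
	unsigned int   svm_cid;
	unsigned char  svm_zero[4];
};

/* Pattern seen in the checked applications: zero everything, then set cid and port. */
static void fill_addr_zeroed(struct sockaddr_vm_sketch *a,
			     unsigned int cid, unsigned int port)
{
	assert(a != NULL);
	memset(a, 0, sizeof(*a));
	a->svm_family = SKETCH_AF_VSOCK;
	a->svm_cid = cid;
	a->svm_port = port;
}

/*
 * Risky pattern: only the fields the kernel used to validate are set.
 * The former svm_reserved1 keeps whatever was already in memory.
 */
static void fill_addr_partial(struct sockaddr_vm_sketch *a,
			      unsigned int cid, unsigned int port)
{
	assert(a != NULL);
	a->svm_family = SKETCH_AF_VSOCK;
	a->svm_cid = cid;
	a->svm_port = port;
	memset(a->svm_zero, 0, sizeof(a->svm_zero));
}

/* Stand-in for the transport-assignment check on the flag. */
static int would_route_to_host(const struct sockaddr_vm_sketch *a)
{
	assert(a != NULL);
	return (a->svm_flags & SKETCH_FLAG_TO_HOST) != 0;
}
```

With the first helper the flags field is always 0, so the behavior is unchanged. With the second one, stale memory that happens to have bit 0 set would be interpreted as a request to forward to the host.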

As an alternative, what about using 1 or 2 bytes from svm_zero[]?
These must be set to zero, even if we only check the first byte in the kernel.
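
A rough sketch of what that alternative could look like, only to illustrate the idea and not a proposed layout: both struct names are hypothetical, the 4-byte padding is an assumption, and the flags byte is simply carved out of the first byte of the old svm_zero[]. The helper reinterprets an address filled by an existing application under the alternative layout.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical mirror of the current layout; padding size is assumed. */
struct sockaddr_vm_old_sketch {
	unsigned short svm_family;
	unsigned short svm_reserved1;	/* never checked, may hold garbage */
	unsigned int   svm_port;
	unsigned int   svm_cid;
	unsigned char  svm_zero[4];
};

/* Hypothetical alternative: take the flags byte from svm_zero[] instead. */
struct sockaddr_vm_alt_sketch {
	unsigned short svm_family;
	unsigned short svm_reserved1;	/* left as it is today */
	unsigned int   svm_port;
	unsigned int   svm_cid;
	unsigned char  svm_flags;	/* was svm_zero[0] */
	unsigned char  svm_zero[3];
};

/* Flags value the alternative layout would read from an old-style address. */
static unsigned char flags_seen_by_alt_layout(const struct sockaddr_vm_old_sketch *old)
{
	struct sockaddr_vm_alt_sketch alt;

	assert(old != NULL);
	assert(sizeof(alt) == sizeof(*old));
	memcpy(&alt, old, sizeof(alt));
	return alt.svm_flags;
}
```

An application that left garbage in svm_reserved1 but zeroed svm_zero[], as it was already required to, would still be seen with flags equal to 0 under this layout.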

Thanks,
Stefano