RE: [[PATCH v1] 02/37] [CIFS] SMBD: Add structure for SMBD transport
From: Long Li
Date: Mon Aug 14 2017 - 14:11:05 EST
> -----Original Message-----
> From: Stefan Metzmacher [mailto:metze@xxxxxxxxx]
> Sent: Monday, August 14, 2017 6:41 AM
> To: Long Li <longli@xxxxxxxxxxxxx>; Steve French <sfrench@xxxxxxxxx>;
> linux-cifs@xxxxxxxxxxxxxxx; samba-technical@xxxxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; linux-rdma@xxxxxxxxxxxxxxx; Christoph Hellwig
> <hch@xxxxxxxxxxxxx>
> Subject: Re: [[PATCH v1] 02/37] [CIFS] SMBD: Add structure for SMBD
> transport
>
> Hi Long,
>
> >> It seems that the new transport is tied to it's caller regarding
> >> structures and naming conventions.
> >>
> >> I think it would be better to strictly separate them, as I'd like to
> >> use the SMBDirect transport also from the userspace for the client
> >> side e.g. in Samba's '[lib]smbclient', but also in Samba's server side code
> 'smbd'.
> >
> > Thank you for reviewing the patch set.
> >
> > I think it is possible to separate the common code that implements the
> SMBDirect transport. There are some challenges to reuse the same code for
> both kernel and user spaces.
> > 1. Kernel mode RDMA verbs are similar but different to user-mode ones.
> > 2. Some RDMA features (e.g Fast Registration Work Request) are not
> available in user-mode.
> > 3. Locking and synchronization mechanism is different 4. Memory
> > management is different.
> > 5. Process creation/scheduling and data sharing between processes are
> different, and there is no user-mode code running in interrupt/softirq.
> >
> > Those needs to be abstracted through a layer, the rest of the code can be
> shared. I can work on this after patch set is reviewed.
>
> I guess this is a misunderstanding...
>
> I don't want to use that code and run it in userspace, I have a userspace
> prototype more or less working here, see
> https://git.samba.org/?p=metze/samba/wip.git;a=shortlog;h=refs/heads/m
> aster3-rdma
> and
> https://git.samba.org/?p=metze/samba/wip.git;a=blob;f=libcli/smb/smb_dir
> ect.c;h=9cc0d861ccfcbb4df9ef6ad85a7fe3d262e549c0;hb=85d46de6fdbba041
> d3e8004af46865a72d2b8405
>
> I goal is that we'll have an api that allows userspace code to use the kernel
> code SMBDirect code. This userspace code would get a file descriptor from
> the kernel and would be able to use it similar to a tcp socket.
> If the kernel would simulate the 4 byte length header, it's trivial to get to a
> stage were smbclient and smbd are able to support SMBDirect without much
> changes.
> We only need to replace connect(), listen(), accept() and a few more by
> SMBDirect versions.
This is possible. SMBDirect code can handle the first 4 bytes for upper layer.
>
> For the real data transfer we might be able to use memfd_create() or
> something similar to share the buffers between userspace and kernel.
You'll need to post RDMA read/write through the same QP created by SMBDirect in the kernel. I think this needs some more work but it's doable.
>
> I guess this is a long way, but having the basic SMBDirect code in dependently
> in the kernel would help a lot.
>
> >> Would it be possible to isolate this in smb_direct.c and smb_direct.h
> >> while using
> >> smb_direct_* prefixes for structures and functions? Also avoiding the
> >> usage of other headers from fs/cifs/*.h, expect for something generic like
> nterr.h.
> >
> > Sure I will make naming changes and clean up the header files.
>
> Thanks!
>
> >> I guess 'struct cifs_rdma_info' would then be 'struct
> smb_direct_connection'.
> >> And it won't have a reference to struct TCP_Server_Info.
> >
> > I will look for ways to remove reference to struct TCP_Server_Info . The
> reason why it has a reference to TCP_Server_Info is that: TCP_Server_Info
> represents a transport connection, although it also has many other TCP
> related code. SMBD needs to get to this connection TCP_Server_Info and set
> the transport status on shutdown (and maybe other situations).
> >
>
> Wouldn't it be better to provide a way to ask for the connection state and let
> the caller ask for the state instead of changing the callers structure?
>
> metze