Re: smbfs: page write patch, dropping mounts, tunneling in 2.2.1

Michael H. Warfield (mhw@wittsend.com)
Wed, 17 Feb 1999 20:32:39 -0500 (EST)


vherva@turing.netspan.fi enscribed thusly:
> > From: Trond Myklebust <trond.myklebust@fys.uio.no>
> > Date: 31 Jan 1999 11:30:08 +0100
> > Subject: SMBfs: bug in 2.2.1 ?

> > I've noticed what seems to be a bug in the SMBfs page write code. It seems
> > that the code in generic_file_write has changed fairly recently, and
> > it is now a bug for the 'updatepage' code to clear the page lock and
> > serve the wait queue. Unfortunately nobody seems to have removed the
> > call to 'smb_unlock_page(page)' in 'smb_writepage_sync'.

There seems to be some code changes in smb_updatepage in 2.2.1ac6
that may have been in some earlier ac changes. This may have fixed
the problem in a different style. It adds some setbit and clearbit as
well as an smb_unlock_page... Sound related...

> > Would people who've been seeing problems with the smbfs client in
> > 2.2.1 please try the appended one-liner and see if it works?

> I'm not aware of what kind of symtoms the bug shoud have caused, but it seems
> that it fixed at least one on my 2.2.1-ac5. Before, the mounts would drop when
> NT peer would close the socket. After this, any access to the mount -- ls, cd,
> umount, or even df -- would hang and refuse to obey SIGKILL. After applying the
> patch, the mounts still drop, but the hanging processes can be SIGINT'ed.
> Moreover the processes seem to hang less often. I seem to remember similar
> enhancement updating from 2.0.34 to 2.0.36 -- perhaps there was a similar
> change in the code?

I don't see any changes in the patch for 2.2.1ac6 that would account
for this. There are some changes for some page locking, but I'm not
sure just when they went in. My fix for smb_fs.h went in back in ac4
or ac5 (I forget which).

I've been trying to find out who is maintaining smbfs in the kernel
(I maintain smbmount in the Samba package). Last I knew, it was Bill
Hawes but attempts to contact him have been met with silence. I can't
say that I am quite ready to put out a call for a new maintainer nor
will I say I'm prepared to step up to the plate myself. I would like to
hear from Bill before anything else...

> The dropping mounts, however, were not cured. With 2.0.36 there is no problem
> with NT (or other smb server) closing the socket -- smbfs seems to transparently
> reconnect when needed. With 2.2.1, the mounts drop after few minutes of idle time

Huh??? The old smbfs did not contain any retry logic for
restablishing connections if they dropped. That was WHY the old
smbmount was dropped in favor of one based on Samba. Volker did
that long before I came on the scene. It is true that the connections
were cleaned up and you could remount, but smbfs had no logic for
redoing the login and tree connect if the server at the other end was
lost. That's what the new daemon code in the Samba smbmount is for.

> when NT closes the socket. (With samba, the problem does not show up, at least
> that often, since samba does not close the socket all that often.) I suppose this
> behaviour has something to do with the recent smbfs delayed mount discussion. As

Not that I'm aware...

> I understood, in 2.0 the smbmount would fork a child that did remounting in the
> background every time smbfs asked it to. Now that the smbmount scheme has changed,
> this does not happen any more, am I correct?

No. This is the way it is suppose to work and should now be working.

We had some problems with the daemon code that are finally fixed.
Originally, the Samba version of smbmount would go daemon and immediately
exit. The daemon would then establish the mount and work from there.
Unfortunately, there was a timing problem with this when used with automatic
processes such as scripts and autofs. The smbmount parent would exit back
to the calling process before the mount was complete. This caused various
amusing random acts of terrorism. :-)

My original fix for this problem in Samba 2.0.0 seemed to fix the
timing problem but broke the reconnect logic (wrong process was registered
with the smbfs module). When I managed to fix BOTH of these problems, I
ran smack into a deadly embrace problem between autofs and smbmount that
had been there since day one. It turned out that the timing problem
allowed the parent to exit prematurely and break the deadly embrace. My
original fix didn't run into the deadly embrace because it only affected
the daemon child and I had the parent doing the initial mount. Thanks
to a critical hint from HPA, I was finally able to fix the timing problem
and the deadly embrace and the process registration problem together. That
fix is in Samba 2.0.2 along with one fix for a glibc / kernel problem in
smbumount that would cause a "not an SMB filesystem" error when attempting
to unmount.

There still remained a couple of niggling problems including one
more uid_t / __kernel_uid_t conflict in smbumount and a reconnect problem
in the case of passwords entered from the keyboard. There was also a
problem where smbmount would not clean up a mount if it later ran into
a fatal error and exited the system. That left the mount point in a very
bad state that resulted in I/O errors and refusals to mount. These are
now fixed and in the CVS tree as well as the Samba snapshots.

I still have a remaining problem that I'm not getting /etc/mnttab
cleaned up properly when smbmount fails with a fatal error on a retry
and reconnect, but that's better than what it was doing. I hope to have
that one cleared before too long so I can start working on some sort of
syntax compatibility mode.

I also had a problem (just solved in the last 10 seconds) with
smbmount working through autofs. I have a shim script which does the
syntax translations from the old syntax to the new syntax but it always
seemed to hang if I just let smbmount pipe it's output back to autofs
and over to syslog. I had to dump that output to a file and then cat
it out to stdout in order to avoid the hang. That turns out to be a
problem whenever debugging is enabled in smbmount. Apparently automount
is waiting on the input pipe to close rather than waiting for the parent
to exit. In debug mode the daemon was holding the pipe open and automount
was never allowed to run. The fix for that for now is to turn debugging
off and label it with a big bad warning not to turn it on if autofs is
being used. I'm not sure if the "proper" fix is to change autofs/automount
to realize that the process exited even if the pipe remains open or have
smbmount detect that it's a pipe on stdout and always close it even when
in debugging mode. Damned if you do and damned if you don't here...

That last fix will be in the Samba CVS tree in a few minutes and
I will be updating my smbmount.sh script a few days later after that
change has been propagated a bit more.

I'll post some more information on the shim script on my web site
a little later...

<http://www.wittsend.com/mhw/smbmount.html>

> There's yet one more thing strange thing with smbfs. I use ssh to create a secure
> channel through which I -- among other things -- mount shares with smbfs. If the
> throughput without ssh is, say, 200KB/s, through ssh it is only 20-40KB/s. With
> smbclient, however, the throughput only drops to something like 180KB/s. With
> sharity through ssh, the performance is ~150KB/s. The slowdown is not specific to
> ssh tunneling, tunneling through socket (the executable that creates a socket to
> a specific destination) had similar effect. I seem to remember that NT had
> similar loss of smb performance when tunneling through ssh (which was a heck of
> a battle to get working at the first place.)
>
> Obviously, there is something in the smbfs code that causes the tunneling to be
> slow. I took a quick look, but did not find anything. Could anyone point me
> an exact spot to suspect, so I could compare that to the smbclient equivalent.

Can't help on this one at this time...

> -- v --
>
> --
> Ville Herva Ville.Herva@netspan.fi +358-50-5164500
> Netspan Oy netspan@netspan.fi PL 65 FIN-02151 Espoo
> http://www.netspan.fi
> For my PGP key, finger vherva@netspan.fi.

Mike

-- 
 Michael H. Warfield    |  (770) 985-6132   |  mhw@WittsEnd.com
  (The Mad Wizard)      |  (770) 925-8248   |  http://www.wittsend.com/mhw/
  NIC whois:  MHW9      |  An optimist believes we live in the best of all
 PGP Key: 0xDF1DD471    |  possible worlds.  A pessimist is sure of it!

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/