Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

From: Daniel Phillips
Date: Thu Aug 17 2006 - 00:46:27 EST


Evgeniy Polyakov wrote:
On Sun, Aug 13, 2006 at 01:16:15PM -0700, Daniel Phillips (phillips@xxxxxxxxxx) wrote:
Indeed. The rest of the corner cases like netfilter, layered protocol and
so on need to be handled, however they do not need to be handled right now
in order to make remote storage on a lan work properly. The sane thing for
the immediate future is to flag each socket as safe for remote block IO or
not, then gradually widen the scope of what is safe. We need to set up an
opt in strategy for network block IO that views such network subsystems as
ipfilter as not safe by default, until somebody puts in the work to make
them safe.

Just for clarification - it will be completely impossible to login using openssh or some other priveledge separation protocol to the machine due
to the nature of unix sockets. So you will be unable to manage your
storage system just because it is in OOM - it is not what is expected
from reliable system.

The system is not OOM, it is in reclaim, a transient condition that will be
resolved in normal course by IO progress. However you raise an excellent
point: if there is any remote management that we absolutely require to be
available while remote IO is interrupted - manual failover for example -
then we must supply a means of carrying out such remote administration, that
is guaranteed not to deadlock on a normal mode memory request. This ends up
as a new network stack feature I think, and probably a theoretical one for
the time being since we don't actually know of any such mandatory login
that must be carried out while remote disk IO is suspended.

But really, if you expect to run reliable block IO to Zanzibar over an ssh
tunnel through a firewall, then you might also consider taking up bungie
jumping with the cord tied to your neck.

Just pure openssh for control connection (admin should be able to
login).

And the admin will be able to, but in the cluster stack itself we don't
bless such stupidity as emailing an admin to ask for a login in order to
break a tie over which node should take charge of DLM recovery.

Regards,

Da niel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/