Re: [PATCH 3/9] VFS: Introduce a mount context
From: Jeff Layton
Date: Wed May 10 2017 - 09:48:59 EST

On Wed, 2017-05-10 at 15:30 +0200, Miklos Szeredi wrote:
> On Wed, May 10, 2017 at 3:20 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > On Wed, 2017-05-10 at 09:05 +0100, David Howells wrote:
> > > Miklos Szeredi <mszeredi@xxxxxxxxxx> wrote:
> > >
> > > > Possible rule of thumb: use it only at the place where the error
> > > > originates and not where errors are just passed on. This would result
> > > > in at most one report per syscall, normally.
> > > >
> >
> > That might be hard to enforce in practice once you get into some
> > complicated layering. What if we have device_mapper setting this along
> > with filesystems too? We need clear rules here.
>
> If the error originates in the devicemapper, then why would the
> filesystem set it?
>
> There's always a root cause of an error and that should be where the
> detailed error is set.
>
> Am I missing something?
>

I was thinking that you'd need some well-defined way to tell whether the
string should be replaced. If the thing just hangs out across syscalls,
then you don't know when it got put there. Is it a leftover from a
previous syscall or did a lower layer just put it there?

But...maybe I'm making assumptions about how this would work and I
should just wait until there are patches in flight. Getting the lifetime
of these strings right will be crucial though.
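
To make the lifetime question concrete, here is a minimal userspace sketch
of one possible set of rules. It is not taken from the patchset, and every
name in it (struct task_err, task_err_syscall_enter(), task_err_set()) is
hypothetical. It assumes the string is freed on every syscall entry and
that only the first writer within a syscall gets to record a message, so
the layer where the error originates wins and anything found in the slot
cannot be a leftover from an earlier call.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a per-task error slot; not a real kernel struct. */
struct task_err {
	char *msg;	/* heap-allocated detail string, or NULL if none */
};

/* Model of "always free it on syscall entry": nothing survives a prior call. */
static void task_err_syscall_enter(struct task_err *te)
{
	free(te->msg);
	te->msg = NULL;
}

/*
 * Record a detailed error only if none has been set during this syscall.
 * Layers that merely pass the error upward therefore cannot overwrite the
 * message from the layer where it originated.
 * Returns 1 if the message was recorded, 0 otherwise.
 */
static int task_err_set(struct task_err *te, const char *fmt, ...)
{
	va_list ap, ap2;
	char *buf;
	int len;

	assert(fmt != NULL);
	if (te->msg)
		return 0;	/* a lower layer already reported */

	va_start(ap, fmt);
	va_copy(ap2, ap);
	len = vsnprintf(NULL, 0, fmt, ap);	/* measure the formatted length */
	va_end(ap);
	if (len < 0) {
		va_end(ap2);
		return 0;
	}

	buf = malloc((size_t)len + 1);
	if (buf)
		vsnprintf(buf, (size_t)len + 1, fmt, ap2);
	va_end(ap2);

	te->msg = buf;
	return buf != NULL;
}
```

With rules like these, a device-mapper-level message set first would be
kept, and a later attempt by the filesystem to set its own message during
the same syscall would be ignored. Whether first-writer-wins is the right
policy is exactly the sort of thing that would need to be spelled out.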
> >
> > > > And the static string thing that David implemented is also a very good
> > > > idea, IMO.
> > >
> > > There is an issue with it: it's fine as long as you keep a ref on the module
> > > that generated it or clear all strings as part of module removal (which the
> > > mount context in this patchset does). With the NFS mount context I did, I
> > > have to keep a ref on the NFS protocol module as well as the NFS filesystem
> > > module.
> > >
> > > I'm tempted to make it conditionally copy the string using kvasprintf_const()
> > > - which would also permit format substitution.
> > >
> >
> > On balance, I think this is a reasonable way to pass back detailed
> > errors. Up until now, we've mostly relied on just printk'ing them. Now
> > though, a lot of larger machines are running containerized setups. Good
> > luck scraping dmesg for _your_ error in that situation. There may be
> > tons of mounts failing all over the place.
> >
> > That said, I have some concerns here:
> >
> > What's the lifetime of these strings? Do they just hang around forever
> > until the process goes away or they're replaced? If this becomes common,
> > then you could easily end up with an extra string allocation per task in
> > some cases. That could add up.
>
> That's why I liked the static string thing. It's just one assignment
> and no worries about freeing. Not sure what to do about modules,
> though. Can we somehow move the cost of checking the validity to the
> place where the error is retrieved?
>

Seems a little dangerous, and could be limiting. Dynamically allocated
strings seem like they could be more useful.

> >
> > One idea might be to always kfree it on syscall entry, and that might
> > mitigate the problem assuming that not everything is erroring out. Then
> > you could always do some trivial syscall to clear it manually.
> >
> > There's also the problem of how these should be formatted. Is English ok
> > everywhere? Do we need a facility to allow translating these things?
>
> Messages in dmesg are in English too. If necessary userspace will do
> the translation. I don't think the kernel would need to worry about
> that.

Fair enough. It _is_ still an improvement over dmesg, IMO.
--
Jeff Layton <jlayton@xxxxxxxxxx>