Re: [PATCH 08/28] kdb: core for kgdb back end (2 of 2)

From: Scott Lurndal
Date: Thu Feb 18 2010 - 13:23:15 EST


On Thu, Feb 18, 2010 at 09:04:36AM -0600, Jason Wessel wrote:
> Eric W. Biederman wrote:
> > Jason Wessel <jason.wessel@xxxxxxxxxxxxx> writes:
> >
> >
> >> This patch contains the hooks and instrumentation into kernel which
> >> live outside the kernel/debug directory, which the kdb core
> >> will call to run commands like lsmod, dmesg, bt etc...
> >>
> >
> > You know this dropping the locks from vmalloc_info and swap_info
> > is down right ugly, and I don't believe it is safe. That code
> > was not designed to run while the write_lock is held.
> >
>
> Perhaps we can find some middle ground. I don't mind simply not
> allowing the information to be queried from kdb if the locks are not
> available.

IIRC the original KDB stopped all the CPUs when it was entered, so
locking to avoid concurrent access was not necessary when displaying
kernel data structures. However, KDB users and developers were assumed
to be aware that when KDB was entered the system was in an indeterminate
state, particularly with respect to linked lists and other non-tabular
data structures.

KDB code that displayed data kept in a non-tabular structure (linked
list, tree, etc.) was required both to validate each pointer it followed
and to detect loops (either by terminating the traversal after a certain
number of elements or by allowing the KDB user to terminate the traversal
with e.g. 'q').
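
To make that concrete, here is a minimal user-space sketch of such a
traversal. It is not the original KDB code. The names (struct node,
struct pool, ptr_is_valid, bounded_walk) are hypothetical, and the
"pointer is valid if it lies inside a known pool" check is only a
stand-in for whatever fault-safe probe a real debugger would use.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node type, for illustration only. */
struct node {
	struct node *next;
	int value;
};

/* Hypothetical region in which every legitimate node must live. */
struct pool {
	struct node *base;
	size_t count;
};

enum walk_status { WALK_OK = 0, WALK_BAD_POINTER = -1, WALK_LIMIT = -2 };

/*
 * Stand-in validity check (an assumption): accept a pointer only if it
 * is non-NULL, inside the pool, and aligned to a node boundary.
 */
static int ptr_is_valid(const struct pool *p, const struct node *n)
{
	uintptr_t addr = (uintptr_t)n;
	uintptr_t lo = (uintptr_t)p->base;
	uintptr_t hi = lo + p->count * sizeof(struct node);

	return n != NULL && addr >= lo && addr < hi &&
	       (addr - lo) % sizeof(struct node) == 0;
}

/*
 * Walk the list, validating each pointer before following it and
 * giving up after max_steps nodes so a looped list cannot hang us.
 * *visited receives the number of nodes actually visited.
 */
static enum walk_status bounded_walk(const struct pool *p,
				     const struct node *head,
				     size_t max_steps, size_t *visited)
{
	enum walk_status status = WALK_OK;
	const struct node *n = head;
	size_t steps = 0;

	while (n != NULL) {
		if (!ptr_is_valid(p, n)) {
			status = WALK_BAD_POINTER;
			break;
		}
		if (steps >= max_steps) {
			status = WALK_LIMIT;
			break;
		}
		steps++;
		n = n->next;
	}
	*visited = steps;
	return status;
}
```

The step limit plays the role of the "certain number of elements" cutoff;
an interactive 'q' check at a pager prompt would sit in the same place in
the loop.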

>
> It looks to me like the original kdb took the approach of calling the
> setjmp() longjmp() and if there was any kind of fault, it long jumped
> back to the original context. Obviously that doesn't solve any kind of
> problem with a list loop.

Yes. The list loop was expected to be handled either by the display
code terminating after some number of traversal steps or by the KDB user
terminating the command via the keyboard (e.g. 'q' at a more-type prompt).
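
For reference, the setjmp()/longjmp() recovery pattern mentioned in the
quoted text looks roughly like the sketch below. It is a user-space
illustration, not the original kdb code: the names (recover_point,
simulated_fault, display_items, run_guarded) are hypothetical, and the
"fault" is simulated by a flag on the item rather than taken from a real
trap handler.

```c
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical recovery point, armed before a display command runs. */
static jmp_buf recover_point;

/* Hypothetical item type; 'poisoned' simulates a bad dereference. */
struct item {
	struct item *next;
	int poisoned;
};

/* Stand-in for a fault handler: unwind to the armed recovery point. */
static void simulated_fault(void)
{
	longjmp(recover_point, 1);
}

/* Display loop with no error handling of its own. */
static void display_items(const struct item *head, size_t *shown)
{
	const struct item *it;

	for (it = head; it != NULL; it = it->next) {
		if (it->poisoned)
			simulated_fault();
		(*shown)++;
	}
}

/*
 * Run the display command under the recovery point. Returns 0 on a
 * clean run and -1 if a (simulated) fault unwound the command.
 */
static int run_guarded(const struct item *head, size_t *shown)
{
	*shown = 0;
	if (setjmp(recover_point) != 0)
		return -1;
	display_items(head, shown);
	return 0;
}
```

This recovers from a fault but, as noted, does nothing about a looped
list: a circular chain of valid items would still spin forever without a
step limit or a user interrupt.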

If the new KDB framework allows other CPUs to continue to run while kdb
data structure display commands are running, then much more care must
be taken in the display command code to keep inconsistent data from
causing loops or page faults (#PF).


scott