Re: [BUGFIX 2/2] gdth: bugfix for the Timer at exit crash

From: James Bottomley
Date: Wed Feb 13 2008 - 12:05:29 EST


On Wed, 2008-02-13 at 18:50 +0200, Boaz Harrosh wrote:
> On Wed, Feb 13 2008 at 18:45 +0200, James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:
> > On Wed, 2008-02-13 at 18:33 +0200, Boaz Harrosh wrote:
> >> On Wed, Feb 13 2008 at 17:54 +0200, Boaz Harrosh <bharrosh@xxxxxxxxxxx> wrote:
> >>> On Wed, Feb 13 2008 at 17:44 +0200, James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:
> >>>> On Tue, 2008-02-12 at 19:40 +0200, Boaz Harrosh wrote:
> >>>>> - gdth_flush(ha);
> >>>>> -
> >>>> This piece doesn't look right. gdth_flush() forces the internal cache
> >>>> to disk backing. If you remove it, you're taking the chance that the
> >>>> machine will be powered off without a writeback which can cause data
> >>>> corruption.
> >>>>
> >>>> James
> >>>>
> >>> Yes.
> >>> More problems with exit have been reported, and I am about to send one more
> >>> patch, which has been tested, that puts this back in.
> >>>
> >>> So I will resend this one plus one new one.
> >>>
> >>> Boaz
> >>>
> >> The gdth driver calls register_reboot_notifier(&gdth_notifier), which
> >> points at a gdth_halt() function. That function then redoes half of what
> >> gdth_exit() does, does it wrongly, and crashes.
> >>
> >> Are we guaranteed in today's kernel that a module's .exit function will be
> >> called on a halt or reboot? If so, then there is no need for duplication and
> >> gdth_halt() should go.
> >
> > No. The __exit section is actually discardable if you promise never to
> > remove the module.
> >
> I don't understand; please explain.
> What does a driver need to do if it needs a consistent shutdown routine?
> Module or built-in? Unload or shutdown?

It needs to register a reboot notifier, which gdth does.

However, the notifier is only called on reboot, so the driver also needs
to clean up correctly on module exit.
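
As a rough illustration of the two paths (a userspace model only, not gdth
code: struct hba, hba_flush(), hba_quiesce(), on_reboot() and
on_module_exit() are hypothetical names), both entry points can share one
teardown routine guarded by a flag, so whichever runs first does the flush
and the other becomes a no-op:

```c
#include <stdbool.h>

/* Hypothetical stand-in for one controller instance (not the real gdth struct). */
struct hba {
    bool quiesced;     /* set once teardown has run */
    int  flush_count;  /* number of cache writebacks actually issued */
};

/* Stand-in for gdth_flush(): write the controller cache back to disk. */
static void hba_flush(struct hba *ha)
{
    ha->flush_count++;
}

/* Shared teardown: safe to call from either path, and more than once. */
static void hba_quiesce(struct hba *ha)
{
    if (ha->quiesced)
        return;            /* the other path already cleaned up */
    hba_flush(ha);
    ha->quiesced = true;
}

/* Models the reboot-notifier callback (the role gdth_halt() plays). */
static void on_reboot(struct hba *ha)
{
    hba_quiesce(ha);
}

/* Models the module exit path (the role gdth_exit() plays). */
static void on_module_exit(struct hba *ha)
{
    hba_quiesce(ha);
}
```

In a real driver the flag would need whatever locking the driver already
uses; the sketch only shows that the two paths need not duplicate each
other's work.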

The alternative for GDTH would be to process the SCSI SYNCHRONIZE CACHE
command. That's done by a shutdown notifier from sd, so the correct
thing would always get done; however it does mean the driver has to be
in a condition to process the last sync cache command.
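
A minimal sketch of what that alternative implies, again as a userspace
model with hypothetical names (struct ctrl and ctrl_queue_command() are
not gdth symbols). The only fact taken from the SCSI spec is that
SYNCHRONIZE CACHE (10) has opcode 0x35; everything else is an assumption:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SYNCHRONIZE_CACHE_10 0x35  /* SCSI opcode for SYNCHRONIZE CACHE (10) */

/* Hypothetical controller state for this model. */
struct ctrl {
    bool online;   /* false once the driver has torn itself down */
    int  flushes;  /* number of cache writebacks performed */
};

/*
 * Model of a command entry point.
 * Returns 0 if the command was accepted, -1 if it cannot be serviced.
 */
static int ctrl_queue_command(struct ctrl *c, const uint8_t *cdb, size_t len)
{
    if (!c->online || len == 0)
        return -1;         /* too late: the last sync cache would be lost */

    if (cdb[0] == SYNCHRONIZE_CACHE_10) {
        c->flushes++;      /* stand-in for the gdth_flush() writeback */
        return 0;
    }

    return 0;              /* other commands are ignored in this model */
}
```

The -1 branch is the condition to avoid: if the driver has already shut
itself down when sd sends its final sync cache, the writeback never
happens.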

For the quick fix, just keep the current infrastructure and put the
gdth_flush() call back where it can be effective.

James

