Re: Sleeping thread not receive signal until it wakes up

From: Luong Ngo
Date: Fri Mar 09 2007 - 14:51:06 EST


On 3/9/07, Sergey Vlasov <vsu@xxxxxxxxxxx> wrote:
On Thu, 8 Mar 2007 14:52:07 -0800 Luong Ngo wrote:

[...]
> static irqreturn board_isr(int irq, void *dev_id, struct pt_regs* regs)
> {
> spin_lock(&dev->lock);
> if (dev->irqMask & (1 << irqBit)) {
> // Set the interrupt event mask
> dev->irqEvent |= (1 << irqBit);
>
> // Disable this irq, it will be reenabled after processed by board task
> disable_irq(irq);

I assume that your device does not support shared interrupts? If it
does (and a PCI device is required to support them), you cannot use
disable_irq() here (and you need to check a register in the device to
find out if it really did generate an IRQ)...

Yes, the device does not share interrupt.

> static int ats89_ioctl(struct inode *inode, struct file *file, u_int
> cmd, u_long arg)
> {
>
> switch(cmd){
> case GET_IRQ_CMD: {
> u32 regMask32;
>
> spin_lock_irq(dev->lock);
> while ((dev->irqMask & dev->irqEvent) == 0) {
> // Sleep until board interrupt happens
> spin_unlock_irq(dev->lock);
> interruptible_sleep_on(&(dev->boardIRQWaitQueue));
> if (uncond_wakeup) {
> /* don't go back to loop */
> break;
> }
> spin_lock_irq(dev->lock);
> }
>
> uncond_wakeup = 0;
>
> // Board interrupt happened
> regMask32 = dev->irqMask & dev->irqEvent;
> if(copy_to_user(&(((ATS89_IOCTL_S *)arg)->mask32),
> &regMask32, sizeof(u32))) {
> spin_unlock_irq(dev->lock);
> return -EAGAIN;
> }
>
> // Clear the event mask
> dev->irqEvent = 0;
> spin_unlock_irq(dev->lock);
> }
> break;
>
>
> }
> }

And this code is full of bugs:

1) As you have been told already, interruptible_sleep_on() and
sleep_on() functions are broken and should not be used (they are
left in the kernel only to support some obsolete code). Either
use wait_event_interruptible() or work with wait queues directly
(prepare_to_wait(), finish_wait(), ...).

I agree.but as I said our hardware will repeatedly raising
interrupts until it's serviced, the missing wakeup call would be
repeated also, so this should still wake up the sleep_on call. But we
would change it definitely.


2) The code to handle pending signals is missing - you need to have
this after wait_event_interruptible():

if (signal_pending(current))
return -ERESTARTSYS;

(but be careful - you might need to clean up something before
returning).

This is what causes your problem - interruptible_sleep_on()
returns if a signal is pending, but your code does not check for
signals and therefore invokes interruptible_sleep_on() again; but
if a signal is pending, interruptible_sleep_on() returns
immediately, causing your driver to eat 100% CPU looping in kernel
mode until some device event finally happens.

As pointed out by Robert, I added the checking
if(signal_pending(current))
return -ERESTARTSYS;
right after the line interruptible_sleep_on , but I don't see any
difference yet.

3) If uncond_wakeup is set, you break out of the loop with dev->lock
unlocked; however, if dev->irqEvent gets set, you exit the loop
with dev->lock locked. The subsequent code always unlocks
dev->lock, so in the uncond_wakeup case you have double unlock.

Thanks for catching it

4) You are doing copy_to_user() while holding a spinlock - this is
prohibited (as any other form of sleep inside a spinlock).

Thanks again. But may I ask if it is prohibited, how come it has
been running without any error?

5) The return code for the copy_to_user() failure is wrong - it
should be -EFAULT (this is not a fatal bug, but an annoyance for
users of your driver, who might get such nonstandard error codes
while debugging their programs and wonder what is going on).

changed.


Thank you for your input.

-LNgo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/