Probable module bug in linux-2.6.5-1.358
From: Richard B. Johnson
Date: Wed Oct 06 2004 - 17:20:19 EST
The attached script shows that an attempt to open a device
after its module was removed, will seg-fault the kernel.
Cheers,
Dick Johnson
Penguin : Linux version 2.6.5-1.358-noreg on an i686 machine (5537.79 BogoMips).
Note 96.31% of all statistics are fiction.
Kernel 2.6.5-1.358 from Red Hat Fedora has a bug that
causes a kernel oops when attempting to open a device
that had a module removed.
In the following, the first attempt at a user-mode open()
causes the module to be properly loaded by modprobe, the
module alias having been put into /etc/modprobe.conf.
The module can be removed and inserted many times, each
time the resources are deallocated and the module has no
problem obtaining those resources again when it's reloaded.
However, only the FIRST automatic load works. Any attempt
to access the device (open it) after it has been removed,
will not cause it to be reloaded. Instead the kernel will
seg-fault during a call to sys_open().
The virtual address 0x222e7880 is not in the symbol file and
seems to be the address near where the open() in my module
used to be before it was removed.
I can duplicate this and a printk("%p\n", open); shows
0x222e74a0, just after I boot. Subsequent loads show other
addresses.
Oct 6 17:03:30 chaos kernel: Analogic Corp Datalink Driver : Module removed
Oct 6 17:03:38 chaos kernel: Unable to handle kernel paging request at virtual address 222e7880
Oct 6 17:03:38 chaos kernel: printing eip:
Oct 6 17:03:38 chaos kernel: 021556ad
Oct 6 17:03:38 chaos kernel: *pde = 1f30c067
Oct 6 17:03:38 chaos kernel: *pte = 00000000
Oct 6 17:03:38 chaos kernel: Oops: 0000 [#1]
Oct 6 17:03:38 chaos kernel: SMP
Oct 6 17:03:38 chaos kernel: CPU: 0
Oct 6 17:03:38 chaos kernel: EIP: 0060:[<021556ad>] Not tainted
Oct 6 17:03:38 chaos kernel: EFLAGS: 00010206 (2.6.5-1.358-noreg)
Oct 6 17:03:38 chaos kernel: EIP is at cdev_get+0x14/0x6a
Oct 6 17:03:38 chaos kernel: eax: 20dee000 ebx: 222e7880 ecx: 1f564a80 edx: 13d19910
Oct 6 17:03:38 chaos kernel: esi: 00000000 edi: 1edae208 ebp: 00000000 esp: 20deef20
Oct 6 17:03:38 chaos kernel: ds: 007b es: 007b ss: 0068
Oct 6 17:03:38 chaos kernel: Process ftest (pid: 3728, threadinfo=20dee000 task=1f7a0130)
Oct 6 17:03:38 chaos kernel: Stack: 13d19910 00000000 021554bc 13d19910 1f564a80 1f564a80 1edae208 2138bf80
Oct 6 17:03:38 chaos kernel: 1d49ac8c 0214d443 1edae208 1f564a80 00000002 00000003 1685a000 20dee000
Oct 6 17:03:38 chaos kernel: 0214d364 20e18180 2138bf80 00000002 20e18180 2138bf80 00000ffc 00000001
Oct 6 17:03:38 chaos kernel: Call Trace:
Oct 6 17:03:38 chaos kernel: [<021554bc>] chrdev_open+0xb2/0x18b
Oct 6 17:03:38 chaos kernel: [<0214d443>] dentry_open+0xd7/0x19b
Oct 6 17:03:38 chaos kernel: [<0214d364>] filp_open+0x41/0x49
Oct 6 17:03:38 chaos kernel: [<0214d703>] sys_open+0x31/0x82
Oct 6 17:03:38 chaos kernel:
Oct 6 17:03:38 chaos kernel: Code: 83 3b 02 8b 40 10 74 0e c1 e0 07 8d 04 18 ff 80 00 01 00 00
Oct 6 17:08:11 chaos syslogd 1.4.1: restart.
Script started on Wed 06 Oct 2004 05:09:06 PM EDT
# sync
# ./ftest
Make sure a single board exists in a PCI slot and
channel A output is connected to channel B input.
Hit [Enter] to continue...
Enabling both RX and TX
Waiting for synchronization...
Waiting for synchronization...
Mailbox-A message = 0000000f
Mailbox-B message = 0000000f
Len = 6 Mb, time = 104468.00 us, rate = 64.334954 bytes/usec
[SNIPPED...]
Works...
PCI slot = 516
Driver version = V2.01
FPGA version = 61128V00R000
# ps laxw
F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
4 0 1 0 16 0 3124 464 - S ? 0:04 init [5]
[SNIPPED...]
4 0 3128 3127 15 0 6080 1276 wait4 S pts/0 0:00 bash -i
1 5418 3137 1 15 0 0 0 - SW ? 0:00 [DLB daemon]
This shows the kernel daemon that manages the link.
4 0 3140 3128 17 0 3824 584 - R pts/0 0:00 ps laxw
# cat /proc/iomem
00000000-0009fbff : System RAM
0009fc00-0009ffff : reserved
000a0000-000bffff : Video RAM area
000c0000-000cfbff : Video ROM
000d0000-000d0bff : Adapter ROM
000d1000-000d57ff : Adapter ROM
000f0000-000fffff : System ROM
00100000-1f3fffff : System RAM
00100000-002a8fff : Kernel code
002a9000-0034fa7f : Kernel data
1f400000-1f4003ff : 0000:00:1f.1
1f401000-22403fff : Analogic Corp Datalink Driver
e4500000-f45fffff : PCI Bus #01
e8000000-efffffff : 0000:01:00.0
f4580000-f45fffff : 0000:01:00.0
f8000000-fbffffff : 0000:00:00.0
fc700000-fe7fffff : PCI Bus #01
fd000000-fdffffff : 0000:01:00.0
fe900000-fe9fffff : 0000:02:04.0
fe900000-fe9fffff : Analogic Corp Datalink Driver
feafc000-feafcfff : 0000:02:08.0
feafc000-feafcfff : e100
feafd000-feafdfff : 0000:02:07.0
feafe800-feafe9ff : 0000:02:04.0
feafe800-feafe9ff : Analogic Corp Datalink Driver
feafec00-feafec7f : 0000:02:01.0
feaff000-feafffff : 0000:02:00.0
feaff000-feafffff : aic7xxx
febff400-febff4ff : 0000:00:1f.5
febff800-febff9ff : 0000:00:1f.5
febffc00-febfffff : 0000:00:1d.7
# sync
This shows resources used.
# rmmod HeavyLink
I remove the module okay.
# cat /proc/iomem
00000000-0009fbff : System RAM
0009fc00-0009ffff : reserved
000a0000-000bffff : Video RAM area
000c0000-000cfbff : Video ROM
000d0000-000d0bff : Adapter ROM
000d1000-000d57ff : Adapter ROM
000f0000-000fffff : System ROM
00100000-1f3fffff : System RAM
00100000-002a8fff : Kernel code
002a9000-0034fa7f : Kernel data
1f400000-1f4003ff : 0000:00:1f.1
e4500000-f45fffff : PCI Bus #01
e8000000-efffffff : 0000:01:00.0
f4580000-f45fffff : 0000:01:00.0
f8000000-fbffffff : 0000:00:00.0
fc700000-fe7fffff : PCI Bus #01
fd000000-fdffffff : 0000:01:00.0
fe900000-fe9fffff : 0000:02:04.0
feafc000-feafcfff : 0000:02:08.0
feafc000-feafcfff : e100
feafd000-feafdfff : 0000:02:07.0
feafe800-feafe9ff : 0000:02:04.0
feafec00-feafec7f : 0000:02:01.0
feaff000-feafffff : 0000:02:00.0
feaff000-feafffff : aic7xxx
febff400-febff4ff : 0000:00:1f.5
febff800-febff9ff : 0000:00:1f.5
febffc00-febfffff : 0000:00:1d.7
The resources are properly freed.
# ps laxw
F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
4 0 1 0 16 0 3124 464 - S ? 0:04 init [5]
[SNIPPED...]
4 42 3088 2989 16 0 20992 9148 - S ? 0:00 /usr/bin/gdmgreeter
4 0 3089 2624 15 0 4676 1336 wait4 S tty1 0:00 -bash
4 0 3126 3089 15 0 4632 464 - S tty1 0:00 script
5 0 3127 3126 15 0 4636 496 - S tty1 0:00 script
4 0 3128 3127 15 0 6080 1276 wait4 S pts/0 0:00 bash -i
4 0 3154 3128 17 0 2444 588 - R pts/0 0:00 ps laxw
The daemon is gone, properly exited.
# insmod HeavyLink.ko
Insert again. Works.
# ./ftest
Make sure a single board exists in a PCI slot and
channel A output is connected to channel B input.
Hit [Enter] to continue...
Enabling both RX and TX
Waiting for synchronization...
Waiting for synchronization...
Mailbox-A message = 0000000f
Mailbox-B message = 0000000f
Len = 8 Mb, time = 134457.00 us, rate = 64.362763 bytes/usec
Mailbox-B message = 00000000
[SNIPPED...]
Works fine.....
#
# lsmod | grep Heavy
HeavyLink 40456 0
# rmmod HeavyLink
a
Remove the module as before. Fine.
# sync
# ./ftest
Now, try to open the device without first manually inserting
the module.
Segmentation fault
DEATH with a console screen display that doesn't get saved anywhere.
The last call was sys_open()
# exit
Script done on Wed 06 Oct 2004 05:12:20 PM EDT