linux-2.4.20_mvl31 OOPS ip_route_input called within ip_rt_init

From: Cam
Date: Thu Jul 22 2004 - 12:57:25 EST


Hi,

I have an embedded ppc linux which sometimes can oops at boot time if it receives a packet during ip_rt_init. There is a printk in there in the middle of initialisation of rt_hash_table which can introduce a delay which makes the problem more likely to happen,

Output before the oops:
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
IP: routing cache hash table of 4096 buckets, 32Kbytes
<4>Oops: kernel access of bad area, sig: 11

Decoded trace (my arrows):

Trace; c00ffcfc <ip_route_input+40/21c> <--
Trace; c010339c <ip_rcv_finish+24c/258>
Trace; c00f0af8 <netif_receive_skb+118/224>
Trace; c00f0ca0 <process_backlog+9c/178>
Trace; c00f0e14 <net_rx_action+98/18c>
Trace; c001cfec <do_softirq+134/138>
Trace; c000723c <do_IRQ+c4/c8>
Trace; c00072c0 <preempt_intercept+80/84>
Trace; c0005ecc <ret_from_intercept+0/8c>
Trace; c0068d9c <proc_register+1c/f4>
Trace; c00173a8 <call_console_drivers+c0/160>
Trace; c00177cc <release_console_sem+84/130>
Trace; c0017630 <printk+170/1f0>
Trace; c0198294 <ip_rt_init+128/2c8> <--
Trace; c0198530 <ip_init+1c/5c>
Trace; c0198d74 <inet_init+f0/220>
Trace; c01896b0 <do_initcalls+30/50>
Trace; c0003b40 <init+28/200>
Trace; c000894c <arch_kernel_thread+30/3c>

Code:

int ip_route_input(struct sk_buff *skb, u32 daddr, u32 saddr,
u8 tos, struct net_device *dev)
{
struct rtable * rth;
unsigned hash;
int iif = dev->ifindex;

tos &= IPTOS_RT_MASK;
hash = rt_hash_code(daddr, saddr ^ (iif << 5), tos);

read_lock(&rt_hash_table[hash].lock); <-- oops here!!

[...]

void __init ip_rt_init(void)
{

[...]

rt_hash_table = (struct rt_hash_bucket *)
__get_free_pages(GFP_ATOMIC, order);
} while (rt_hash_table == NULL && --order > 0);

if (!rt_hash_table)
panic("Failed to allocate IP route cache hash table\n");

printk(KERN_INFO "IP: routing cache hash table of %u buckets, %ldKbytes\n",
rt_hash_mask,
(long) (rt_hash_mask * sizeof(struct rt_hash_bucket)) / 1024);

for (rt_hash_log = 0; (1 << rt_hash_log) != rt_hash_mask; rt_hash_log++)
/* NOTHING */;

rt_hash_mask--;
for (i = 0; i <= rt_hash_mask; i++) {
rt_hash_table[i].lock = RW_LOCK_UNLOCKED;
rt_hash_table[i].chain = NULL;
}


Is this a known problem?

What can be done to prevent this? Should the device be held inactive somehow until ip_rt_init has completed? Should I just move the printk to later in the code and hope for the best?

Any info gratefully received,

-Cam
--
camilo@xxxxxxxxxxxx <--
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/