Re: Kernel Routing sequence

From: Martin A. Brown
Date: Sun Aug 14 2005 - 17:23:25 EST



Greetings Al,

I can't speak for everybody else, but I'm quite unclear what problem you
are trying to solve. I have been watching this discussion, and figured I'd
take a stab at explaining a bit of the problem.

: Now:
: Host receives ping from 10.0.1.2/8 on 10.0.0.0/8 eth0
: Host replies to 10.0.1.2 using route 10.0.1.0/24 eth1.
:
: Host should have replied to 10.0.1.2 using route 10.0.0.0/8 eth0!
:
: Is it possible to instruct the Kernel to use the dest-mask instead of
: just letting it assume /32?

The kernel operates on a single packet at a time. I have made an attempt
at describing in English the decisions that the kernel makes when it
receives a packet [0].

Here are some givens which may help you:

- The kernel always selects a route in a routing table based on the
  longest prefix match. (That may also explain to you why the routing
  table is usually printed out starting with the most specific routes at
  the top.)
- A single packet always has a single destination.
- In simplest terms, the selected route can be determined exclusively
  from the longest prefix match of the destination IP.
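
To make the givens above concrete, here is a minimal sketch in Python of
longest prefix matching. It is only a model, not the kernel's code. The
two routes are taken from your example; the names ROUTES, matching_routes
and select_route are made up for illustration.

```python
import ipaddress

# Hypothetical routing table modelled on the example quoted above:
# (destination prefix, outgoing device).
ROUTES = [
    (ipaddress.ip_network("10.0.0.0/8"), "eth0"),
    (ipaddress.ip_network("10.0.1.0/24"), "eth1"),
]


def matching_routes(dest, routes=ROUTES):
    """Return every route whose prefix contains the destination address."""
    addr = ipaddress.ip_address(dest)
    return [(net, dev) for net, dev in routes if addr in net]


def select_route(dest, routes=ROUTES):
    """Return the single route with the longest matching prefix, or None."""
    candidates = matching_routes(dest, routes)
    if not candidates:
        return None
    # The most specific route (largest prefix length) wins.
    return max(candidates, key=lambda route: route[0].prefixlen)


if __name__ == "__main__":
    print(matching_routes("10.0.1.2"))  # both routes are candidates
    print(select_route("10.0.1.2"))     # only the /24 is selected
```

In this model 10.0.1.2 matches both prefixes, but the /24 is more
specific than the /8, so the reply leaves via eth1. The interface the
ping arrived on plays no part in the lookup.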

When you use the command "ip route match $DEST", you are not asking the
kernel the same question you are asking us. When you ask for
"ip route match $DEST", you are asking the kernel to furnish you a list
of all potential matching routes to that destination. In fact, only a
single route will be selected for an individual packet, and that selected
route will always be for $DEST/32.

I think you'd have more luck in understanding what's happening by using
the "ip route get" command. This essentially walks through the kernel's
routing decision for one destination and shows the single route selected.
Several people in prior answers in this thread have mentioned policy
routing. Policy routing enables you to select among multiple routing
tables.

I note that you also mentioned "ESTABLISHED" before, and it was unclear
whether you meant ESTABLISHED in the netfilter context or ESTABLISHED in
terms of a TCP connection, but in either case, there is another part of
the kernel which is aware that this packet is part of a larger TCP
session. The routing code is completely unaware of this. You could
change this with policy routing (think netfilter fwmark and fwmark-based
routing, CONNMARK or similar).
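
As a rough illustration of fwmark-based policy routing, here is a small
Python model. It is a sketch under assumptions, not kernel code: rules
are tried in priority order, a rule may require a particular mark, and
the first table that yields a route wins. The table name "via_eth0", the
mark value 1 and the rule priorities are hypothetical; the routes come
from your example.

```python
import ipaddress

# Hypothetical routing tables: table name -> list of (prefix, device).
TABLES = {
    "main": [("10.0.0.0/8", "eth0"), ("10.0.1.0/24", "eth1")],
    "via_eth0": [("10.0.0.0/8", "eth0")],
}

# Hypothetical rules: (priority, required fwmark or None for any, table).
RULES = [
    (100, 1, "via_eth0"),
    (32766, None, "main"),
]


def longest_prefix_match(table, dest):
    """Return (network, device) of the most specific match, or None."""
    addr = ipaddress.ip_address(dest)
    hits = []
    for prefix, dev in table:
        net = ipaddress.ip_network(prefix)
        if addr in net:
            hits.append((net, dev))
    return max(hits, key=lambda hit: hit[0].prefixlen, default=None)


def policy_lookup(dest, fwmark=0):
    """Walk the rules in priority order and return (table, device) or None."""
    for _priority, mark, table in sorted(RULES, key=lambda rule: rule[0]):
        if mark is not None and mark != fwmark:
            continue  # this rule does not apply to the packet's mark
        hit = longest_prefix_match(TABLES[table], dest)
        if hit is not None:
            return table, hit[1]
    return None


if __name__ == "__main__":
    print(policy_lookup("10.0.1.2"))            # unmarked packet
    print(policy_lookup("10.0.1.2", fwmark=1))  # marked packet
```

In this model an unmarked packet to 10.0.1.2 falls through to the main
table and leaves via eth1, while a packet carrying mark 1 is looked up
in the other table and leaves via eth0. Setting the mark (for example
from connection state) is the job of netfilter and is not modelled here.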

Good luck, Al.

-Martin

[0] http://linux-ip.net/html/routing-selection.html

--
Martin A. Brown --- SecurePipe, Inc. --- mabrown@xxxxxxxxxxxxxx
