You can page out page tables; that's just a normal "page missing" type
thing. Linux doesn't do it, because it adds a lot of races, and I don't
think the complexity is worth it for the memory you can (potentially) get
back. I generally try to optimize for the "there is enough memory" case
anyway.
What I mean when I say that the page table lookups are done using
physical addressing is just that all the page tables contain physical
page addresses, and the actual logic that does the virtual->physical
translation never enters itself "recursively" - it does all the lookups
using the physical addresses found in the page tables.
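A minimal sketch of that kind of lookup, as a toy model rather than actual Linux or hardware code. It assumes a 32-bit, two-level layout with 4 KB pages, and simulates physical memory with an array. All names here (phys_mem, phys_read, walk, demo_setup) are hypothetical and exist only for illustration. The point is that every table access goes through phys_read(), so the walk itself never needs a translation, and a non-present entry is reported explicitly by the walker.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE   4096u
#define PTE_PRESENT 0x1u
#define NWORDS      (8 * 1024)   /* 8 simulated 4 KB frames of 32-bit words */

/* Simulated physical memory. Indexing this array stands in for a
 * physical access that bypasses address translation entirely. */
static uint32_t phys_mem[NWORDS];

static uint32_t phys_read(uint32_t paddr)
{
    assert(paddr / 4 < NWORDS);
    return phys_mem[paddr / 4];
}

static void phys_write(uint32_t paddr, uint32_t value)
{
    assert(paddr / 4 < NWORDS);
    phys_mem[paddr / 4] = value;
}

enum walk_result { WALK_OK, WALK_FAULT };

/* Two-level walk. Both table reads use physical addresses taken from
 * the tables themselves, so the walk cannot fault "by accident". A
 * missing entry is something the walker notices and reports. */
static enum walk_result walk(uint32_t root_paddr, uint32_t vaddr,
                             uint32_t *paddr_out)
{
    uint32_t pde = phys_read(root_paddr + 4 * (vaddr >> 22));
    if (!(pde & PTE_PRESENT))
        return WALK_FAULT;              /* no page table: explicit fault */

    uint32_t table_paddr = pde & ~(PAGE_SIZE - 1);
    uint32_t pte = phys_read(table_paddr + 4 * ((vaddr >> 12) & 0x3ff));
    if (!(pte & PTE_PRESENT))
        return WALK_FAULT;              /* no page: explicit fault */

    *paddr_out = (pte & ~(PAGE_SIZE - 1)) | (vaddr & (PAGE_SIZE - 1));
    return WALK_OK;
}

/* Hypothetical setup: frame 0 is the directory, frame 1 a page table,
 * and virtual page 0x00400000 maps to physical frame 3. */
static uint32_t demo_setup(void)
{
    uint32_t dir = 0 * PAGE_SIZE, table = 1 * PAGE_SIZE;
    phys_write(dir + 4 * 1, table | PTE_PRESENT);
    phys_write(table + 4 * 0, (3 * PAGE_SIZE) | PTE_PRESENT);
    return dir;
}
```

With this setup, walk(demo_setup(), 0x00400abc, &pa) succeeds, while an address whose directory or table entry is empty comes back as WALK_FAULT.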
Thus the actual lookup can never result in a page fault. If the lookup
doesn't find a present page table, that obviously _will_ fault, but then
it's the lookup logic that expressly requests the fault, not the
"action of looking it up" that faults, if you see the difference.
In contrast, the m68k seems to use virtual page tables for some things at
least, and the alpha PAL-code uses a virtual page table lookup to speed
up normal cases. Then the actual page table lookup can fault, so that you
get a "two-level" fault handler type setup - the "normal" fault handler,
and the "double-page-fault" handler that handles the cases where the page
fault handling logic itself page faulted.
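The two-level setup can be sketched roughly as follows. This is a toy model under stated assumptions, not real PAL-code or m68k behaviour: a 32-bit address space, a tiny software TLB, and a hypothetical VPT_BASE region where the PTE for virtual page n is assumed to live at VPT_BASE + 4*n. All names (fast_miss, nested_miss, translate, demo_setup) are made up for illustration. The fast handler fetches the PTE through its virtual address; if that fetch itself misses, the second handler falls back to a physical lookup in the root table.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_MASK  0xfffu
#define PRESENT    0x1u
#define VPT_BASE   (1023u << 22)   /* hypothetical virtual page table region */
#define TLB_SIZE   4

static uint32_t phys_mem[4 * 1024];            /* 4 simulated frames */
static struct { uint32_t vpn, pfn; int valid; } tlb[TLB_SIZE];
static unsigned tlb_next, nested_misses;

static void tlb_insert(uint32_t vpn, uint32_t pfn)
{
    tlb[tlb_next].vpn = vpn;
    tlb[tlb_next].pfn = pfn;
    tlb[tlb_next].valid = 1;
    tlb_next = (tlb_next + 1) % TLB_SIZE;
}

static int tlb_lookup(uint32_t vaddr, uint32_t *paddr)
{
    for (int i = 0; i < TLB_SIZE; i++)
        if (tlb[i].valid && tlb[i].vpn == vaddr >> PAGE_SHIFT) {
            *paddr = (tlb[i].pfn << PAGE_SHIFT) | (vaddr & PAGE_MASK);
            return 1;
        }
    return 0;
}

/* Second-level handler: the PTE fetch itself missed, so look up the
 * page table's frame in the root using physical addressing only. */
static int nested_miss(uint32_t root, uint32_t pte_vaddr)
{
    nested_misses++;
    uint32_t pde = phys_mem[root / 4 + ((pte_vaddr >> PAGE_SHIFT) & 0x3ff)];
    if (!(pde & PRESENT))
        return 0;                              /* no page table at all */
    tlb_insert(pte_vaddr >> PAGE_SHIFT, pde >> PAGE_SHIFT);
    return 1;
}

/* First-level handler: fetch the PTE through its *virtual* address. */
static int fast_miss(uint32_t root, uint32_t vaddr)
{
    uint32_t pte_vaddr = VPT_BASE + 4 * (vaddr >> PAGE_SHIFT);
    uint32_t pte_paddr = 0;
    if (!tlb_lookup(pte_vaddr, &pte_paddr)) {
        if (!nested_miss(root, pte_vaddr))
            return 0;
        int hit = tlb_lookup(pte_vaddr, &pte_paddr);
        assert(hit);                           /* just inserted above */
        (void)hit;
    }
    uint32_t pte = phys_mem[pte_paddr / 4];
    if (!(pte & PRESENT))
        return 0;
    tlb_insert(vaddr >> PAGE_SHIFT, pte >> PAGE_SHIFT);
    return 1;
}

static int translate(uint32_t root, uint32_t vaddr, uint32_t *paddr)
{
    if (tlb_lookup(vaddr, paddr))
        return 1;
    return fast_miss(root, vaddr) && tlb_lookup(vaddr, paddr);
}

/* Hypothetical setup: frame 0 is the root, frame 1 a page table that
 * maps virtual pages 0x400 and 0x401 to frames 2 and 3. */
static uint32_t demo_setup(void)
{
    phys_mem[1] = (1u << PAGE_SHIFT) | PRESENT;
    phys_mem[1024 + 0] = (2u << PAGE_SHIFT) | PRESENT;
    phys_mem[1024 + 1] = (3u << PAGE_SHIFT) | PRESENT;
    return 0;
}
```

In this model the first access to a page table takes the nested path once; later misses on neighbouring pages reuse the cached mapping of the table and stay on the fast path, which is the speedup for the normal case.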
(The x86 can also get a so-called "double fault", but that doesn't mean
that the hardware translation itself faulted. It means that the software
handler that was supposed to handle a fault can't be found for some reason.
For example, the page fault handler is paged out ;-)
Linus