On Thu, Jan 20, 2022 at 12:22:35PM +0000, Robin Murphy wrote:
On 2022-01-19 19:12, Russell King (Oracle) wrote:
On Wed, Jan 19, 2022 at 06:43:10PM +0000, Robin Murphy wrote:
Indeed, my impression is that the only legitimate way to get hold of a page
pointer without assumed provenance is via pfn_to_page(), which is where
pfn_valid() comes in. Thus pfn_valid(page_to_pfn()) really *should* be a
tautology.
That can only be true if pfn == page_to_pfn(pfn_to_page(pfn)) for all
values of pfn.
Given how pfn_to_page() is defined in the sparsemem case:
	#define __pfn_to_page(pfn) \
	({	unsigned long __pfn = (pfn); \
		struct mem_section *__sec = __pfn_to_section(__pfn); \
		__section_mem_map_addr(__sec) + __pfn; \
	})
#define page_to_pfn __page_to_pfn
that isn't the case, especially when looking at page_to_pfn():
	#define __page_to_pfn(pg) \
	({	const struct page *__pg = (pg); \
		int __sec = page_to_section(__pg); \
		(unsigned long)(__pg - __section_mem_map_addr(__nr_to_section(__sec))); \
	})
Where:
	static inline unsigned long page_to_section(const struct page *page)
	{
		return (page->flags >> SECTIONS_PGSHIFT) & SECTIONS_MASK;
	}
So if page_to_section() returns something that is, e.g. zero for an
invalid page in a non-zero section, you're not going to end up with
the right pfn from page_to_pfn().
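To make the failure mode above concrete, here is a minimal user-space sketch. It is not kernel code: the pool[] array, the section_base[] table and the model_* helpers are all hypothetical names invented for illustration, and flags is assumed to hold nothing but the section number. Each section's mem_map is placed at an arbitrary, non-contiguous offset, which is what makes a wrong section number in flags yield a wrong pfn.

```c
#include <assert.h>

#define PAGES_PER_SECTION 16UL
#define NR_SECTIONS       4UL

/* Model of struct page: flags carries only the section number here. */
struct page {
	unsigned long flags;
};

/* One backing pool; zero-initialised, so an untouched page claims section 0. */
static struct page pool[NR_SECTIONS * PAGES_PER_SECTION];

/* Start of each section's mem_map within pool[], deliberately not in pfn order. */
static const unsigned long section_base[NR_SECTIONS] = { 32, 0, 48, 16 };

/* pfn -> page: the section is derived from the pfn itself. */
static struct page *model_pfn_to_page(unsigned long pfn)
{
	unsigned long sec = pfn / PAGES_PER_SECTION;

	assert(sec < NR_SECTIONS);
	return &pool[section_base[sec] + pfn % PAGES_PER_SECTION];
}

/* page -> pfn: the section is taken from page->flags, as in page_to_section(). */
static unsigned long model_page_to_pfn(const struct page *pg)
{
	unsigned long sec = pg->flags;

	assert(sec < NR_SECTIONS);
	return sec * PAGES_PER_SECTION +
	       (unsigned long)(pg - &pool[section_base[sec]]);
}

/* Mark a page as properly initialised by recording its section in flags. */
static void model_init_page(unsigned long pfn)
{
	model_pfn_to_page(pfn)->flags = pfn / PAGES_PER_SECTION;
}
```

In this model, pfn 33 lives in section 2. Before model_init_page(33) is called, its flags read as section 0 and model_page_to_pfn(model_pfn_to_page(33)) evaluates to 17; after initialisation the round trip gives 33 again.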
Right, I emphasised "should" in an attempt to imply "in the absence of
serious bugs that have further-reaching consequences anyway".
As I've said a couple of times now, trying to determine whether a struct
page pointer is valid is the wrong question to be asking.
And doing so in one single place, on the justification of avoiding an
incredibly niche symptom, is even more so. Not to mention that an address
size fault is one of the best possible outcomes anyway, vs. the untold
damage that may stem from accesses actually going through to random parts of
the physical memory map.
I don't see it as a "niche" symptom.
If we start off with the struct page being invalid, then the result of
page_to_pfn() cannot be relied upon to produce something that is
meaningful - which is exactly why the vmap() issue arises.
With a pfn_valid() check, we at least know that the PFN points at
memory.
However, that memory could be _anything_ in the system - it
could be the kernel image, and it could give userspace access to
change kernel code.
So, while it is useful to do a pfn_valid() check in vmap(), as I said
to willy, this must _not_ be the primary check. It should IMHO use
WARN_ON() to make it blatantly obvious that it is something we
expect _not_ to trigger under normal circumstances, but is there to
catch programming errors elsewhere.
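As a rough illustration of that shape, here is a user-space sketch of a secondary, loudly-warning check. Everything in it is an assumption made for the example: MODEL_WARN_ON() only imitates the way the kernel's WARN_ON() reports and returns its condition, model_pfn_valid() is a stand-in range test rather than the real pfn_valid(), and model_vmap_check() is a hypothetical helper, not the vmap() code. Passing the check says only that the PFN refers to memory, not that the memory is safe to map.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Number of warnings raised so far, so callers can observe them. */
static int warn_count;

/* Report a triggered condition and hand it back, like WARN_ON() does. */
static bool model_warn(bool cond, const char *expr, const char *file, int line)
{
	if (cond) {
		warn_count++;
		fprintf(stderr, "WARNING: %s at %s:%d\n", expr, file, line);
	}
	return cond;
}

#define MODEL_WARN_ON(cond) model_warn((cond), #cond, __FILE__, __LINE__)

/* Stand-in for pfn_valid(): only PFNs below this limit count as memory. */
#define MODEL_MAX_PFN 1024UL

static bool model_pfn_valid(unsigned long pfn)
{
	return pfn < MODEL_MAX_PFN;
}

/*
 * Backstop check before mapping: not expected to fire in normal operation,
 * and it does not show that the caller is entitled to map the memory.
 * Returns 0 if every PFN passes, -1 (after warning) on the first failure.
 */
static int model_vmap_check(const unsigned long *pfns, size_t count)
{
	assert(pfns != NULL || count == 0);

	for (size_t i = 0; i < count; i++) {
		if (MODEL_WARN_ON(!model_pfn_valid(pfns[i])))
			return -1;
	}
	return 0;
}
```

The warning is the point of the design: a caller that reaches this path with a bad PFN has a bug elsewhere, and the check exists to make that bug visible rather than to act as the primary validation.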