Patchwork [084/115] x86, mm: avoid possible bogus tlb entries by clearing prev mm_cpumask after switching mm - additional question

From: Hebenstreit, Michael
Date: Thu Mar 24 2011 - 12:29:58 EST


In https://patchwork.kernel.org/patch/564801/ Suresh Siddha described a TLB-related problem: "Clearing the cpu in prev's mm_cpumask early will avoid the flush tlb IPI's while the cr3 is still pointing to the prev mm. And this window can lead to the possibility of bogus TLB fills resulting in strange failures."
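
To make the window in that quote concrete, here is a toy userspace model of the two orderings. It is only a sketch and not kernel code: the struct, the enum and all function names are made up for illustration, and it assumes a remote TLB flush for prev is sent only to CPUs still set in prev's mm_cpumask.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU switching from "prev" to "next". Hypothetical names. */
enum toy_mm { TOY_MM_PREV, TOY_MM_NEXT };

struct toy_cpu {
    enum toy_mm cr3;       /* address space the page-table base points at */
    bool in_prev_cpumask;  /* is this CPU still set in prev's mm_cpumask? */
};

/*
 * The CPU is exposed when it can still fill TLB entries from prev's page
 * tables (cr3 == prev) but would no longer receive prev's flush IPIs.
 */
static bool toy_exposed(const struct toy_cpu *c)
{
    return c->cr3 == TOY_MM_PREV && !c->in_prev_cpumask;
}

/* Ordering described as problematic: clear the mask bit, then load cr3. */
bool toy_switch_clear_first(void)
{
    struct toy_cpu c = { TOY_MM_PREV, true };
    bool window = false;

    c.in_prev_cpumask = false;
    window |= toy_exposed(&c);   /* exposed here: cr3 is still prev */
    c.cr3 = TOY_MM_NEXT;
    window |= toy_exposed(&c);
    return window;
}

/* Ordering from the patch title: load cr3 first, then clear the mask bit. */
bool toy_switch_cr3_first(void)
{
    struct toy_cpu c = { TOY_MM_PREV, true };
    bool window = false;

    c.cr3 = TOY_MM_NEXT;
    window |= toy_exposed(&c);
    c.in_prev_cpumask = false;
    window |= toy_exposed(&c);   /* never exposed: cr3 already left prev */
    return window;
}

/* Self-check: only the clear-first ordering has an exposed state. */
void toy_self_check(void)
{
    assert(toy_switch_clear_first());
    assert(!toy_switch_cr3_first());
}
```

Calling toy_self_check() from a small main() exercises both orderings. The model ignores speculation details and real IPI delivery; it only shows why the order of the two steps matters.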

I was wondering if an error we saw with Lustre described in

https://bugzilla.redhat.com/show_bug.cgi?id=678175
and
http://jira.whamcloud.com/browse/LU-93

could be attributed to the TLB problem described in 564801? The essence is that during massively parallel I/O, page->private (which Lustre uses as a pointer to private data) would sometimes be set to 2. The only place I found where this could happen OUTSIDE normal Lustre code is free_hot_page (called by page_cache_release). But the page remained in the cache, and when a later readpage() was issued, this led to a NULL pointer kernel panic.

Architecture is x86_64, RH6 kernel 2.6.32-71.18.1.el6

thanks for any help
Michael

------------------------------------------------------------------------
Michael Hebenstreit Senior Cluster Architect
Intel Corporation Software and Services Group/HTE
2800 N Center Dr, DP3-307 Tel.: +1 253 371 3144
WA 98327, DuPont
UNITED STATES E-mail: michael.hebenstreit@xxxxxxxxx