Re: e1000e NVM corruption issue status

From: Jesse Barnes
Date: Fri Sep 26 2008 - 14:44:01 EST


On Friday, September 26, 2008 11:39 am Jesse Barnes wrote:
> On Friday, September 26, 2008 11:23 am Jesse Barnes wrote:
> > On Friday, September 26, 2008 10:52 am Jesse Barnes wrote:
> > > On Friday, September 26, 2008 4:49 am Arjan van de Ven wrote:
> > > > Jiri Kosina wrote:
> > > > > On Thu, 25 Sep 2008, Brandeburg, Jesse wrote:
> > > > >> this is the current set of patches that I have to help us debug
> > > > >> and/or fix e1000e issues found during this debug effort for
> > > > >> the corrupt NVM. the "drop stats lock" - "reset swflag" patches
> > > > >> allow Thomas' patch for a mutex in the SWFLAG acquire function to
> > > > >> run without any errors.
> > > > >
> > > > > Thanks. Also Jesse Barnes' patch shouldn't be forgotten, could you
> > > > > please add it to that lineup?
> > > > >
> > > > > http://marc.info/?l=linux-kernel&m=122237193628087&w=2
> > > >
> > > > can we (for now) also stick a WARN_ON() into that failure path? that
> > > > way we can at least catch if/when this happens more visibly..... if
> > > > it happens consistently in say the new distros we can be more
> > > > confident that we're down the right path in diagnosing the issue.
> > >
> > > I'm spinning a new one now with some debug output, stay tuned (just
> > > gotta boot my test box).
> >
> > Ok here's an updated one. Jesse (Br) can you add it to your list? If
> > the X driver really is mapping too much this should catch it, as long as
> > it goes through sysfs.
>
> Arjan pointed out I may as well just use WARN() these days. Updated patch
> attached.

Even better, use WARN() correctly.
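
For anyone following the thread, the range check the attached patch adds can be sketched in plain userspace C roughly as below. This is only an illustration, not part of the patch: the helper name range_within_resource and its three parameters are hypothetical, standing in for map_offset, map_len and pci_resource_len(pdev, i). The comparison is written as a subtraction so that the offset/length sum cannot wrap, which is a deliberate difference from the patch's own comparison.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: returns true when the
 * requested window [offset, offset + len) lies entirely inside a
 * resource that is res_len bytes long.
 */
static bool range_within_resource(uint64_t offset, uint64_t len,
                                  uint64_t res_len)
{
    /* An offset past the end of the resource can never be valid. */
    if (offset > res_len)
        return false;

    /* Compare against the space left after offset; no addition, so no wrap. */
    return len <= res_len - offset;
}
```

A caller would reject the mapping (the patch returns -EINVAL) whenever this returns false, for example when a process asks for two pages of a one-page BAR.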

--
Jesse Barnes, Intel Open Source Technology Center
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 9c71858..4d1aa6e 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -16,6 +16,7 @@


 #include <linux/kernel.h>
+#include <linux/sched.h>
 #include <linux/pci.h>
 #include <linux/stat.h>
 #include <linux/topology.h>
@@ -502,6 +503,8 @@ pci_mmap_resource(struct kobject *kobj, struct bin_attribute *attr,
 	struct resource *res = (struct resource *)attr->private;
 	enum pci_mmap_state mmap_type;
 	resource_size_t start, end;
+	unsigned long map_len = vma->vm_end - vma->vm_start;
+	unsigned long map_offset = vma->vm_pgoff << PAGE_SHIFT;
 	int i;
 
 	for (i = 0; i < PCI_ROM_RESOURCE; i++)
@@ -510,6 +513,17 @@ pci_mmap_resource(struct kobject *kobj, struct bin_attribute *attr,
 	if (i >= PCI_ROM_RESOURCE)
 		return -ENODEV;
 
+	/*
+	 * Make sure the range the user is trying to map falls within
+	 * the resource
+	 */
+	if (map_offset + map_len > pci_resource_len(pdev, i)) {
+		WARN(1, "process \"%s\" tried to map 0x%08lx-0x%08lx on BAR %d (size 0x%08lx)\n",
+		     current->comm, map_offset, map_offset + map_len, i,
+		     (unsigned long)pci_resource_len(pdev, i));
+		return -EINVAL;
+	}
+
 	/* pci_mmap_page_range() expects the same kind of entry as coming
 	 * from /proc/bus/pci/ which is a "user visible" value. If this is
 	 * different from the resource itself, arch will do necessary fixup.