Re: Bricked x86 CPU with software?

From: Hector Martin 'marcan'
Date: Thu Jan 04 2018 - 20:34:55 EST

Next message: Dave Airlie: "[git pull] drm fixes for 4.15-rc7"
Previous message: Trond Myklebust: "Re: [PATCH/RFC] NFS: add nostatflush mount option."
In reply to: Tim Mouraveiko: "Re: Bricked x86 CPU with software?"
Next in thread: Tim Mouraveiko: "Re: Bricked x86 CPU with software?"

On 2018-01-05 10:21, Tim Mouraveiko wrote:
>> On Thu 2018-01-04 14:13:56, Tim Mouraveiko wrote:
>> Actually... I don't think your code works. That's why I'm curious. But
>> if it works, its rather a big news... and I'm sure Intel and cloud
>> providers are going to be interested.
>>
>
> I first discovered this issue over a year ago, quite by accident. I changed the code I was
> working on so as not to kill the CPU (as that is not what I was trying to). We made Intel aware
> of it. They didn't care much, one of their personnel suggesting that they already knew about it
> (whether this is true or not I couldn't say). It popped up again later, so I had to fix the code
> again. It could be a buggy implementation of a certain x86 functionality, but I left it at that
> because I had better things to do with my time.
>
> Now this news came up about meltdown and spectre and I was curious if anyone else had
> experienced a dead CPU by software, too. Meltdown and spectre are undeniably a problem,
> but the magnitude and practicality of it is questionable.
>
> I suspect that what I discovered is either a kill switch, an unintentional flaw that was
> implemented at the time the original feature was built into x86 functionality and kept
> propagating through successive generations of processors, or could well be that I have a
> very destructive and targeted solar flare that is after my CPUs. So, I figured I would put the
> question out there, to see if anyone else had a similar experience. Putting the solar flare idea
> aside, I can't conclusively say whether it is a flaw or a feature. Both options are supported at
> this time by my observations of the CPU behavior.
>

If you made Intel aware of the issue a year ago, and they weren't
interested, then the responsible thing to do is disclose the problem
publicly. This is a security issue (if trusted code can brick a CPU,
it's an issue for bare metal hosting providers; if untrusted code can
brick a CPU, it's a *huge* issue for every cloud provider and many, many
others who run code in various sandboxes). If the vendor is not
receptive to coordinated disclosure, the only option is public
disclosure to at least make people aware of the problem and allow for
mitigations to be developed, if possible.

Personally, I would be very interested in seeing such code. We've seen
several ways to brick nonvolatile firmware (writable BIOSes, bad CMOS
data, etc.), but bricking a CPU is a first. The only way that can happen
is by either blowing a kill fuse or causing actual hardware damage, since
CPUs have no nonvolatile memory other than fuses. Either way, this would
be a very interesting result.

--
Hector Martin "marcan" (marcan@xxxxxxxxx)
Public Key: https://mrcn.st/pub
