Re: [PATCH] kasan: improve double-free detection

From: Dmitry Vyukov
Date: Mon May 02 2016 - 07:36:05 EST

On Mon, May 2, 2016 at 1:30 PM, Luruo, Kuthonuzo
<kuthonuzo.luruo@xxxxxxx> wrote:
> Hi Dmitry,
> Thanks very much for your response/review.
>> I agree that it's something we need to fix (user-space ASAN does
>> something along these lines). My only concern is increase of
>> kasan_alloc_meta size. It's unnecessary large already and we have a
>> plan to reduce both alloc and free into to 16 bytes. However, it can
>> make sense to land this and then reduce size of headers in a separate
>> patch using a CAS-loop on state.
> ok.
>> Alexander, what's the state of your
>> patches that reduce header size?
>> I think there is another race. If we have racing frees, one thread
>> sets state to KASAN_STATE_QUARANTINE but does not fill
>> free_info->track yet, at this point another thread does free and
>> reports double-free, but the track is wrong so we will print a bogus
>> stack. The same is true for all other state transitions (e.g.
>> use-after-free racing with the object being pushed out of the
>> quarantine).
> Yes, I've noticed this too. For the double-free, in my local tests, the error
> reports from the bad frees get printed only after the successful "good"
> free set the new track info. But then, I was using kasan_report() on the
> error path to get sane reports from the multiple concurrent frees so that
> may have "slowed" down the error paths enough until the good free was
> done. Agreed, the window still exists for a stale (or missing) deallocation
> stack in error report.
>> We could introduce 2 helper functions like:
>> /* Sets status to KASAN_STATE_LOCKED if the current status is equal to
>> old_state, returns current state. Waits while current state equals
>> u32 kasan_lock_if_state_equals(struct kasan_alloc_meta *meta, u32 old_state);
>> /* Changes state from KASAN_STATE_LOCKED to new_state */
>> void kasan_unlock_and_set_status(struct kasan_alloc_meta *meta, u32
>> new_state);
>> Then free can be expressed as:
>> if (kasan_lock_if_state_equals(meta, KASAN_STATE_ALLOC) ==
>> free_info = get_free_info(cache, object);
>> quarantine_put(free_info, cache);
>> set_track(&free_info->track, GFP_NOWAIT);
>> kasan_poison_slab_free(cache, object);
>> kasan_unlock_and_set_status(meta, KASAN_STATE_QUARANTINE);
>> return true;
>> }
>> And on the reporting path we would need to lock the header, read all
>> state, unlock the header.
>> Does it make sense?
> It does, thanks! I'll try to hack up something, though right now, it's not entirely
> clear to me how to achieve this without resorting to a global lock (which would
> be unacceptable).

We can use per-header lock by setting status to KASAN_STATE_LOCKED. A
thread can CAS any status to KASAN_STATE_LOCKED which means that it
locked the header. If any thread tried to modify/read the status and
the status is KASAN_STATE_LOCKED, then the thread waits.

>> Please add the test as well. We need to start collecting tests for all
>> these tricky corner cases.
> Sure, a new test can be added for test_kasan.ko. Unlike the other tests, a
> double-free would likely panic the system due to slab corruption. Would it still
> be "KASANic" for kasan_slab_free() to return true after reporting double-free
> attempt error so thread will not call into __cache_free()? How does ASAN
> handle this?

Yes, sure, it is OK to return true from kasan_slab_free() in such case.
Use-space ASAN terminates the process after the first report. We've
decided that in kernel we better continue in best-effort manner. But
after the first report all bets are mostly off (leaking an object is
definitely OK).