Re: [PATCH 7/8] zswap: add to mm/

From: Seth Jennings
Date: Mon Jan 07 2013 - 09:47:49 EST

On 01/04/2013 04:45 PM, Dan Magenheimer wrote:
>> From: Seth Jennings [mailto:sjenning@xxxxxxxxxxxxxxxxxx]
>> Subject: Re: [PATCH 7/8] zswap: add to mm/
>> On 01/03/2013 04:33 PM, Dan Magenheimer wrote:
>>>> From: Seth Jennings [mailto:sjenning@xxxxxxxxxxxxxxxxxx]
>>>> However, once the flushing code was introduced and could free an entry
>>>> from the zswap_fs_store() path, it became necessary to add a per-entry
>>>> refcount to make sure that the entry isn't freed while another code
>>>> path was operating on it.
>>> Hmmm... doesn't the refcount at least need to be an atomic_t?
>> An entry's refcount is only ever changed under the tree lock, so
>> making them atomic_t would be redundantly atomic.
> Maybe I'm missing something still but then I think you also
> need to evaluate and act on the refcount (not just read it) while
> your treelock is held. I.e., in:
>> + /* page is already in the swap cache, ignore for now */
>> + spin_lock(&tree->lock);
>> + refcount = zswap_entry_put(entry);
>> + spin_unlock(&tree->lock);
>> +
>> + if (likely(refcount))
>> + return 0;
>> +
>> + /* if the refcount is zero, invalidate must have come in */
>> + /* free */
>> + zs_free(tree->pool, entry->handle);
>> + zswap_entry_cache_free(entry);
>> + atomic_dec(&zswap_stored_pages);
> the entry's refcount may be changed by another processor
> immediately after the unlock, and then the "if (refcount)"
> is testing a stale value and you will get (I think) a memory leak.

It is true that the refcount could be stale by the time we do the
check. However, every function that does a zswap_entry_put(), which
potentially drops the refcount to 0, checks the returned refcount and
frees the entry if it needs to. Every function whose
zswap_entry_put() drops the refcount to 0 also ensures, before
releasing the lock, that no other thread can gain a reference to the
entry through either the tree or the LRU list. That way the cleanup
can happen outside the lock without the risk of someone gaining
access to the entry while it is being freed.
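
To make the rule concrete, here is a minimal userspace sketch of the
pattern, not the zswap code itself. It assumes a one-slot "tree" and a
pthread mutex standing in for the tree spinlock; struct tree, struct
entry, tree_get(), tree_put(), tree_invalidate() and the freed counter
are all hypothetical names made up for illustration. The point it tries
to show is that the refcount is a plain int because it is only touched
under the lock, and that an entry is always unlinked under the lock
before its count can reach 0, so the free can safely happen after the
unlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the types from the patch. */
struct entry {
	int refcount;		/* plain int: only changed under tree->lock */
};

struct tree {
	pthread_mutex_t lock;
	struct entry *slot;	/* one-slot "tree", enough for illustration */
	atomic_int freed;	/* counts frees, for demonstration only */
};

static void tree_init(struct tree *t)
{
	pthread_mutex_init(&t->lock, NULL);
	t->slot = NULL;
	atomic_init(&t->freed, 0);
}

/* Caller must hold tree->lock. Returns the new refcount. */
static int entry_put(struct entry *e)
{
	assert(e->refcount > 0);
	return --e->refcount;
}

/* Runs outside the lock: nobody else can reach the entry any more. */
static void entry_free(struct tree *t, struct entry *e)
{
	free(e);
	atomic_fetch_add(&t->freed, 1);
}

/* Store a new entry; the tree itself holds the initial reference. */
static int tree_insert(struct tree *t)
{
	struct entry *e = malloc(sizeof(*e));

	if (!e)
		return -1;
	e->refcount = 1;
	pthread_mutex_lock(&t->lock);
	t->slot = e;
	pthread_mutex_unlock(&t->lock);
	return 0;
}

/* Look up the entry and take a reference; NULL once it is unlinked. */
static struct entry *tree_get(struct tree *t)
{
	struct entry *e;

	pthread_mutex_lock(&t->lock);
	e = t->slot;
	if (e)
		e->refcount++;
	pthread_mutex_unlock(&t->lock);
	return e;
}

/* Drop a reference taken with tree_get(). */
static void tree_put(struct tree *t, struct entry *e)
{
	int refcount;

	pthread_mutex_lock(&t->lock);
	refcount = entry_put(e);
	pthread_mutex_unlock(&t->lock);

	/* 0 here means invalidate already unlinked the entry under the lock */
	if (refcount == 0)
		entry_free(t, e);
}

/* Unlink the entry under the lock, then drop the tree's reference. */
static void tree_invalidate(struct tree *t)
{
	struct entry *e;
	int refcount;

	pthread_mutex_lock(&t->lock);
	e = t->slot;
	if (!e) {
		pthread_mutex_unlock(&t->lock);
		return;
	}
	t->slot = NULL;		/* no new references possible after this */
	refcount = entry_put(e);
	pthread_mutex_unlock(&t->lock);

	if (refcount == 0)
		entry_free(t, e);
}
```

In this sketch the value returned by entry_put() may well be stale by
the time it is tested, but a stale 0 is harmless: the count can only
reach 0 after the entry has left the tree, so it cannot go back up.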

> A nit: Even I, steeped in tmem terminology, was confused by
> your use of "fs"... to nearly all readers it will
> be translated as "filesystem" which is mystifying.
> Just spell it out "frontswap", even if it causes a few
> lines to be wrapped.

Sounds good. I'll queue it up.

