Re: [PATCH] dm cache: fix race affecting dirty block count

From: Pranith Kumar
Date: Sun Aug 03 2014 - 00:46:35 EST


On 08/02/2014 10:10 PM, Pranith Kumar wrote:
> Corrently adding Anssi this time.
>
> On Sat, Aug 2, 2014 at 10:00 PM, Pranith Kumar <bobby.prani@xxxxxxxxx> wrote:
>> Hello Anssi, Joe, Mike,
>>
>> I just found your patch in the latest rc and wanted to ask a few
>> questions. I am not sure how this patch helps at all other than luck in
>> that dm_cblock_t type is of type int32_t, which should guarantee that it
>> is atomic on most platforms. Which begs the question, what platform did
>> you encounter this problem?
>>
>> The patch purports to solve a race condition by making use of atomic_t.
>> I am not sure that is enough. If indeed there is a race you need to use
>> smp_mb__{before/after}_atomic() for both your uses of atomic_inc() and
>> atomic_set().
>>
>> Also I have a concern about why this mail was not CC'ed on LKML. I had
>> to go to some difficulty in finding this patch. So please CC LKML for
>> such patches.
>>
>> Thanks,
>> --
>> Pranith
>>
>> -- Begin forwarded Message --
>>
>>
>> nr_dirty is updated without locking, causing it to drift so that it is
>> non-zero (either a small positive integer, or a very large one when an
>> underflow occurs) even when there are no actual dirty blocks.
>>
>> Fix that by using an atomic type for nr_dirty.
>>
>> Signed-off-by: Anssi Hannula <anssi hannula iki fi>
>> Cc: Joe Thornber <ejt redhat com>
>> Cc: stable vger kernel org

Well, someone got their pointers mixed up. The following should help, but this is not enough if there are indeed concurrent accesses. Are you sure there are concurrent accesses?

diff --git a/drivers/md/dm-cache-target.c b/drivers/md/dm-cache-target.c
index 2c63326..d49ce45 100644
--- a/drivers/md/dm-cache-target.c
+++ b/drivers/md/dm-cache-target.c
@@ -1765,11 +1765,11 @@ static void destroy(struct cache *cache)
if (cache->wq)
destroy_workqueue(cache->wq);

- if (cache->dirty_bitset)
- free_bitset(cache->dirty_bitset);
+ if (cache.dirty_bitset)
+ free_bitset(cache.dirty_bitset);

- if (cache->discard_bitset)
- free_bitset(cache->discard_bitset);
+ if (cache.discard_bitset)
+ free_bitset(cache.discard_bitset);

if (cache->copier)
dm_kcopyd_client_destroy(cache->copier);
@@ -2269,15 +2269,15 @@ static int cache_create(struct cache_args *ca, struct cache **result)

r = -ENOMEM;
atomic_set(&cache->nr_dirty, 0);
- cache->dirty_bitset = alloc_bitset(from_cblock(cache->cache_size));
- if (!cache->dirty_bitset) {
+ cache.dirty_bitset = alloc_bitset(from_cblock(cache->cache_size));
+ if (!cache.dirty_bitset) {
*error = "could not allocate dirty bitset";
goto bad;
}
clear_bitset(cache->dirty_bitset, from_cblock(cache->cache_size));

cache->discard_nr_blocks = cache->origin_blocks;
- cache->discard_bitset = alloc_bitset(from_oblock(cache->discard_nr_blocks));
+ cache.discard_bitset = alloc_bitset(from_oblock(cache->discard_nr_blocks));
if (!cache->discard_bitset) {
*error = "could not allocate discard bitset";
goto bad;

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/