Re: [PATCH 0/8] ARM: mvebu: Add support for RAID6 PQ offloading

From: Maxime Ripard
Date: Tue Jun 02 2015 - 10:45:36 EST


On Tue, May 26, 2015 at 09:31:03AM -0700, Dan Williams wrote:
> > If you mean, "give me a hand, you can start there", then yeah, I can
> > do that.
> >
> >> I'm not happy about not having had the time to do this rework myself.
> >> Linux is better off with this api deprecated.
> >
> > You're not talking about deprecating it, you're talking about removing
> > it entirely.
>
> True, and adding more users makes that removal more difficult. I'm
> willing to help out on the design and review for this work, I just
> can't commit to doing the implementation and testing.
>
> I think it looks something like this:
>
> At init time the raid456 driver probes for offload resources. It can
> discover several scenarios:
>
> 1/ "the ioatdma case": raid channels that have all the necessary
> operations (copy, xor/pq, xor/pq check). In this case we'll never
> need to perform a channel switch. Potentially the cpu never touches
> the stripe cache in this case and we can maintain a static dma mapping
> for the entire lifespan of a struct stripe_head.
>
> 2/ "the channel switch case": All the necessary offload resources are
> available but span multiple devices. In this case we need to wait for
> channel1 to complete an operation before channel2 can start. This
> case is complicated by the fact that different channels may need their
> own dma mappings. In the simplest case channels can share the same
> mapping and raid456 needs to wait for channel completions. I think we
> can do a better job than the async_tx api here as raid456 should
> probably poll for completions after each stripe processing batch.
> Taking an interrupt per channel-switch event seems like excessive
> overhead.
>
> 3/ "the co-op case": We have a xor/pq offload resource, but copy and
> check operations require the cpu to touch the stripe cache. In this
> case we need to use the dma_sync_*_for_cpu()/dma_sync_*_for_device()
> to pass buffers back and forth between device and cpu ownership. This
> shares some of the complexity of waiting for completions with scenario
> 2.
>
> Which scenario does your implementation fall into? Maybe we can focus
> on that one and leave the other scenarios for other dmaengine
> maintainers to jump in and implement?

From my limited understanding of RAID and PQ computations, it would be
scenario 3 with a twist.

Our hardware controller supports xor and PQ, but the checks and data
recovery are not supported (we're not able to offload async_mult and
async_sum_product).
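
As a rough illustration of the three scenarios quoted above, here is a
userspace sketch of how they could be told apart from per-channel
capability masks. This is not kernel code: every name in it (the CAP_*
bits, the scenario enum, classify_offload()) is hypothetical and comes
from neither raid456 nor dmaengine. It only encodes the decision order
described in the quote.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capability bits, one per offloadable operation.
 * These are NOT the kernel's dma_transaction_type values. */
enum offload_cap {
	CAP_COPY    = 1u << 0,	/* memcpy */
	CAP_XOR     = 1u << 1,	/* xor generation */
	CAP_PQ      = 1u << 2,	/* P+Q generation */
	CAP_XOR_VAL = 1u << 3,	/* xor check */
	CAP_PQ_VAL  = 1u << 4,	/* P+Q check */
};

#define CAP_ALL (CAP_COPY | CAP_XOR | CAP_PQ | CAP_XOR_VAL | CAP_PQ_VAL)

/* Hypothetical names for the scenarios listed in the quoted mail. */
enum offload_scenario {
	SCENARIO_NONE,			/* nothing usable, cpu does everything */
	SCENARIO_SINGLE_CHANNEL,	/* 1/ "the ioatdma case" */
	SCENARIO_CHANNEL_SWITCH,	/* 2/ "the channel switch case" */
	SCENARIO_COOP,			/* 3/ "the co-op case" */
};

/* Classify a set of channels, each described by a CAP_* bitmask. */
enum offload_scenario classify_offload(const unsigned int *chan_caps,
				       size_t nchan)
{
	unsigned int combined = 0;
	size_t i;

	assert(chan_caps != NULL || nchan == 0);

	for (i = 0; i < nchan; i++) {
		/* One channel does everything: no channel switch needed. */
		if ((chan_caps[i] & CAP_ALL) == CAP_ALL)
			return SCENARIO_SINGLE_CHANNEL;
		combined |= chan_caps[i];
	}

	/* Everything is available, but spread across several channels. */
	if ((combined & CAP_ALL) == CAP_ALL)
		return SCENARIO_CHANNEL_SWITCH;

	/* Some xor/pq offload exists; the cpu handles the rest. */
	if (combined & (CAP_XOR | CAP_PQ))
		return SCENARIO_COOP;

	return SCENARIO_NONE;
}
```

With a single channel whose mask is CAP_XOR | CAP_PQ, which is roughly
what is described above for our controller, this returns SCENARIO_COOP.
The "twist" (no offload for the recovery paths) is not modelled here,
since the sketch has no capability bit for those operations.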

Maxime

--
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com
