RE: [PATCH V3 1/2] NTB: Add AMD PCI-Express NTB driver

From: Yu, Xiangliang
Date: Fri Jan 15 2016 - 02:38:08 EST

Next message: Junil Lee: "[PATCH v2] zsmalloc: fix migrate_zspage-zs_free race condition"
Previous message: Christian Borntraeger: "Re: regression 4.4: deadlock in with cgroup percpu_rwsem"
In reply to: Allen Hubbe: "RE: [PATCH V3 1/2] NTB: Add AMD PCI-Express NTB driver"
Next in thread: Allen Hubbe: "RE: [PATCH V3 1/2] NTB: Add AMD PCI-Express NTB driver"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

> > AMD NTB support following main features:
> > (1) Three memory windows;
> > (2) Sixteen 32-bit scratch pad;
> > (3) Two 16-bit doorbell interrupt;
>
> This is interesting. Let's submit your doorbell implementation as is, but I
> wonder if "multiple doorbell registers" is something we should expose in the
> ntb.h api. I wonder how we should reconcile that with "vectors", unless they
> are equivalent. If we see a device with more than 64 doorbell bits total, the
> ntb.h api will be inadequate.

Sorry, let me clarify it: one 16-bit doorbell is for primary side and the other is be used
By secondary side. The #3 should be is that two 16-bit doorbell registers.

> > (4) Five event interrupts;
> > (5) One system can wake up opposite system of NTB;
>
> You could say: power management support will be added in a following patch.

In here, i list all features AMD NTB support, not patch. Please see my word again.

> > (6) Flush previous request to the opposite system;
>
> ...in a following patch.

Ditto.

> > (7) There are reset and PME_TO mechanisms between two systems;
>
> How is this different from (5 or 6)? What is PME_TO?
I don't know what you mean for the different?
#5, #6 and #7 is hardware design:
#5: one side can wake up the opposite if the opposite is sleep;
#6 : one side flush the pending request to opposite side;
#7: if one side has reset or power off event, the opposite side can get the information
From the reset and PME_TO mechanism and execute some handler to avoid data lost.

PME_TO is PCIe message of power turn off.

> >
> > Signed-off-by: Xiangliang Yu <Xiangliang.Yu@xxxxxxx>
>
>
> Overall, this driver is looking much better. Thanks for your efforts!
>
> With the -rc1 window likely closing this weekend, let's see what others have
> to say about this patch v3, or getting a v4 this week.
> There are a few minor requests below. They are nothing critical.
>
> > +static void amd_ack_SMU(struct amd_ntb_dev *ndev, u32 bit) {
> > + void __iomem *mmio = ndev->self_mmio;
> > + int reg;
> > +
> > + reg = readl(mmio + AMD_SMUACK_OFFSET);
> > + reg |= bit;
> > + writel(reg, mmio + AMD_SMUACK_OFFSET);
> > +
> > + ndev->peer_sta |= bit;
> > +}
>
> What is SMU?

System management unit, please reference AMD manual.

> Does this have to do with power management, or flush? Can you say why it
> should be in this patch, and not one of the other ones?

This is AMD NTB hardware rules, I just following hardware design. One side can
convey the information of event changes to the other side by SMU ACK register.

>
> Speculation: Is it something like, if the peer driver has the flush patch, and
> this side is not patched, this side driver still needs to respond (ack) for the
> peer flush to complete? In other words, the flush is not fully implemented in
> hw? I'm just guessing, so I would like to hear your explanation.
>

I don't know what to say for your speculation, why not is software bug?
I donât think any hardware can work fine in this case.

> > +static void amd_handle_event(struct amd_ntb_dev *ndev, int vec) {
> > + void __iomem *mmio = ndev->self_mmio;
> > + u32 status;
> > +
> > + status = readl(mmio + AMD_INTSTAT_OFFSET);
> > + if (!(status & AMD_EVENT_INTMASK))
> > + return;
> > +
> > + dev_dbg(ndev_dev(ndev), "status = 0x%x and vec = %d\n", status,
> > vec);
> > +
> > + status &= AMD_EVENT_INTMASK;
> > + switch (status) {
> > + case AMD_PEER_FLUSH_EVENT:
> > + dev_info(ndev_dev(ndev), "Flush is done.\n");
> > + break;
> > + case AMD_PEER_RESET_EVENT:
> > + amd_ack_SMU(ndev, AMD_PEER_RESET_EVENT);
> > +
> > + /* link down first */
> > + ntb_link_event(&ndev->ntb);
> > + /* polling peer status */
> > + schedule_delayed_work(&ndev->hb_timer,
> AMD_LINK_HB_TIMEOUT);
> > +
> > + break;
> > + case AMD_PEER_D3_EVENT:
> > + case AMD_PEER_PMETO_EVENT:
> > + amd_ack_SMU(ndev, status);
>
> Here is ack SMU called.

Yes, follow hardware design.

> > +
> > + /* link down */
> > + ntb_link_event(&ndev->ntb);
> > +
> > + break;
> > + case AMD_PEER_D0_EVENT:
> > + mmio = ndev->peer_mmio;
> > + status = readl(mmio + AMD_PMESTAT_OFFSET);
> > + /* check if this is WAKEUP event */
> > + if (status & 0x1)
> > + dev_info(ndev_dev(ndev), "Wakeup is done.\n");
> > +
> > + amd_ack_SMU(ndev, AMD_PEER_D0_EVENT);
> > +
> > + if (amd_link_is_up(ndev))
> > + ntb_link_event(&ndev->ntb);
> > + else
> > + schedule_delayed_work(&ndev->hb_timer,
> > + AMD_LINK_HB_TIMEOUT);
> > + break;
> > + default:
> > + dev_info(ndev_dev(ndev), "event status = 0x%x.\n", status);
> > + break;
> > + }
> > +}
>
> > +enum {
>
> I would prefer to see #define, but this works... There are good arguments to
> be made for compiler constants vs the preprocessor. My preference just
> comes from what I've observed about how other drivers are written, and
> #define for this stuff seems to be the convention.

I have lots of SATA/Block layer experience, this following AHCI coding style.
Compiler will check enum variable, not for macro.
Actually, there are lots of similar definition in kernel, such as block layer code.

> > + /* AMD NTB Capability */
> > + AMD_MW_CNT = 3,
> > + AMD_DB_CNT = 16,
> > + AMD_MSIX_VECTOR_CNT = 24,
> > + AMD_SPADS_CNT = 16,
> > +
> > + /* AMD NTB register offset */
> > + AMD_CNTL_OFFSET = 0x200,
> > +
> > + /* NTB control register bits */
> > + PMM_REG_CTL = BIT(21),
> > + SMM_REG_CTL = BIT(20),
> > + SMM_REG_ACC_PATH = BIT(18),
> > + PMM_REG_ACC_PATH = BIT(17),
> > + NTB_CLK_EN = BIT(16),
> > +
> > + AMD_STA_OFFSET = 0x204,
> > + AMD_PGSLV_OFFSET = 0x208,
> > + AMD_SPAD_MUX_OFFSET = 0x20C,
> > + AMD_SPAD_OFFSET = 0x210,
> > + AMD_RSMU_HCID = 0x250,
> > + AMD_RSMU_SIID = 0x254,
> > + AMD_PSION_OFFSET = 0x300,
> > + AMD_SSION_OFFSET = 0x330,
> > + AMD_MMINDEX_OFFSET = 0x400,
> > + AMD_MMDATA_OFFSET = 0x404,
> > + AMD_SIDEINFO_OFFSET = 0x408,
> > +
> > + AMD_SIDE_MASK = BIT(0),
> > +
> > + /* limit register */
> > + AMD_ROMBARLMT_OFFSET = 0x410,
> > + AMD_BAR1LMT_OFFSET = 0x414,
> > + AMD_BAR23LMT_OFFSET = 0x418,
> > + AMD_BAR45LMT_OFFSET = 0x420,
> > + /* xlat address */
> > + AMD_POMBARXLAT_OFFSET = 0x428,
> > + AMD_BAR1XLAT_OFFSET = 0x430,
> > + AMD_BAR23XLAT_OFFSET = 0x438,
> > + AMD_BAR45XLAT_OFFSET = 0x440,
> > + /* doorbell and interrupt */
> > + AMD_DBFM_OFFSET = 0x450,
> > + AMD_DBREQ_OFFSET = 0x454,
> > + AMD_MIRRDBSTAT_OFFSET = 0x458,
> > + AMD_DBMASK_OFFSET = 0x45C,
> > + AMD_DBSTAT_OFFSET = 0x460,
> > + AMD_INTMASK_OFFSET = 0x470,
> > + AMD_INTSTAT_OFFSET = 0x474,
> > +
> > + /* event type */
> > + AMD_PEER_FLUSH_EVENT = BIT(0),
> > + AMD_PEER_RESET_EVENT = BIT(1),
> > + AMD_PEER_D3_EVENT = BIT(2),
> > + AMD_PEER_PMETO_EVENT = BIT(3),
> > + AMD_PEER_D0_EVENT = BIT(4),
> > + AMD_EVENT_INTMASK = (AMD_PEER_FLUSH_EVENT |
> > + AMD_PEER_RESET_EVENT |
> AMD_PEER_D3_EVENT |
> > + AMD_PEER_PMETO_EVENT |
> AMD_PEER_D0_EVENT),
> > +
> > + AMD_PMESTAT_OFFSET = 0x480,
> > + AMD_PMSGTRIG_OFFSET = 0x490,
> > + AMD_LTRLATENCY_OFFSET = 0x494,
> > + AMD_FLUSHTRIG_OFFSET = 0x498,
> > +
> > + /* SMU register*/
> > + AMD_SMUACK_OFFSET = 0x4A0,
> > + AMD_SINRST_OFFSET = 0x4A4,
> > + AMD_RSPNUM_OFFSET = 0x4A8,
> > + AMD_SMU_SPADMUTEX = 0x4B0,
> > + AMD_SMU_SPADOFFSET = 0x4B4,
> > +
> > + AMD_PEER_OFFSET = 0x400,
> > +};
>
> > +struct amd_ntb_dev {
> ...
> > +};
> > +
> > +#define ndev_pdev(ndev) ((ndev)->ntb.pdev) #define ndev_name(ndev)
> > +pci_name(ndev_pdev(ndev)) #define ndev_dev(ndev)
> > +(&ndev_pdev(ndev)->dev) #define ntb_ndev(__ntb)
> container_of(__ntb,
> > +struct amd_ntb_dev, ntb) #define hb_ndev(__work)
> container_of(__work,
> > +struct amd_ntb_dev,
> > hb_timer.work)
>
> Jon: I think these macros are good, and I like that they are here next to the
> struct amd_ntb_dev.
>
> > +static int amd_ntb_mw_count(struct ntb_dev *ntb); static int
> > +amd_ntb_mw_get_range(struct ntb_dev *ntb, int idx,
> > + phys_addr_t *base, resource_size_t *size,
> > + resource_size_t *align, resource_size_t *algin_size);
> static int
> > +amd_ntb_mw_set_trans(struct ntb_dev *ndev, int idx,
> > + dma_addr_t addr, resource_size_t size);
> static int
> > +amd_ntb_link_is_up(struct ntb_dev *ntb,
> > + enum ntb_speed *speed,
> > + enum ntb_width *width);
> > +static int amd_ntb_link_enable(struct ntb_dev *ntb,
> > + enum ntb_speed speed,
> > + enum ntb_width width);
> > +static int amd_ntb_link_disable(struct ntb_dev *ntb); static u64
> > +amd_ntb_db_valid_mask(struct ntb_dev *ntb); static int
> > +amd_ntb_db_vector_count(struct ntb_dev *ntb); static u64
> > +amd_ntb_db_vector_mask(struct ntb_dev *ntb, int db_vector); static
> > +u64 amd_ntb_db_read(struct ntb_dev *ntb); static int
> > +amd_ntb_db_clear(struct ntb_dev *ntb, u64 db_bits); static int
> > +amd_ntb_db_set_mask(struct ntb_dev *ntb, u64 db_bits); static int
> > +amd_ntb_db_clear_mask(struct ntb_dev *ntb, u64 db_bits); static int
> > +amd_ntb_peer_db_addr(struct ntb_dev *ntb,
> > + phys_addr_t *db_addr,
> > + resource_size_t *db_size);
> > +static int amd_ntb_peer_db_set(struct ntb_dev *ntb, u64 db_bits);
> > +static int amd_ntb_spad_count(struct ntb_dev *ntb); static u32
> > +amd_ntb_spad_read(struct ntb_dev *ntb, int idx); static int
> > +amd_ntb_spad_write(struct ntb_dev *ntb, int idx, u32 val); static int
> > +amd_ntb_peer_spad_addr(struct ntb_dev *ntb, int idx,
> > + phys_addr_t *spad_addr);
> > +static u32 amd_ntb_peer_spad_read(struct ntb_dev *ntb, int idx);
> > +static int amd_ntb_peer_spad_write(struct ntb_dev *ntb, int idx, u32
> > val);
>
> It's unusual, sometimes dangerous, to use the static keyword in a .h file
> (unless static inline, or things that you really intend to be static in multiple
> compilation unit). This .h is only included by a single .c file (effectively, this
> file is part of ntb_hw_amd.c), and these are only declarations, not function
> definitions, but it makes me nervous. Do you /need/ these functions
> forward declared? Could you rearrange your .c file so you don't need these
> forward declared? If you need them, could you put the declarations in the .c
> file?

Actually, I put it in .c file in first version, I think it make code clear if moving it to
.h file. Yes, I can remove all these definitions by rearranging .c file.

And please speak out all your concern in this time, you know, I had spent lots of time
On these patches.

Next message: Junil Lee: "[PATCH v2] zsmalloc: fix migrate_zspage-zs_free race condition"
Previous message: Christian Borntraeger: "Re: regression 4.4: deadlock in with cgroup percpu_rwsem"
In reply to: Allen Hubbe: "RE: [PATCH V3 1/2] NTB: Add AMD PCI-Express NTB driver"
Next in thread: Allen Hubbe: "RE: [PATCH V3 1/2] NTB: Add AMD PCI-Express NTB driver"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]