Re: [PATCH v10 2/3] tty: hvc: pass DMA capable memory to put_chars()

From: Xianting Tian
Date: Thu Oct 14 2021 - 04:56:12 EST



在 2021/10/14 下午4:41, Greg KH 写道:
On Thu, Oct 14, 2021 at 04:34:59PM +0800, Xianting Tian wrote:
在 2021/10/10 下午1:33, Greg KH 写道:
On Sat, Oct 09, 2021 at 11:45:23PM +0800, Xianting Tian wrote:
在 2021/10/9 下午7:58, Greg KH 写道:
Did you look at the placement using pahole as to how this structure now
looks?
thanks for all your commnts. for this one, do you mean I need to remove the
blank line?  thanks

No, I mean to use the tool 'pahole' to see the structure layout that you
just created and determine if it really is the best way to add these new
fields, especially as you are adding huge buffers with odd alignment.
thanks,

Based on your comments, I removed 'char outchar',  remian the position of
'int outbuf_size' unchanged to keep outbuf_size and lock in the same cache
line.  Now hvc_struct change as below,

 struct hvc_struct {
        struct tty_port port;
        spinlock_t lock;
        int index;
        int do_wakeup;
-       char *outbuf;
        int outbuf_size;
        int n_outbuf;
        uint32_t vtermno;
@@ -48,6 +57,16 @@ struct hvc_struct {
        struct work_struct tty_resize;
        struct list_head next;
        unsigned long flags;
+
+       /*
+        * the buf is used in hvc console api for putting chars,
+        * and also used in hvc_poll_put_char() for putting single char.
+        */
+       char cons_outbuf[N_OUTBUF] __ALIGNED__;
+       spinlock_t cons_outbuf_lock;
+
+       /* the buf is used for putting chars to tty */
+       char outbuf[] __ALIGNED__;
 };

pahole for above hvc_struct as below,  is it ok for you?  do we need to pack
the hole? thanks

struct hvc_struct {
    struct tty_port            port;                 /*     0 352 */
    /* --- cacheline 5 boundary (320 bytes) was 32 bytes ago --- */
    spinlock_t                 lock;                 /*   352 4 */
    int                        index;                /*   356 4 */
    int                        do_wakeup;            /*   360 4 */
    int                        outbuf_size;          /*   364 4 */
    int                        n_outbuf;             /*   368 4 */
    uint32_t                   vtermno;              /*   372 4 */
    const struct hv_ops  *     ops;                  /*   376 8 */
    /* --- cacheline 6 boundary (384 bytes) --- */
    int                        irq_requested;        /*   384 4 */
    int                        data;                 /*   388 4 */
    struct winsize             ws;                   /*   392 8 */
    struct work_struct         tty_resize;           /*   400 32 */
    struct list_head           next;                 /*   432 16 */
    /* --- cacheline 7 boundary (448 bytes) --- */
    long unsigned int          flags;                /*   448 8 */

    /* XXX 56 bytes hole, try to pack */

    /* --- cacheline 8 boundary (512 bytes) --- */
    char                       cons_outbuf[16];      /*   512 16 */
    spinlock_t                 cons_outbuf_lock;     /*   528 4 */

    /* XXX 44 bytes hole, try to pack */
Why not move the spinlock up above the cons_outbuf? Will that not be a
bit better?
thanks, I will move it avove cons_outbuf, and send v11 patches soon.

Anyway, this is all fine, and much better than before, thanks.

greg k-h