Re: [PATCH 2/3] mm: cma: introduce /proc/cmainfo

From: Gioh Kim
Date: Tue Dec 30 2014 - 20:14:59 EST

2014-12-29 ìí 11:09ì Stefan Strogin ì(ê) ì ê:
Thanks for review MichaÅ,

On 12/26/2014 07:02 PM, Michal Nazarewicz wrote:
On Fri, Dec 26 2014, "Stefan I. Strogin" <s.strogin@xxxxxxxxxxxxxxxxxxx> wrote:
/proc/cmainfo contains a list of currently allocated CMA buffers for every
CMA area when CONFIG_CMA_DEBUG is enabled.

Format is:

<base_phys_addr> - <end_phys_addr> (<size> kB), allocated by <PID>\
(<command name>), latency <allocation latency> us
<stack backtrace when the buffer had been allocated>

Signed-off-by: Stefan I. Strogin <s.strogin@xxxxxxxxxxxxxxxxxxx>
mm/cma.c | 202 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 202 insertions(+)

diff --git a/mm/cma.c b/mm/cma.c
index a85ae28..ffaea26 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -347,6 +372,86 @@ err:
return ret;

+ * cma_buffer_list_add() - add a new entry to a list of allocated buffers
+ * @cma: Contiguous memory region for which the allocation is performed.
+ * @pfn: Base PFN of the allocated buffer.
+ * @count: Number of allocated pages.
+ * @latency: Nanoseconds spent to allocate the buffer.
+ *
+ * This function adds a new entry to the list of allocated contiguous memory
+ * buffers in a CMA area. It uses the CMA area specificated by the device
+ * if available or the default global one otherwise.
+ */
+static int cma_buffer_list_add(struct cma *cma, unsigned long pfn,
+ int count, s64 latency)
+ struct cma_buffer *cmabuf;
+ struct stack_trace trace;
+ cmabuf = kmalloc(sizeof(struct cma_buffer), GFP_KERNEL);

cmabuf = kmalloc(sizeof *cmabuf, GFP_KERNEL);

cmabuf = kmalloc(sizeof(*cmabuf), GFP_KERNEL);

+ if (!cmabuf)
+ return -ENOMEM;
+ trace.nr_entries = 0;
+ trace.max_entries = ARRAY_SIZE(cmabuf->trace_entries);
+ trace.entries = &cmabuf->trace_entries[0];
+ trace.skip = 2;
+ save_stack_trace(&trace);
+ cmabuf->pfn = pfn;
+ cmabuf->count = count;
+ cmabuf->pid = task_pid_nr(current);
+ cmabuf->nr_entries = trace.nr_entries;
+ get_task_comm(cmabuf->comm, current);
+ cmabuf->latency = (unsigned int) div_s64(latency, NSEC_PER_USEC);
+ mutex_lock(&cma->list_lock);
+ list_add_tail(&cmabuf->list, &cma->buffers_list);
+ mutex_unlock(&cma->list_lock);
+ return 0;

Is it ok if the information is too big?
I'm not sure but I remember that seq_printf has 4K limitation.

So I made seq_operations with seq_list_start/next functions.


static void *debug_seq_start(struct seq_file *s, loff_t *pos)
 return seq_list_start(&debug_list, *pos);

static void debug_seq_stop(struct seq_file *s, void *data)
 struct debug_header *header = data;

 if (header == NULL || &header->head_list == &debug_list) {
  seq_printf(s, "end of info");


static void *debug_seq_next(struct seq_file *s, void *data, loff_t *pos)
 return seq_list_next(data, &debug_list, pos);

static int debug_seq_show(struct seq_file *sfile, void *data)
 struct debug_header *header;
 char *p;

 header= list_entry(data,
   struct debug_header,

 seq_printf(sfile, "print info");
 return 0;
static const struct seq_operations debug_seq_ops = {
 .start = debug_seq_start,
 .next = debug_seq_next,
 .stop = debug_seq_stop,
 .show = debug_seq_show,

You do not have guarantee that CMA deallocations will match allocations
exactly. User may allocate CMA region and then free it chunks. I'm not
saying that the debug code must handle than case but at least I would
like to see a comment describing this shortcoming.

Thanks, I'll fix it. If a number of released pages is less than there
were allocated then the list entry shouldn't be deleted, but it's fields
should be updated.

@@ -361,11 +466,15 @@ struct page *cma_alloc(struct cma *cma, int count, unsigned int align)
unsigned long mask, offset, pfn, start = 0;
unsigned long bitmap_maxno, bitmap_no, bitmap_count;
struct page *page = NULL;
+ struct timespec ts1, ts2;
+ s64 latency;
int ret;

if (!cma || !cma->count)
return NULL;

+ getnstimeofday(&ts1);

If CMA_DEBUG is disabled, you waste time on measuring latency. Either
use #ifdef or IS_ENABLED, e.g.:


Obviously! :)

@@ -413,6 +522,19 @@ struct page *cma_alloc(struct cma *cma, int count, unsigned int align)
start = bitmap_no + mask + 1;

+ getnstimeofday(&ts2);
+ latency = timespec_to_ns(&ts2) - timespec_to_ns(&ts1);
+ if (page) {

if (IS_ENABLED(CMA_DEBUG) && page) {
latency = timespec_to_ns(&ts2) - timespec_to_ns(&ts1);

+ ret = cma_buffer_list_add(cma, pfn, count, latency);

You could also change cma_buffer_list_add to take ts1 as an argument
instead of latency and then latency calculating would be hidden inside
of that function. Initialising ts1 should still be guarded with
IS_ENABLED of course.

if (IS_ENABLED(CMA_DEBUG) && page) {
latency = timespec_to_ns(&ts2) - timespec_to_ns(&ts1);

It seem to me this variant is better readable, thanks.

+ if (ret) {
+ pr_warn("%s(): cma_buffer_list_add() returned %d\n",
+ __func__, ret);
+ cma_release(cma, page, count);
+ page = NULL;

Harsh, but ok, if you want.

Excuse me, maybe you could suggest how to make a nicer fallback?
Or sure OK?

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at