Re: [PATCH 2/3] mm: cma: introduce /proc/cmainfo

From: Stefan Strogin
Date: Mon Dec 29 2014 - 09:10:01 EST


Thanks for review MichaÅ,

On 12/26/2014 07:02 PM, Michal Nazarewicz wrote:
On Fri, Dec 26 2014, "Stefan I. Strogin" <s.strogin@xxxxxxxxxxxxxxxxxxx> wrote:
/proc/cmainfo contains a list of currently allocated CMA buffers for every
CMA area when CONFIG_CMA_DEBUG is enabled.

Format is:

<base_phys_addr> - <end_phys_addr> (<size> kB), allocated by <PID>\
(<command name>), latency <allocation latency> us
<stack backtrace when the buffer had been allocated>

Signed-off-by: Stefan I. Strogin <s.strogin@xxxxxxxxxxxxxxxxxxx>
---
mm/cma.c | 202 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 202 insertions(+)

diff --git a/mm/cma.c b/mm/cma.c
index a85ae28..ffaea26 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -347,6 +372,86 @@ err:
return ret;
}

+#ifdef CONFIG_CMA_DEBUG
+/**
+ * cma_buffer_list_add() - add a new entry to a list of allocated buffers
+ * @cma: Contiguous memory region for which the allocation is performed.
+ * @pfn: Base PFN of the allocated buffer.
+ * @count: Number of allocated pages.
+ * @latency: Nanoseconds spent to allocate the buffer.
+ *
+ * This function adds a new entry to the list of allocated contiguous memory
+ * buffers in a CMA area. It uses the CMA area specificated by the device
+ * if available or the default global one otherwise.
+ */
+static int cma_buffer_list_add(struct cma *cma, unsigned long pfn,
+ int count, s64 latency)
+{
+ struct cma_buffer *cmabuf;
+ struct stack_trace trace;
+
+ cmabuf = kmalloc(sizeof(struct cma_buffer), GFP_KERNEL);

cmabuf = kmalloc(sizeof *cmabuf, GFP_KERNEL);

cmabuf = kmalloc(sizeof(*cmabuf), GFP_KERNEL);


+ if (!cmabuf)
+ return -ENOMEM;
+
+ trace.nr_entries = 0;
+ trace.max_entries = ARRAY_SIZE(cmabuf->trace_entries);
+ trace.entries = &cmabuf->trace_entries[0];
+ trace.skip = 2;
+ save_stack_trace(&trace);
+
+ cmabuf->pfn = pfn;
+ cmabuf->count = count;
+ cmabuf->pid = task_pid_nr(current);
+ cmabuf->nr_entries = trace.nr_entries;
+ get_task_comm(cmabuf->comm, current);
+ cmabuf->latency = (unsigned int) div_s64(latency, NSEC_PER_USEC);
+
+ mutex_lock(&cma->list_lock);
+ list_add_tail(&cmabuf->list, &cma->buffers_list);
+ mutex_unlock(&cma->list_lock);
+
+ return 0;
+}
+
+/**
+ * cma_buffer_list_del() - delete an entry from a list of allocated buffers
+ * @cma: Contiguous memory region for which the allocation was performed.
+ * @pfn: Base PFN of the released buffer.
+ *
+ * This function deletes a list entry added by cma_buffer_list_add().
+ */
+static void cma_buffer_list_del(struct cma *cma, unsigned long pfn)
+{
+ struct cma_buffer *cmabuf;
+
+ mutex_lock(&cma->list_lock);
+
+ list_for_each_entry(cmabuf, &cma->buffers_list, list)
+ if (cmabuf->pfn == pfn) {
+ list_del(&cmabuf->list);
+ kfree(cmabuf);
+ goto out;
+ }

You do not have guarantee that CMA deallocations will match allocations
exactly. User may allocate CMA region and then free it chunks. I'm not
saying that the debug code must handle than case but at least I would
like to see a comment describing this shortcoming.

Thanks, I'll fix it. If a number of released pages is less than there
were allocated then the list entry shouldn't be deleted, but it's fields
should be updated.


@@ -361,11 +466,15 @@ struct page *cma_alloc(struct cma *cma, int count, unsigned int align)
unsigned long mask, offset, pfn, start = 0;
unsigned long bitmap_maxno, bitmap_no, bitmap_count;
struct page *page = NULL;
+ struct timespec ts1, ts2;
+ s64 latency;
int ret;

if (!cma || !cma->count)
return NULL;

+ getnstimeofday(&ts1);
+

If CMA_DEBUG is disabled, you waste time on measuring latency. Either
use #ifdef or IS_ENABLED, e.g.:

if (IS_ENABLED(CMA_DEBUG))
getnstimeofday(&ts1);

Obviously! :)


@@ -413,6 +522,19 @@ struct page *cma_alloc(struct cma *cma, int count, unsigned int align)
start = bitmap_no + mask + 1;
}

+ getnstimeofday(&ts2);
+ latency = timespec_to_ns(&ts2) - timespec_to_ns(&ts1);
+
+ if (page) {

if (IS_ENABLED(CMA_DEBUG) && page) {
getnstimeofday(&ts2);
latency = timespec_to_ns(&ts2) - timespec_to_ns(&ts1);

+ ret = cma_buffer_list_add(cma, pfn, count, latency);

You could also change cma_buffer_list_add to take ts1 as an argument
instead of latency and then latency calculating would be hidden inside
of that function. Initialising ts1 should still be guarded with
IS_ENABLED of course.

if (IS_ENABLED(CMA_DEBUG) && page) {
getnstimeofday(&ts2);
latency = timespec_to_ns(&ts2) - timespec_to_ns(&ts1);

It seem to me this variant is better readable, thanks.


+ if (ret) {
+ pr_warn("%s(): cma_buffer_list_add() returned %d\n",
+ __func__, ret);
+ cma_release(cma, page, count);
+ page = NULL;

Harsh, but ok, if you want.

Excuse me, maybe you could suggest how to make a nicer fallback?
Or sure OK?

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/