Re: Delaying writes to disk when there's no need

From: Andrew Morton (akpm@digeo.com)
Date: Mon Mar 31 2003 - 20:09:27 EST


Daniel Pittman <daniel@rimspace.net> wrote:
>
> Capturing a real-time video stream from an IEEE1394 DV stream means
> writing a stead 3.5MB per second for two on two and a half hours.
>
> Linux isn't great at this, using the default writeout policy, even as
> recent as 2.5.64. The writer goes OK for a while but, eventually, blocks
> on writeout for long enough to drop a frame -- more than 8/25ths of a
> second.
>
>
> This can be resolved by tuning the default delay before write-out start
> to 5 seconds, down from 30, or by running sync every second, or by doing
> fsync tricks.

Interesting.

Yes, I expect that you could fix that up by altering dirty_background_ratio
and dirty_expire_centisecs.

The problem with fsync() is that it waits on the writeout. You don't want
that to happen - you just want to tell the kernel "I won't be overwriting or
deleting this data". Make the kernel queue up and start the IO but not wait
on its completion.

It is quite appropriate to do this in fadvise(FADV_DONTNEED) - as a
lower-latency fsync(). The app would need to call it once per second or so.

It would also throw away any written-back pagecache inside your (start, len)
which is exactly what your applications wants to happen, so the app should be
calling fadvise _anyway_.

What do you think?

 25-akpm/include/linux/fs.h | 1 +
 25-akpm/mm/fadvise.c | 1 +
 25-akpm/mm/filemap.c | 18 ++++++++++++++++--
 3 files changed, 18 insertions(+), 2 deletions(-)

diff -puN include/linux/fs.h~fadvise-flush-data include/linux/fs.h
--- 25/include/linux/fs.h~fadvise-flush-data Mon Mar 31 17:03:39 2003
+++ 25-akpm/include/linux/fs.h Mon Mar 31 17:03:39 2003
@@ -1112,6 +1112,7 @@ unsigned long invalidate_inode_pages(str
 extern void invalidate_inode_pages2(struct address_space *mapping);
 extern void write_inode_now(struct inode *, int);
 extern int filemap_fdatawrite(struct address_space *);
+extern int filemap_flush(struct address_space *);
 extern int filemap_fdatawait(struct address_space *);
 extern void sync_supers(void);
 extern void sync_filesystems(int wait);
diff -puN mm/fadvise.c~fadvise-flush-data mm/fadvise.c
--- 25/mm/fadvise.c~fadvise-flush-data Mon Mar 31 17:03:39 2003
+++ 25-akpm/mm/fadvise.c Mon Mar 31 17:03:39 2003
@@ -61,6 +61,7 @@ long sys_fadvise64(int fd, loff_t offset
                         ret = 0;
                 break;
         case POSIX_FADV_DONTNEED:
+ filemap_flush(mapping);
                 invalidate_mapping_pages(mapping, offset >> PAGE_CACHE_SHIFT,
                                 (len >> PAGE_CACHE_SHIFT) + 1);
                 break;
diff -puN mm/filemap.c~fadvise-flush-data mm/filemap.c
--- 25/mm/filemap.c~fadvise-flush-data Mon Mar 31 17:03:39 2003
+++ 25-akpm/mm/filemap.c Mon Mar 31 17:03:39 2003
@@ -122,11 +122,11 @@ static inline int sync_page(struct page
  * if a dirty page/buffer is encountered, it must be waited upon, and not just
  * skipped over.
  */
-int filemap_fdatawrite(struct address_space *mapping)
+static int __filemap_fdatawrite(struct address_space *mapping, int sync_mode)
 {
         int ret;
         struct writeback_control wbc = {
- .sync_mode = WB_SYNC_ALL,
+ .sync_mode = sync_mode,
                 .nr_to_write = mapping->nrpages * 2,
         };
 
@@ -140,6 +140,20 @@ int filemap_fdatawrite(struct address_sp
         return ret;
 }
 
+int filemap_fdatawrite(struct address_space *mapping)
+{
+ return __filemap_fdatawrite(mapping, WB_SYNC_ALL);
+}
+
+/*
+ * This is a mostly non-blocking flush. Not suitable for data-integrity
+ * purposes.
+ */
+int filemap_flush(struct address_space *mapping)
+{
+ return __filemap_fdatawrite(mapping, WB_SYNC_NONE);
+}
+
 /**
  * filemap_fdatawait - walk the list of locked pages of the given address
  * space and wait for all of them.

_

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Mon Mar 31 2003 - 22:00:38 EST