Re: [Ocfs2-devel] [PATCH 01/35] jbd2: Add buffer triggers

From: Mark Fasheh
Date: Wed Dec 31 2008 - 13:16:39 EST


Hey Ted,

I just wanted to double check that the latest version of this patch
is ok with you before I send it off to Linus. IMHO, it should be fine - the
previous version got your Acked-by, and this just represents the final,
bug-fixed product. Still, it's good to make sure we're all on the same page :)

Thanks,
--Mark

On Thu, Dec 25, 2008 at 10:04:16AM -0800, Mark Fasheh wrote:
> From: Joel Becker <joel.becker@xxxxxxxxxx>
>
> Filesystems often to do compute intensive operation on some
> metadata. If this operation is repeated many times, it can be very
> expensive. It would be much nicer if the operation could be performed
> once before a buffer goes to disk.
>
> This adds triggers to jbd2 buffer heads. Just before writing a metadata
> buffer to the journal, jbd2 will optionally call a commit trigger associated
> with the buffer. If the journal is aborted, an abort trigger will be
> called on any dirty buffers as they are dropped from pending
> transactions.
>
> ocfs2 will use this feature.
>
> Initially I tried to come up with a more generic trigger that could be
> used for non-buffer-related events like transaction completion. It
> doesn't tie nicely, because the information a buffer trigger needs
> (specific to a journal_head) isn't the same as what a transaction
> trigger needs (specific to a tranaction_t or perhaps journal_t). So I
> implemented a buffer set, with the understanding that
> journal/transaction wide triggers should be implemented separately.
>
> There is only one trigger set allowed per buffer. I can't think of any
> reason to attach more than one set. Contrast this with a journal or
> transaction in which multiple places may want to watch the entire
> transaction separately.
>
> The trigger sets are considered static allocation from the jbd2
> perspective. ocfs2 will just have one trigger set per block type,
> setting the same set on every bh of the same type.
>
> Signed-off-by: Joel Becker <joel.becker@xxxxxxxxxx>
> Cc: "Theodore Ts'o" <tytso@xxxxxxx>
> Cc: <linux-ext4@xxxxxxxxxxxxxxx>
> Signed-off-by: Mark Fasheh <mfasheh@xxxxxxxx>
> ---
> fs/jbd2/commit.c | 9 ++++++++
> fs/jbd2/journal.c | 19 +++++++++++++++++
> fs/jbd2/transaction.c | 47 ++++++++++++++++++++++++++++++++++++++++++
> include/linux/jbd2.h | 31 +++++++++++++++++++++++++++
> include/linux/journal-head.h | 8 +++++++
> 5 files changed, 114 insertions(+), 0 deletions(-)
>
> diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
> index ebc667b..c8a1bac 100644
> --- a/fs/jbd2/commit.c
> +++ b/fs/jbd2/commit.c
> @@ -509,6 +509,10 @@ void jbd2_journal_commit_transaction(journal_t *journal)
> if (is_journal_aborted(journal)) {
> clear_buffer_jbddirty(jh2bh(jh));
> JBUFFER_TRACE(jh, "journal is aborting: refile");
> + jbd2_buffer_abort_trigger(jh,
> + jh->b_frozen_data ?
> + jh->b_frozen_triggers :
> + jh->b_triggers);
> jbd2_journal_refile_buffer(journal, jh);
> /* If that was the last one, we need to clean up
> * any descriptor buffers which may have been
> @@ -844,6 +848,9 @@ restart_loop:
> * data.
> *
> * Otherwise, we can just throw away the frozen data now.
> + *
> + * We also know that the frozen data has already fired
> + * its triggers if they exist, so we can clear that too.
> */
> if (jh->b_committed_data) {
> jbd2_free(jh->b_committed_data, bh->b_size);
> @@ -851,10 +858,12 @@ restart_loop:
> if (jh->b_frozen_data) {
> jh->b_committed_data = jh->b_frozen_data;
> jh->b_frozen_data = NULL;
> + jh->b_frozen_triggers = NULL;
> }
> } else if (jh->b_frozen_data) {
> jbd2_free(jh->b_frozen_data, bh->b_size);
> jh->b_frozen_data = NULL;
> + jh->b_frozen_triggers = NULL;
> }
>
> spin_lock(&journal->j_list_lock);
> diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
> index e70d657..f6bff9d 100644
> --- a/fs/jbd2/journal.c
> +++ b/fs/jbd2/journal.c
> @@ -50,6 +50,7 @@ EXPORT_SYMBOL(jbd2_journal_unlock_updates);
> EXPORT_SYMBOL(jbd2_journal_get_write_access);
> EXPORT_SYMBOL(jbd2_journal_get_create_access);
> EXPORT_SYMBOL(jbd2_journal_get_undo_access);
> +EXPORT_SYMBOL(jbd2_journal_set_triggers);
> EXPORT_SYMBOL(jbd2_journal_dirty_metadata);
> EXPORT_SYMBOL(jbd2_journal_release_buffer);
> EXPORT_SYMBOL(jbd2_journal_forget);
> @@ -290,6 +291,7 @@ int jbd2_journal_write_metadata_buffer(transaction_t *transaction,
> struct page *new_page;
> unsigned int new_offset;
> struct buffer_head *bh_in = jh2bh(jh_in);
> + struct jbd2_buffer_trigger_type *triggers;
>
> /*
> * The buffer really shouldn't be locked: only the current committing
> @@ -314,13 +316,23 @@ repeat:
> done_copy_out = 1;
> new_page = virt_to_page(jh_in->b_frozen_data);
> new_offset = offset_in_page(jh_in->b_frozen_data);
> + triggers = jh_in->b_frozen_triggers;
> } else {
> new_page = jh2bh(jh_in)->b_page;
> new_offset = offset_in_page(jh2bh(jh_in)->b_data);
> + triggers = jh_in->b_triggers;
> }
>
> mapped_data = kmap_atomic(new_page, KM_USER0);
> /*
> + * Fire any commit trigger. Do this before checking for escaping,
> + * as the trigger may modify the magic offset. If a copy-out
> + * happens afterwards, it will have the correct data in the buffer.
> + */
> + jbd2_buffer_commit_trigger(jh_in, mapped_data + new_offset,
> + triggers);
> +
> + /*
> * Check for escaping
> */
> if (*((__be32 *)(mapped_data + new_offset)) ==
> @@ -352,6 +364,13 @@ repeat:
> new_page = virt_to_page(tmp);
> new_offset = offset_in_page(tmp);
> done_copy_out = 1;
> +
> + /*
> + * This isn't strictly necessary, as we're using frozen
> + * data for the escaping, but it keeps consistency with
> + * b_frozen_data usage.
> + */
> + jh_in->b_frozen_triggers = jh_in->b_triggers;
> }
>
> /*
> diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
> index 39b7805..4f925a4 100644
> --- a/fs/jbd2/transaction.c
> +++ b/fs/jbd2/transaction.c
> @@ -741,6 +741,12 @@ done:
> source = kmap_atomic(page, KM_USER0);
> memcpy(jh->b_frozen_data, source+offset, jh2bh(jh)->b_size);
> kunmap_atomic(source, KM_USER0);
> +
> + /*
> + * Now that the frozen data is saved off, we need to store
> + * any matching triggers.
> + */
> + jh->b_frozen_triggers = jh->b_triggers;
> }
> jbd_unlock_bh_state(bh);
>
> @@ -944,6 +950,47 @@ out:
> }
>
> /**
> + * void jbd2_journal_set_triggers() - Add triggers for commit writeout
> + * @bh: buffer to trigger on
> + * @type: struct jbd2_buffer_trigger_type containing the trigger(s).
> + *
> + * Set any triggers on this journal_head. This is always safe, because
> + * triggers for a committing buffer will be saved off, and triggers for
> + * a running transaction will match the buffer in that transaction.
> + *
> + * Call with NULL to clear the triggers.
> + */
> +void jbd2_journal_set_triggers(struct buffer_head *bh,
> + struct jbd2_buffer_trigger_type *type)
> +{
> + struct journal_head *jh = bh2jh(bh);
> +
> + jh->b_triggers = type;
> +}
> +
> +void jbd2_buffer_commit_trigger(struct journal_head *jh, void *mapped_data,
> + struct jbd2_buffer_trigger_type *triggers)
> +{
> + struct buffer_head *bh = jh2bh(jh);
> +
> + if (!triggers || !triggers->t_commit)
> + return;
> +
> + triggers->t_commit(triggers, bh, mapped_data, bh->b_size);
> +}
> +
> +void jbd2_buffer_abort_trigger(struct journal_head *jh,
> + struct jbd2_buffer_trigger_type *triggers)
> +{
> + if (!triggers || !triggers->t_abort)
> + return;
> +
> + triggers->t_abort(triggers, jh2bh(jh));
> +}
> +
> +
> +
> +/**
> * int jbd2_journal_dirty_metadata() - mark a buffer as containing dirty metadata
> * @handle: transaction to add buffer to.
> * @bh: buffer to mark
> diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
> index f366457..3445647 100644
> --- a/include/linux/jbd2.h
> +++ b/include/linux/jbd2.h
> @@ -1008,6 +1008,35 @@ int __jbd2_journal_clean_checkpoint_list(journal_t *journal);
> int __jbd2_journal_remove_checkpoint(struct journal_head *);
> void __jbd2_journal_insert_checkpoint(struct journal_head *, transaction_t *);
>
> +
> +/*
> + * Triggers
> + */
> +
> +struct jbd2_buffer_trigger_type {
> + /*
> + * Fired just before a buffer is written to the journal.
> + * mapped_data is a mapped buffer that is the frozen data for
> + * commit.
> + */
> + void (*t_commit)(struct jbd2_buffer_trigger_type *type,
> + struct buffer_head *bh, void *mapped_data,
> + size_t size);
> +
> + /*
> + * Fired during journal abort for dirty buffers that will not be
> + * committed.
> + */
> + void (*t_abort)(struct jbd2_buffer_trigger_type *type,
> + struct buffer_head *bh);
> +};
> +
> +extern void jbd2_buffer_commit_trigger(struct journal_head *jh,
> + void *mapped_data,
> + struct jbd2_buffer_trigger_type *triggers);
> +extern void jbd2_buffer_abort_trigger(struct journal_head *jh,
> + struct jbd2_buffer_trigger_type *triggers);
> +
> /* Buffer IO */
> extern int
> jbd2_journal_write_metadata_buffer(transaction_t *transaction,
> @@ -1046,6 +1075,8 @@ extern int jbd2_journal_extend (handle_t *, int nblocks);
> extern int jbd2_journal_get_write_access(handle_t *, struct buffer_head *);
> extern int jbd2_journal_get_create_access (handle_t *, struct buffer_head *);
> extern int jbd2_journal_get_undo_access(handle_t *, struct buffer_head *);
> +void jbd2_journal_set_triggers(struct buffer_head *,
> + struct jbd2_buffer_trigger_type *type);
> extern int jbd2_journal_dirty_metadata (handle_t *, struct buffer_head *);
> extern void jbd2_journal_release_buffer (handle_t *, struct buffer_head *);
> extern int jbd2_journal_forget (handle_t *, struct buffer_head *);
> diff --git a/include/linux/journal-head.h b/include/linux/journal-head.h
> index bb70ebb..525aac3 100644
> --- a/include/linux/journal-head.h
> +++ b/include/linux/journal-head.h
> @@ -12,6 +12,8 @@
>
> typedef unsigned int tid_t; /* Unique transaction ID */
> typedef struct transaction_s transaction_t; /* Compound transaction type */
> +
> +
> struct buffer_head;
>
> struct journal_head {
> @@ -87,6 +89,12 @@ struct journal_head {
> * [j_list_lock]
> */
> struct journal_head *b_cpnext, *b_cpprev;
> +
> + /* Trigger type */
> + struct jbd2_buffer_trigger_type *b_triggers;
> +
> + /* Trigger type for the committing transaction's frozen data */
> + struct jbd2_buffer_trigger_type *b_frozen_triggers;
> };
>
> #endif /* JOURNAL_HEAD_H_INCLUDED */
> --
> 1.5.6
>
>
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel@xxxxxxxxxxxxxx
> http://oss.oracle.com/mailman/listinfo/ocfs2-devel
--
Mark Fasheh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/