When a node fails, its dirty areas get special treatment from other nodes
using the area_resyncing() function. Should the suspend_list be created
before any reads or writes from the file system are processed by md? It
seems to me that gfs journal recovery could read/write to dirty regions
(from the failed node) before md was finished setting up the suspend_list.
md could probably prevent that by using the recover_prep() dlm callback to
set a flag that would block any I/O that arrived before the suspend_list
was ready.
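
To make the idea concrete, here is a rough userspace model of the flag, not
actual md code. Every name in it (cluster_state, recovery_pending, the
model_* functions) is made up for illustration, and a real implementation
would put waiters to sleep and wake them rather than just count them:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model only; none of these names exist in md. */
struct cluster_state {
    bool recovery_pending; /* set at recover_prep time, cleared once the
                              suspend_list has been built */
    size_t deferred;       /* I/Os held back while the flag was set */
};

/* Stand-in for what the dlm recover_prep() callback would do. */
static void model_recover_prep(struct cluster_state *cs)
{
    cs->recovery_pending = true;
}

/* Stand-in for the point where the suspend_list is complete. */
static void model_suspend_list_ready(struct cluster_state *cs)
{
    cs->recovery_pending = false;
    cs->deferred = 0; /* real code would wake or resubmit the waiters */
}

/* Returns true if the I/O may be issued now, false if it has to wait.
 * Once this returns true, the normal area_resyncing() check would still
 * apply to the request. */
static bool model_io_may_proceed(struct cluster_state *cs)
{
    if (cs->recovery_pending) {
        cs->deferred++;
        return false;
    }
    return true;
}
```

The point is only the ordering: the flag goes up as soon as recovery is
announced, so nothing from the file system can reach a dirty region in the
window before the suspend_list exists.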