On 06/11/12 15:45, Ben Myers wrote:
> That sounds pretty good. In particular, I think that making the start
> and stop of the workqueues correct should be the high priority. I'm not
> as concerned about the accuracy of the names, or cleaning up xfs_sync.c
> and xfs_iget.c, but cleanups are worth doing too.
>
> I hit a crash related to the xfslogd workqueue awhile back. Mark has
> taken it up, so there might be a little coordination to do with him.
>
> To not leave a teaser out there:
>
> PID: 25879 TASK: ffff88012ac20340 CPU: 3 COMMAND: "kworker/3:3"
> #0 [ffff8801a72af920] machine_kexec at ffffffff810244e9
> #1 [ffff8801a72af990] crash_kexec at ffffffff8108d053
> #2 [ffff8801a72afa60] oops_end at ffffffff813ad1b8
> #3 [ffff8801a72afa90] no_context at ffffffff8102bd48
> #4 [ffff8801a72afae0] __bad_area_nosemaphore at ffffffff8102c04d
> #5 [ffff8801a72afb30] bad_area_nosemaphore at ffffffff8102c12e
> #6 [ffff8801a72afb40] do_page_fault at ffffffff813afaee
> #7 [ffff8801a72afc50] page_fault at ffffffff813ac635
>     [exception RIP: xlog_assign_tail_lsn_locked+72]
>     RIP: ffffffffa040da68 RSP: ffff8801a72afd00 RFLAGS: 00010246
>     RAX: 0000000000000000 RBX: 0000000000000000 RCX: dead000000200200
>     RDX: ffff88013b32d550 RSI: dead000000100100 RDI: ffff88013b32d550
>     RBP: ffff8801a72afd10 R8: ffff8801a72ae000 R9: 0000000000000000
>     R10: 0000000000000000 R11: 0000000000000000 R12: ffff88013b32d568
>     R13: 0000000000000001 R14: ffff8801a72afd90 R15: ffff88013b32d540
>     ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> #8 [ffff8801a72afd18] xfs_trans_ail_delete_bulk at ffffffffa0414b2a [xfs]
> #9 [ffff8801a72afd78] xfs_buf_iodone at ffffffffa04119c7 [xfs]
> #10 [ffff8801a72afdb8] xfs_buf_do_callbacks at ffffffffa041166c [xfs]
> #11 [ffff8801a72afdd8] xfs_buf_iodone_callbacks at ffffffffa04117de [xfs]
> #12 [ffff8801a72afdf8] xfs_buf_iodone_work at ffffffffa03ad7e1 [xfs]
> #13 [ffff8801a72afe18] process_one_work at ffffffff8104c53b
> #14 [ffff8801a72afe68] worker_thread at ffffffff8104f0e3
> #15 [ffff8801a72afee8] kthread at ffffffff8105395e
> #16 [ffff8801a72aff48] kernel_thread_helper at ffffffff813b3ae4
I am just digging through that crash. It appears that xfs_umountfs() did
a good job of cleaning up the AIL and the m_ddev_targp, but it needs to
wait for the queued xfslogd work to finish before deallocating the log.
Since workqueues are cheap, maybe it would be smart to have a
per-filesystem xfslogd too.
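
To make the ordering concrete, here is a rough userspace sketch of the
idea, not kernel code and not a proposed patch. It assumes one worker per
mount (a pthread standing in for a per-filesystem xfslogd) and drains that
worker before the log is freed, which is roughly the guarantee that
flush_workqueue() or destroy_workqueue() would give in the kernel. The
names toy_log, toy_mount, toy_iodone_work, toy_mount_start and toy_unmount
are all made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-mount log; not the real struct xlog. */
struct toy_log {
	long tail_lsn;
};

/* Hypothetical stand-in for a mount that owns its own completion worker. */
struct toy_mount {
	struct toy_log *log;
	pthread_t iodone_worker;	/* models a per-filesystem xfslogd */
	int completions;		/* number of queued I/O completions */
};

/* Models buffer I/O completion: every completion touches the log. */
static void *toy_iodone_work(void *arg)
{
	struct toy_mount *mp = arg;
	int i;

	for (i = 0; i < mp->completions; i++)
		mp->log->tail_lsn++;	/* use-after-free if the log is gone */
	return NULL;
}

/* Allocate the log and start the per-mount worker; 0 on success. */
static int toy_mount_start(struct toy_mount *mp, int completions)
{
	mp->log = calloc(1, sizeof(*mp->log));
	if (!mp->log)
		return -1;
	mp->completions = completions;
	if (pthread_create(&mp->iodone_worker, NULL, toy_iodone_work, mp)) {
		free(mp->log);
		mp->log = NULL;
		return -1;
	}
	return 0;
}

/*
 * Teardown in the safe order: wait for the worker to finish all of its
 * completions, and only then free the log. Returns the final tail value.
 */
static long toy_unmount(struct toy_mount *mp)
{
	long final_tail;

	assert(mp->log != NULL);
	pthread_join(mp->iodone_worker, NULL);	/* drain before freeing */
	final_tail = mp->log->tail_lsn;
	free(mp->log);
	mp->log = NULL;
	return final_tail;
}
```

A caller would run toy_mount_start() and later toy_unmount(). Swapping the
pthread_join() and free() calls in toy_unmount() reproduces the kind of
race in the trace above, with a completion running against a log that has
already been torn down. Having the worker owned by the mount is what makes
the drain cheap: unmount only waits for its own filesystem's completions.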