On Mon, Jun 23, 2014 at 05:48:31PM -0400, Michael L. Semon wrote:
> At the ACL limit of v5-superblock XFS--with a directory filled with both
> default and access ACL entries--I'm getting a null pointer dereference on x86 after
> creating the directory successfully.
> Disclaimer: There are some current issues on 32-bit x86 that, for instance,
> make badblocks see phantom bad blocks on a read test. My apologies in
> advance if this turns out to be a false alarm bug report.
> My first encounter with this issue involved fsstress. Here's part of a
> session from the fsstress run.
Ok, I haven't been able to reproduce this on x86-64....
> # ### ran `fsstress -d $SCRATCH_MNT/test-dir -n 10000 -p 16`
> [ 1789.338622] BUG: unable to handle kernel NULL pointer dereference at
> [ 1789.338842] IP: [<c1263048>] xfs_ail_check+0x58/0xc0
Hmmm - xfs_ail_check() is checking the LSN ordering of the items on
the AIL, and it's crashed trying to dereference one of the list
pointers on the current log item.
> [ 1789.339042] [<c12630c3>] xfs_ail_delete+0x13/0x60
> [ 1789.339042] [<c1263d1d>] xfs_trans_ail_update_bulk+0xad/0x3c0
> [ 1789.339042] [<c11fbd35>] xfs_trans_committed_bulk+0x255/0x300
> [ 1789.339042] [<c125dcac>] xlog_cil_committed+0x3c/0x160
And given that it is doing an update, I suspect a problem with
the XFS_LI_IN_AIL flag - that the item is not on the AIL, but has
that flag set.
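
To make the suspected failure mode concrete, here is a minimal userspace
sketch of an ordering check on a sentinel-headed, LSN-sorted list. It is
an illustration only, not the kernel code: `model_item`, `MODEL_IN_AIL`,
`model_insert_after()` and `model_ail_check()` are hypothetical names, and
the assumption is that an item whose "in AIL" flag is set but whose list
links were never set up is what an unguarded neighbour check would trip
over.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; these are not the kernel's structures. */
#define MODEL_IN_AIL 0x1u

struct model_item {
    struct model_item *prev;   /* neighbour with lower-or-equal LSN */
    struct model_item *next;   /* neighbour with higher-or-equal LSN */
    uint64_t lsn;              /* log sequence number of this item */
    unsigned flags;            /* MODEL_IN_AIL when linked into the list */
};

enum model_result {
    MODEL_OK = 0,
    MODEL_NOT_IN_AIL,          /* flag clear: nothing to check */
    MODEL_STALE_FLAG,          /* flag set, but the item is not linked */
    MODEL_BAD_ORDER            /* a neighbour violates LSN ordering */
};

/* Link 'it' directly after 'pos' in a circular list and mark it as listed. */
static void model_insert_after(struct model_item *pos, struct model_item *it)
{
    it->prev = pos;
    it->next = pos->next;
    pos->next->prev = it;
    pos->next = it;
    it->flags |= MODEL_IN_AIL;
}

/*
 * Check that 'it' is ordered correctly against both neighbours.
 * 'head' is the list sentinel and carries no meaningful LSN.
 */
static enum model_result model_ail_check(const struct model_item *head,
                                         const struct model_item *it)
{
    if (!(it->flags & MODEL_IN_AIL))
        return MODEL_NOT_IN_AIL;

    /*
     * Flag says "on the list" but the links are missing. A check that
     * trusted the flag would dereference a NULL neighbour here.
     */
    if (it->prev == NULL || it->next == NULL)
        return MODEL_STALE_FLAG;

    if (it->prev != head && it->prev->lsn > it->lsn)
        return MODEL_BAD_ORDER;
    if (it->next != head && it->next->lsn < it->lsn)
        return MODEL_BAD_ORDER;

    return MODEL_OK;
}
```

In this model the stale-flag case is reported instead of crashing; the
point is only that the flag and the list linkage have to agree before a
neighbour's LSN can safely be read.
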
Can you enable the xfs_ail* tracepoints, set
/proc/sys/kernel/ftrace_dump_on_oops and rerun the test? That should
dump the trace buffer into the kernel dmesg output showing AIL
operations just before the crash occurs. That might tell us what has