On Sun, Oct 28, 2012 at 01:20:50PM +0100, Milan Holzäpfel wrote:
> On Fri, 26 Oct 2012 14:01:48 -0500
> Eric Sandeen <sandeen@xxxxxxxxxxx> wrote:
> > On 10/26/12 12:15 PM, Milan Holzäpfel wrote:
> > > Hello all,
> > >
> > > I have an XFS filesystem of size 1.2 TiB with 101 GiB free space and 14
> > > million inodes in use.
> Meanwhile, I deleted 200 GiB of data on that filesystem, with 9.9
> million inodes still in use. Now, quotacheck just works.
IOWs, the problem is load related.
> XFS (dm-3): Mounting Filesystem
> XFS (dm-3): Ending clean mount
> XFS (dm-3): Quotacheck needed: Please wait.
> INFO: task mount:8806 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> mount D ffffffff8180cba0 0 8806 8703 0x00000000
> ffff880036be38a8 0000000000000086 ffff880036be3878 ffffffffa042b4e9
> ffff880036be3fd8 ffff880036be3fd8 ffff880036be3fd8 0000000000013980
> ffffffff81c13440 ffff88007908dc00 ffff880036be3898 7fffffffffffffff
> Call Trace:
> [<ffffffffa042b4e9>] ? xfs_buf_iowait+0xa9/0x100 [xfs]
> [<ffffffff81698f59>] schedule+0x29/0x70
> [<ffffffff81697675>] schedule_timeout+0x2a5/0x320
> [<ffffffffa0486c75>] ? xfs_trans_read_buf+0x265/0x480 [xfs]
> [<ffffffffa0459ae7>] ? xfs_btree_check_sblock+0xc7/0x130 [xfs]
> [<ffffffff81698daf>] wait_for_common+0xdf/0x180
> [<ffffffff8108a280>] ? try_to_wake_up+0x200/0x200
> [<ffffffff81698f2d>] wait_for_completion+0x1d/0x20
> [<ffffffffa048c8a4>] xfs_qm_flush_one+0x74/0xb0 [xfs]
It's waiting for a write IO to complete - it seems unlikely that XFS
is the cause here because it's waiting on the storage to complete an
IO.
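To make the reading of that trace concrete, here is a small Python sketch that separates the frames the kernel's unwinder is sure about from the ones marked "?", which are addresses found on the stack that may be stale. The helper name reliable_frames and the regular expression are illustrative assumptions, not part of any XFS or kernel tooling; the sample lines are copied from the quoted report.

```python
import re

# Trace lines copied from the quoted hung-task report.
TRACE = """\
[<ffffffffa042b4e9>] ? xfs_buf_iowait+0xa9/0x100 [xfs]
[<ffffffff81698f59>] schedule+0x29/0x70
[<ffffffff81697675>] schedule_timeout+0x2a5/0x320
[<ffffffffa0486c75>] ? xfs_trans_read_buf+0x265/0x480 [xfs]
[<ffffffffa0459ae7>] ? xfs_btree_check_sblock+0xc7/0x130 [xfs]
[<ffffffff81698daf>] wait_for_common+0xdf/0x180
[<ffffffff8108a280>] ? try_to_wake_up+0x200/0x200
[<ffffffff81698f2d>] wait_for_completion+0x1d/0x20
[<ffffffffa048c8a4>] xfs_qm_flush_one+0x74/0xb0 [xfs]
"""

# Hypothetical pattern: address, optional "?" marker, then symbol+offset/size.
FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([A-Za-z0-9_.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def reliable_frames(trace):
    """Return symbol names of frames that are not marked with '?'."""
    frames = []
    for line in trace.splitlines():
        match = FRAME.search(line)
        if match and match.group(1) is None:
            frames.append(match.group(2))
    return frames


print(reliable_frames(TRACE))
```

Read innermost first, the remaining frames show xfs_qm_flush_one calling wait_for_completion and then sleeping in the scheduler, i.e. the task is parked waiting for something else to signal completion.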
> Here is some more information on the system:
> Linux bombax 3.5.7-030507-generic #201210130556 SMP Sat Oct 13 09:57:36 UTC
> 2012 x86_64 x86_64 x86_64 GNU/Linux
> xfs_repair version 3.1.7
> 2 CPUs
> Storage layers are:
> mdadm RAID-5 256 KiB chunk size on sd[abcd]8
> Block-device encryption with cryptsetup-luks
> XFS file system with the quotacheck problem
... and that is an unusual configuration and says to me that the
storage under XFS is the likely problem....
> disks: 4x SATA, 3.0 Gbps, NCQ enabled
> hdparm -W says: "write-caching = 1 (on)" on all drives
> no battery-backed write cache
And slow SATA drives will not improve the situation, either. The
small random writes that quotacheck issues will cause lots of
read-modify-write (RMW) cycles on the software RAID-5, and hence
will be very slow. This alone can trigger hung task warnings. When
you add encryption, the storage stack will be even slower.
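As a back-of-the-envelope illustration of why small writes hurt on this layout, here is a Python sketch of the textbook RAID-5 small-write cost. The function name raid5_write_cost is hypothetical, and the model is an assumption: it considers only an aligned write that either fits inside one chunk or covers exactly one full stripe, with nothing already cached. What md actually does (for example choosing reconstruct-write, or getting hits in its stripe cache) may differ.

```python
def raid5_write_cost(n_disks, chunk_kib, write_kib):
    """Estimate disk IOs for one aligned write on RAID-5 (simplified model)."""
    data_disks = n_disks - 1              # one chunk per stripe holds parity
    stripe_kib = data_disks * chunk_kib   # data capacity of a full stripe

    if write_kib == stripe_kib:
        # Full-stripe write: parity is computed from the new data alone,
        # so nothing has to be read back first.
        return {"stripe_kib": stripe_kib, "reads": 0, "writes": n_disks}

    if write_kib > chunk_kib:
        raise ValueError("sketch only models sub-chunk or full-stripe writes")

    # Read-modify-write: read old data and old parity, then write
    # new data and new parity.
    return {"stripe_kib": stripe_kib, "reads": 2, "writes": 2}


# Layout from the report: 4 disks, 256 KiB chunks; a 4 KiB metadata write.
print(raid5_write_cost(4, 256, 4))
# The same array when a whole 768 KiB stripe is written at once.
print(raid5_write_cost(4, 256, 768))
```

Under these assumptions a 4 KiB write turns into two reads plus two writes, and the reads must finish before the writes can be issued, which is where the latency comes from.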
If you can reproduce it, I'd be really interested to know what the
sysrq-w output shows, as it will probably indicate a dm-crypt or md
thread hung waiting for something else to occur....
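For reference, one way to request that dump from a script is sketched below in Python. It writes "w" to /proc/sysrq-trigger, which asks the kernel to log the tasks in uninterruptible (blocked) state; it is the equivalent of "echo w > /proc/sysrq-trigger". The function name and the trigger_path parameter are illustrative assumptions. It needs root, and the output lands in the kernel log (read it with dmesg), not on stdout.

```python
def request_blocked_task_dump(trigger_path="/proc/sysrq-trigger"):
    """Ask the kernel to dump blocked tasks (sysrq-w) to the kernel log."""
    # The path is a parameter only so the sketch can be tried against a
    # scratch file; on a real system leave it at the default and run as root.
    with open(trigger_path, "w") as trigger:
        trigger.write("w")


# Usage while the mount is hung (as root), then inspect the kernel log:
#   request_blocked_task_dump()
#   ... run `dmesg` and look for the md and dm-crypt threads
```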