
Re: Enabling quota on XFS filesystem with many files hangs

To: xfs@xxxxxxxxxxx
Subject: Re: Enabling quota on XFS filesystem with many files hangs
From: Milan Holzäpfel <listen@xxxxxxxx>
Date: Sun, 28 Oct 2012 13:22:25 +0100
In-reply-to: <508ADE1C.40208@xxxxxxxxxxx>
References: <20121026191540.ca9ee64db2a51e7166b7fadc@xxxxxxxx> <508ADE1C.40208@xxxxxxxxxxx>
On Fri, 26 Oct 2012 14:01:48 -0500
Eric Sandeen <sandeen@xxxxxxxxxxx> wrote:

> On 10/26/12 12:15 PM, Milan Holzäpfel wrote:
> > Hello all, 
> > 
> > I have an XFS filesystem of size 1.2 TiB with 101 GiB free space and 14
> > million inodes in use. 

In the meantime I have deleted 200 GiB of data from that filesystem,
leaving 9.9 million inodes in use. Now, quotacheck just works.

> > With 3.5.7 and 3.6.3, the OOM does not occur. 

Correction: I couldn't boot 3.6.3 because of a regression [1], so I
don't know whether the problem exists with 3.6.3.

> > In dmesg, I find
> > 
> > INFO: task mount:8806 blocked for more than 120 seconds.
> 
> And then what?  Probably a backtrace, right?

Yes, of course. Sorry. Here it is:

Oct 24 15:23:39 bombax kernel: [  221.122875] XFS (dm-3): Mounting Filesystem
Oct 24 15:23:39 bombax kernel: [  221.431585] XFS (dm-3): Ending clean mount
Oct 24 15:23:39 bombax kernel: [  221.445026] XFS (dm-3): Quotacheck needed: Please wait.
Oct 24 15:28:01 bombax kernel: [  482.960045] INFO: task mount:8806 blocked for more than 120 seconds.
Oct 24 15:28:01 bombax kernel: [  482.966422] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 24 15:28:01 bombax kernel: [  482.974281] mount           D ffffffff8180cba0     0  8806   8703 0x00000000
Oct 24 15:28:01 bombax kernel: [  482.974286]  ffff880036be38a8 0000000000000086 ffff880036be3878 ffffffffa042b4e9
Oct 24 15:28:01 bombax kernel: [  482.974290]  ffff880036be3fd8 ffff880036be3fd8 ffff880036be3fd8 0000000000013980
Oct 24 15:28:01 bombax kernel: [  482.974293]  ffffffff81c13440 ffff88007908dc00 ffff880036be3898 7fffffffffffffff
Oct 24 15:28:01 bombax kernel: [  482.974297] Call Trace:
Oct 24 15:28:01 bombax kernel: [  482.974338]  [<ffffffffa042b4e9>] ? xfs_buf_iowait+0xa9/0x100 [xfs]
Oct 24 15:28:01 bombax kernel: [  482.974344]  [<ffffffff81698f59>] schedule+0x29/0x70
Oct 24 15:28:01 bombax kernel: [  482.974347]  [<ffffffff81697675>] schedule_timeout+0x2a5/0x320
Oct 24 15:28:01 bombax kernel: [  482.974373]  [<ffffffffa0486c75>] ? xfs_trans_read_buf+0x265/0x480 [xfs]
Oct 24 15:28:01 bombax kernel: [  482.974395]  [<ffffffffa0459ae7>] ? xfs_btree_check_sblock+0xc7/0x130 [xfs]
Oct 24 15:28:01 bombax kernel: [  482.974398]  [<ffffffff81698daf>] wait_for_common+0xdf/0x180
Oct 24 15:28:01 bombax kernel: [  482.974403]  [<ffffffff8108a280>] ? try_to_wake_up+0x200/0x200
Oct 24 15:28:01 bombax kernel: [  482.974406]  [<ffffffff81698f2d>] wait_for_completion+0x1d/0x20
Oct 24 15:28:01 bombax kernel: [  482.974430]  [<ffffffffa048c8a4>] xfs_qm_flush_one+0x74/0xb0 [xfs]
Oct 24 15:28:01 bombax kernel: [  482.974455]  [<ffffffffa048c830>] ? xfs_qm_dqattach_grouphint+0x90/0x90 [xfs]
Oct 24 15:28:01 bombax kernel: [  482.974479]  [<ffffffffa048c3ae>] xfs_qm_dquot_walk.isra.5+0xde/0x160 [xfs]
Oct 24 15:28:01 bombax kernel: [  482.974505]  [<ffffffffa048e21c>] xfs_qm_quotacheck+0x2bc/0x2e0 [xfs]
Oct 24 15:28:01 bombax kernel: [  482.974529]  [<ffffffffa048e3f4>] xfs_qm_mount_quotas+0x124/0x1b0 [xfs]
Oct 24 15:28:01 bombax kernel: [  482.974554]  [<ffffffffa047b8e5>] xfs_mountfs+0x615/0x6b0 [xfs]
Oct 24 15:28:01 bombax kernel: [  482.974573]  [<ffffffffa043af7d>] xfs_fs_fill_super+0x21d/0x2b0 [xfs]
Oct 24 15:28:01 bombax kernel: [  482.974577]  [<ffffffff81189996>] mount_bdev+0x1c6/0x210
Oct 24 15:28:01 bombax kernel: [  482.974597]  [<ffffffffa043ad60>] ? xfs_parseargs+0xb80/0xb80 [xfs]
Oct 24 15:28:01 bombax kernel: [  482.974616]  [<ffffffffa0439025>] xfs_fs_mount+0x15/0x20 [xfs]
Oct 24 15:28:01 bombax kernel: [  482.974620]  [<ffffffff8118a7d3>] mount_fs+0x43/0x1b0
Oct 24 15:28:01 bombax kernel: [  482.974624]  [<ffffffff811a4ab6>] vfs_kern_mount+0x76/0x120
Oct 24 15:28:01 bombax kernel: [  482.974628]  [<ffffffff811a5424>] do_kern_mount+0x54/0x110
Oct 24 15:28:01 bombax kernel: [  482.974631]  [<ffffffff811a7114>] do_mount+0x1a4/0x260
Oct 24 15:28:01 bombax kernel: [  482.974634]  [<ffffffff811a75f0>] sys_mount+0x90/0xe0
Oct 24 15:28:01 bombax kernel: [  482.974637]  [<ffffffff816a26e9>] system_call_fastpath+0x16/0x1b

> sysrq-w to get hung tasks or sysrq-t to get all task traces might
> help.
> 
> The sysrqs are one of the things suggested in:
> http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F

I had prepared all the information mentioned there, including the
trace-cmd report, but then I noticed that the problem had disappeared
(see above).

If you are really interested in the information, I could try to move
the 200 GiB back onto the filesystem and see whether the problem
reappears.


Here is some more information on the system:

Linux bombax 3.5.7-030507-generic #201210130556 SMP Sat Oct 13 09:57:36 UTC 
2012 x86_64 x86_64 x86_64 GNU/Linux
xfs_repair version 3.1.7
2 CPUs

Storage layers, from bottom to top:
- mdadm RAID-5, 256 KiB chunk size, on sd[abcd]8
- block-device encryption with cryptsetup/LUKS
- XFS file system with the quotacheck problem

(no LVM below the XFS file system; / is ext4 on LVM on mdadm RAID-1)
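
For reference, the stack above would be brought up along these lines
before mounting. This is only a dry-run sketch that prints the steps
(they all need root and the real devices): the md device name, the
partition list, and the uquota option are my assumptions, since only
the layer order, chunk size, and logbsize are given here; the mapper
name r5a-decrypt is from the xfs_info output further down.

```shell
#!/bin/sh
# Dry-run sketch of assembling the storage stack described above.
# /dev/md0 and the uquota mount option are assumptions.
steps='mdadm --assemble /dev/md0 /dev/sda8 /dev/sdb8 /dev/sdc8 /dev/sdd8
cryptsetup luksOpen /dev/md0 r5a-decrypt
mount -o logbsize=256k,uquota /dev/mapper/r5a-decrypt /mnt/data'
printf '%s\n' "$steps"
```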

disks: 4x SATA, 3.0 Gbps, NCQ enabled
hdparm -W says: "write-caching =  1 (on)" on all drives
no battery-backed write cache

mount options: logbsize=256k
xfs_info: 

meta-data=/dev/mapper/r5a-decrypt isize=256    agcount=32, agsize=9827264 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=314472448, imaxpct=5
         =                       sunit=64     swidth=192 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=153600, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
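
As a sanity check on the geometry above: with bsize=4096, sunit=64
blocks matches the 256 KiB mdadm chunk size, and swidth=192 blocks
corresponds to 3 data disks (4-disk RAID-5, one disk's worth of
parity). The arithmetic:

```shell
#!/bin/sh
# Cross-check the xfs_info geometry against the RAID layout above.
bsize=4096   # data block size from xfs_info
sunit=64     # stripe unit, in blocks
swidth=192   # stripe width, in blocks
echo "stripe unit: $((sunit * bsize / 1024)) KiB"  # matches the 256 KiB mdadm chunk
echo "data disks:  $((swidth / sunit))"            # 3 = 4-disk RAID-5 minus parity
```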


Regards,
Milan Holzäpfel

[1]: https://lkml.org/lkml/2012/10/11/155



-- 
Milan Holzäpfel <listen@xxxxxxxx>
