Re: File system remain unresponsive until the system is rebooted.

To: Linux fs XFS <xfs@xxxxxxxxxxx>
Subject: Re: File system remain unresponsive until the system is rebooted.
From: pg_xf2@xxxxxxxxxxxxxxxxxx (Peter Grandi)
Date: Mon, 30 Jan 2012 17:23:45 +0000
In-reply-to: <CANs4eSBWLc4HxAbPZ8kOVOdJ7RKiA+-ai3Q2J+FAyuzHtUqfdg@xxxxxxxxxxxxxx>
References: <CANs4eSBWLc4HxAbPZ8kOVOdJ7RKiA+-ai3Q2J+FAyuzHtUqfdg@xxxxxxxxxxxxxx>
> We are using RAID-0 volumes as PV's in our LVM stack and XFS
> as the filesystem.

LVM is in general a bad idea, and I have found that it
occasionally interacts not so well with XFS and other
filesystems under resource pressure.

It also seems from one of the backtraces that you are
complicating all this further by running under Xen,
perhaps on sparsely allocated virtual disks.

> [ ... ] The files system remained unresponsive until we
> rebooted the system and again increased the size of the
> filesystem. [ ... ]

Good luck. I know some people who also went the whole VM/LVM/XFS
stack way and had lots of problems. It is what I call the
"syntactic" approach: expecting that every syntactically valid
combination of features is going to work, and work well. Sure it
should :-).

Most of the hangs seem to happen during resource allocation,
and at least one is triggered by the flusher:

> Jan 26 03:05:47 ip-10-0-1-153 kernel: [241565.550853] [<ffffffff8111241b>] bdi_writeback_task+0x4b/0xe0
> Jan 26 03:05:47 ip-10-0-1-153 kernel: [241565.550858] [<ffffffff810c72f0>] ? bdi_start_fn+0x0/0x110
> Jan 26 03:05:47 ip-10-0-1-153 kernel: [241565.550861] [<ffffffff810c7371>] bdi_start_fn+0x81/0x110
> Jan 26 03:05:47 ip-10-0-1-153 kernel: [241565.550863] [<ffffffff810c72f0>] ? bdi_start_fn+0x0/0x110

It could be that there is intense pressure on kernel memory,
which is often due to excessively loose flusher parameters
(the vm.dirty_* sysctls) letting too many dirty pages
accumulate before writeback starts.
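
To check how loose the flusher parameters are on a given host,
something like the following Python sketch may help. It assumes a
Linux system exposing the usual writeback knobs under /proc/sys/vm;
the function names are made up for illustration, and the byte figure
is only a rough upper bound, since the kernel applies dirty_ratio to
available memory rather than to total RAM.

```python
from pathlib import Path

# Writeback ("flusher") tunables exposed under /proc/sys/vm on Linux.
FLUSHER_KNOBS = (
    "dirty_background_ratio",
    "dirty_ratio",
    "dirty_background_bytes",
    "dirty_bytes",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
)


def read_flusher_knobs(root="/proc/sys/vm"):
    """Return {knob: int} for every knob that can be read under root."""
    values = {}
    for name in FLUSHER_KNOBS:
        try:
            values[name] = int((Path(root) / name).read_text().strip())
        except (OSError, ValueError):
            # Knob missing or unreadable on this kernel: skip it.
            continue
    return values


def read_mem_total_bytes(meminfo="/proc/meminfo"):
    """Return MemTotal from /proc/meminfo, converted from kB to bytes."""
    for line in Path(meminfo).read_text().splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) * 1024
    raise ValueError("MemTotal not found in " + str(meminfo))


def rough_dirty_limit_bytes(knobs, mem_total_bytes):
    """Rough upper bound on dirty page cache before writers are throttled.

    Assumption: total RAM stands in for the kernel's notion of
    available memory, so the real threshold is lower than this.
    A non-zero dirty_bytes takes precedence over dirty_ratio.
    """
    if knobs.get("dirty_bytes", 0) > 0:
        return knobs["dirty_bytes"]
    return mem_total_bytes * knobs.get("dirty_ratio", 0) // 100


if __name__ == "__main__":
    knobs = read_flusher_knobs()
    for name, value in sorted(knobs.items()):
        print(f"vm.{name} = {value}")
    limit = rough_dirty_limit_bytes(knobs, read_mem_total_bytes())
    print(f"rough dirty limit: {limit // (1024 * 1024)} MiB")
```

If the rough limit comes out at several gigabytes on a machine
with slow or virtualised storage, lowering the ratio or setting
the *_bytes variants to something the disks can drain in a few
seconds is worth trying.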