
Re: xfs task blocked for more than 120 seconds

To: xfs@xxxxxxxxxxx
Subject: Re: xfs task blocked for more than 120 seconds
From: Sami Liedes <sami.liedes@xxxxxx>
Date: Tue, 31 Jan 2012 00:35:28 +0200
In-reply-to: <20120130010530.GI15102@dastard>
Mail-followup-to: xfs@xxxxxxxxxxx
References: <20120130002026.GG10174@xxxxxxxxx> <20120130010530.GI15102@dastard>
User-agent: Mutt/1.5.21 (2010-09-15)
On Mon, Jan 30, 2012 at 12:05:30PM +1100, Dave Chinner wrote:
> > * The computer is a Core i7 2600 3.4 GHz with 4 cores and HT
> >   (therefore shows as 8 cores) with 8 GiB main memory. AES-NI
> >   instructions are supported and disk crypto generally (with ext4)
> >   works at transparent speeds.
> 
> That's not to say that ext4 doesn't have long IO hold-offs - it just
> doesn't trigger the hang-check code.

Hmm, maybe. Yet 120 seconds of a blocking syscall somehow sounds quite
long to me. With ext3 I remember seeing those every now and then with
dm-crypt.

> It is definitely a possibility that dm-crypt is not keeping up with
> the IO that XFS is sending it and the way XFS blocks waiting for it
> to complete triggers the hang-check code. However, it is possible
> that XFS is stalling due to long IO completion latencies. Do the
> workloads actually complete, or does the system hang? Also, does the
> IO to the disk appear to stop for long periods, or is the disk 100%
> busy the whole time? If the disk goes idle, can you get a dump of
> the stalled processes via "echo w > /proc/sysrq-trigger" and post
> that?

The workloads do eventually complete. I tried the tar extraction again,
this time reading the tarball from a different disk, and saw no such
warnings (and the time taken seems reasonable at 96 minutes).

The blocked syscalls during the BackupPC backup seem weirder to me. I
don't think the ext4 partition was even mounted at that point, and if
it was, there certainly was no activity on it, i.e. the XFS partition
was the only partition on that disk that saw any I/O. I'll see if I can
find some way to reproduce that and to determine whether the disk goes
idle.
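
One way to check whether the disk goes idle or stays 100% busy would be
to sample /proc/diskstats periodically and compare the "time spent doing
I/Os" counter between samples. The following is only a rough sketch
under some assumptions: the device name "sda" and the 5 second interval
are placeholders (a dm-crypt mapping would show up as dm-N instead), and
the field positions follow the kernel's documented diskstats layout
(I/Os in progress is the 9th statistic, milliseconds doing I/O the 10th).

```python
import time


def parse_diskstats(text):
    """Map device name -> (ios_in_progress, io_ticks_ms) from diskstats text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or unexpectedly short lines
        # fields[0:3] are major, minor, name; the statistics follow.
        stats[fields[2]] = (int(fields[11]), int(fields[12]))
    return stats


def utilization(before, after, interval_s):
    """Percent of the interval the device spent doing I/O."""
    busy_ms = after[1] - before[1]
    return 100.0 * busy_ms / (interval_s * 1000.0)


def read_sample(device):
    with open("/proc/diskstats") as f:
        return parse_diskstats(f.read())[device]


def watch(device="sda", interval_s=5.0):
    """Print one line per interval; 'sda' is a placeholder device name."""
    prev = read_sample(device)
    while True:
        time.sleep(interval_s)
        cur = read_sample(device)
        print("%s util=%5.1f%% in_flight=%d" % (
            time.strftime("%H:%M:%S"),
            utilization(prev, cur, interval_s),
            cur[0]))
        prev = cur


if __name__ == "__main__":
    watch()
```

If the utilization drops to roughly 0% while a hung-task warning shows
up, that would be the moment to take the "echo w > /proc/sysrq-trigger"
dump. A utilization near 100% throughout would instead suggest the disk
(or dm-crypt) simply isn't keeping up.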

        Sami
