
To: linux-xfs@xxxxxxxxxxx
Subject: deadlocks on ENOSPC
From: ASANO Masahiro <masano@xxxxxxxxxxxxxx>
Date: Fri, 15 Jul 2005 15:07:17 +0900 (JST)
Sender: linux-xfs-bounce@xxxxxxxxxxx
Hi,

I've been investigating a deadlock that occurs on an XFS filesystem at
ENOSPC.  The problem is reproducible with the following method:

  1.  Create some files to fill an XFS filesystem, leaving 80MB free.
  2.  Execute dd.sh, which spawns 10 `dd' loops.  Each dd writes 16MB,
      so the total is 160MB against 80MB free.

8<------8<------ dd.sh
#!/bin/sh
for i in `seq 10`
do
( while :; do dd if=/dev/zero of=F$i bs=1024 count=16384 > /dev/null 2>&1; done 
) &
done
8<------8<------ dd.sh

  3.  Wait a minute, then two (or more) processes will be deadlocked
      in `D' state.  Their WCHAN is `text.l'.

I tested on a HyperThreading Pentium 4 box with Linux 2.6.13-rc[123] +
TAKE 938502, but I guess older versions also have the same flaw.
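Step 1 above can be sketched as follows.  This is a minimal sketch, not part
of the original report: the mount point variable MNT and the `filler' file
name are assumptions (the filesystem in this report was mounted at /opt),
and the dd command is printed rather than executed so the figures can be
checked first.

```shell
#!/bin/sh
# Sketch of step 1: fill the filesystem at $MNT, leaving ~80MB free.
# MNT and the `filler' name are assumptions; this report used /opt.
MNT=${MNT:-.}
avail_kb=$(df -P "$MNT" | awk 'NR==2 {print $4}')  # free space in 1K blocks
fill_kb=$((avail_kb - 81920))                      # leave 80MB (81920 KB)
# Printed, not executed, so the numbers can be sanity-checked first:
echo dd if=/dev/zero of="$MNT/filler" bs=1024 count="$fill_kb"
```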

Here are the kernel backtraces of the two stuck processes.

    ADDR S   PID  SESS   UID  EUID       MM NAME             FLAGS
df112530 U  1376     0     0     0        0 xfssyncd         forknoexec fstrans randomize
 ded6b588  c03ee853  schedule+6f3 ()
[ded6b5fc] c03edf65  __down+75 (decef93c,decef93c,ded6b654)
[ded6b634] c03ee0f2  __down_failed+a ()
[ded6b644] c02884de  [.text.lock.xfs_buf+1f]
[ded6b644] c0287034  pagebuf_lock+34 (d3215abc,14005,de2e11fc,0)
[ded6b658] c0286811  _pagebuf_find+161 (df6a0280,4841ad1,0,200)
[ded6b690] c02868ff  xfs_buf_get_flags+6f (df6a0280,4841ad1,0,1)
[ded6b6c4] c0286a22  xfs_buf_read_flags+32 (df6a0280,4841ad1,0,1)
[ded6b6e8] c0277e31  xfs_trans_read_buf+211 (dedde400,c9d74730,df6a0280,4841ad1)
[ded6b718] c0223e03  xfs_alloc_read_agf+a3 (dedde400,c9d74730,a,0)
[ded6b75c] c0223a39  xfs_alloc_fix_freelist+449 (ded6b97c,0,0,0)
[ded6b804] c0224285  xfs_alloc_vextent+345 (ded6b97c,ded6b8f0,0,ae71d5)
[ded6b868] c02346ba  xfs_bmap_alloc+15ca (ded6bb34,ded6baf4,0,0)
[ded6b9dc] c02389ef  xfs_bmapi+d1f (c9d74730,d0d64d20,7f1,0)
[ded6bb84] c0264d54  xfs_iomap_write_allocate+2b4 (d0d64d20,7f1000,0,1000)
[ded6bc74] c02639f0  xfs_iomap+460 (d0d64dfc,7f1000,0,1000)
[ded6bd00] c028d9d1  xfs_bmap+41 (d0d64d40,7f1000,0,1000)
[ded6bd24] c02843af  xfs_map_blocks+4f (d3c2204c,7f1000,0,1000)
[ded6bd58] c0285580  xfs_page_state_convert+510 (d3c2204c,c111d3e0,ded6bf44,1)
[ded6be24] c0285d2f  linvfs_writepage+6f (c111d3e0,ded6bf44,ded6be94,0)
[ded6be58] c018e94e  mpage_writepages+24e (d3c220f8,ded6bf44,0,ded6bf80)
[ded6bef4] c014cc92  do_writepages+42 
(d3c220f8,ded6bf44,0,0,0,fe6,0,0,0,0,0,0,ded6bf88,ffffffff,0,0,0,fe6,0,0,0,0,0,0,ded6bf88,28852)
[ded6bf08] c01459ef  __filemap_fdatawrite_range+9f ()

    ADDR S   PID  SESS   UID  EUID       MM NAME             FLAGS
dd3e3530 U 13387     0   524   524 cf073800 dd               fstrans randomize
 cf511950  c03ee853  schedule+6f3 ()
[cf5119c4] c03edf65  __down+75 (decefa2c,decefa2c,cf511a1c)
[cf5119fc] c03ee0f2  __down_failed+a ()
[cf511a0c] c02884de  [.text.lock.xfs_buf+1f]
[cf511a0c] c0287034  pagebuf_lock+34 (d321557c,c16e2800,cf510000,0)
[cf511a20] c0286811  _pagebuf_find+161 (df6a0280,6c62839,0,200)
[cf511a58] c02868ff  xfs_buf_get_flags+6f (df6a0280,6c62839,0,1)
[cf511a8c] c0286a22  xfs_buf_read_flags+32 (df6a0280,6c62839,0,1)
[cf511ab0] c0277e31  xfs_trans_read_buf+211 (dedde400,ce19dad0,df6a0280,6c62839)
[cf511ae0] c0223e03  xfs_alloc_read_agf+a3 (dedde400,ce19dad0,f,0)
[cf511b24] c0223a39  xfs_alloc_fix_freelist+449 (cf511bf0,0,a,e730a)
[cf511bcc] c0224549  xfs_free_extent+99 (ce19dad0,fd3ac4,0,60)
[cf511c50] c0237225  xfs_bmap_finish+185 (cf511d84,cf511cf0,ffffffff,ffffffff)
[cf511c8c] c025ffdf  xfs_itruncate_finish+29f (cf511d84,d0d64bb0,0,0)
[cf511d10] c027d53b  xfs_setattr+f5b (d0d64bd0,cf511dbc,0,0)
[cf511da0] c028be8d  linvfs_setattr+fd (ce66c5c8,cf511e7c,dedde418,cf511e68)
[cf511e3c] c0184a1c  notify_change+3cc (ce66c5c8,cf511e7c,48,0)
[cf511e70] c0164e62  do_truncate+42 (ce66c5c8,0,0,ce66c5c8)
[cf511ec4] c0177b0f  may_open+24f (cf511f44,2,8242,c0167e6a)
[cf511ee8] c01780a6  open_namei+526 (d26a2000,8242,1b6,cf511f44)
[cf511f30] c0165f7a  filp_open+3a (d26a2000,8241,1b6,d25bc880)
[cf511f8c] c0166389  sys_open+59 (bff419a6,8241,1b6,8241)

# xfs_info /opt
meta-data=/opt                   isize=256    agcount=16, agsize=947081 blks
         =                       sectsz=512  
data     =                       bsize=4096   blocks=15153296, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096  
log      =internal               bsize=4096   blocks=7399, version=1
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0
# df /opt
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda6             60583588  60583584         4 100% /opt


After some investigation, I've found that in this case:

  xfssyncd: allocating extents; holds AG#15's AGF, waits for AG#10's AGF,
            because XFS could not allocate all of the delayed blocks
            in a single AG.

        dd: freeing extents;    holds AG#10's AGF, waits for AG#15's AGF,
            because the file is built from multiple AGs and XFS defines
            XFS_ITRUNC_MAX_EXTENTS as 2.

Both processes are inside a transaction (PF_FSTRANS) and operating on
two AGs.  It looks like an AB-BA deadlock.
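The AB-BA pattern above can be sketched with two shell processes and
mkdir-based lock files.  This is only an illustration: lockA/lockB and the
take_both helper are hypothetical stand-ins for the AG#10 and AG#15 AGF
buffers.  Because both callers below take the locks in the same order, no
cycle can form; the two traced processes take them in opposite orders,
which is what allows the deadlock.

```shell
#!/bin/sh
# AB-BA sketch: lockA/lockB are hypothetical stand-ins for the AG#10
# and AG#15 AGF buffers.  mkdir is atomic, so it serves as a try-lock.
lock()   { until mkdir "$1" 2>/dev/null; do sleep 1; done; }
unlock() { rmdir "$1"; }

take_both() {
    # Both callers take lockA before lockB, so no AB-BA cycle can form.
    # The deadlocked processes instead acquired the AGFs in opposite orders.
    lock lockA
    lock lockB
    echo "$1: holds both AGF locks"
    unlock lockB
    unlock lockA
}

take_both xfssyncd &
take_both dd &
wait
```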

So, I have a question: is taking multiple AGs in a single transaction safe?

IMHO, taking multiple AGs in a single transaction is prone to deadlock,
because XFS must keep each xfs_buf busy (semaphore held) until the
transaction is committed to the in-core log.

--
masano

