
Re: [PATCH] repair: avoid ABBA deadlocks on prefetched buffers

To: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Subject: Re: [PATCH] repair: avoid ABBA deadlocks on prefetched buffers
From: Arkadiusz Miśkiewicz <arekm@xxxxxxxx>
Date: Fri, 18 Nov 2011 09:44:09 +0100
Cc: xfs@xxxxxxxxxxx
In-reply-to: <20111115210953.GA6670@xxxxxxxxxxxxx>
References: <20111115210953.GA6670@xxxxxxxxxxxxx>
User-agent: KMail/1.13.7 (Linux/3.2.0-rc1-00306-g7f80850-dirty; KDE/4.7.3; x86_64; ; )
On Tuesday, 15 November 2011, Christoph Hellwig wrote:
> Both the prefetch threads and actual repair processing threads can have
> multiple buffers locked at a time, but they do not use a common lock
> order, which can lead to ABBA deadlocks while trying to lock the buffers.
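
For readers unfamiliar with the term in the quoted description: an ABBA deadlock happens when one thread locks A then B while another locks B then A. The generic remedy is a common lock order. The sketch below illustrates only that general technique; it is not the patch under discussion (which is not shown here), and it is written in Python for brevity although xfs_repair is C with pthreads. `Buf`, `lock_pair`, `unlock_pair` and `run_demo` are hypothetical names invented for the illustration.

```python
import threading


class Buf:
    """Hypothetical stand-in for a cached buffer: a block number plus a lock."""

    def __init__(self, blkno):
        self.blkno = blkno
        self.lock = threading.Lock()


def lock_pair(x, y):
    """Lock two distinct buffers in ascending blkno order, whatever order the caller names them in."""
    first, second = sorted((x, y), key=lambda b: b.blkno)
    first.lock.acquire()
    second.lock.acquire()
    return first, second


def unlock_pair(first, second):
    """Release in the reverse of the acquisition order."""
    second.lock.release()
    first.lock.release()


def run_demo(rounds=1000):
    """Two threads ask for the same buffers in opposite orders; returns (work done, any thread stuck)."""
    a, b = Buf(10), Buf(20)
    done = []

    def worker(x, y):
        for _ in range(rounds):
            held = lock_pair(x, y)
            try:
                done.append(1)  # critical section, protected by both locks
            finally:
                unlock_pair(*held)

    threads = [
        threading.Thread(target=worker, args=(a, b), daemon=True),
        threading.Thread(target=worker, args=(b, a), daemon=True),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=30)
    return len(done), any(t.is_alive() for t in threads)
```

Because both workers end up taking the lock for block 10 before the lock for block 20, neither can hold one lock while waiting for the other in the opposite order.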

There is still a deadlock somewhere.

The last messages printed (translated from the Polish output):
invalid magic number 0x41425443 in block inobt 2/1438099
invalid magic number 0x41425443 in block inobt 2/1438196
invalid magic number 0x41425443 in block inobt 2/1438732
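
As a side note on the magic value 0x41425443 reported above: XFS magic numbers are four ASCII characters stored big-endian, so the value can be decoded by hand. A minimal sketch (Python, with `magic_to_ascii` as a hypothetical helper name):

```python
import struct


def magic_to_ascii(magic):
    """Render a 32-bit magic number as its four big-endian ASCII characters."""
    return struct.pack(">I", magic).decode("ascii")


print(magic_to_ascii(0x41425443))  # prints "ABTC"
```

As far as I know, "ABTC" is the magic XFS uses for free-space-by-count btree blocks (XFS_ABTC_MAGIC), while inode btree blocks carry "IABT". If that is right, the messages suggest blocks of the wrong btree type were reached while walking the inode btree, but that reading is an assumption rather than something the output proves.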


# gdb ./xfs_repair_tcmalloc `pidof xfs_repair_tcmalloc`
GNU gdb (GDB) 7.3.1-1 (PLD Linux)
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pld-linux".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /root/xfs_repair_tcmalloc...done.
Attaching to program: /root/xfs_repair_tcmalloc, process 21440
Reading symbols from /lib64/libuuid.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libuuid.so.1
Reading symbols from /lib64/librt.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/librt.so.1
Reading symbols from /lib64/libtcmalloc_minimal.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib64/libtcmalloc_minimal.so.0
Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
[New Thread 0x7fdf93a73700 (LWP 21462)]
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /usr/lib64/libstdc++.so.6...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libstdc++.so.6
Reading symbols from /lib64/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libgcc_s.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libgcc_s.so.1
0x00007fdf9c2c21bf in pthread_join () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00007fdf9c2c21bf in pthread_join () from /lib64/libpthread.so.0
#1  0x000000000042dd8f in destroy_work_queue (wq=0x7fff62659180) at threads.c:146
#2  0x000000000042d89f in scan_ags (mp=0x7fff62659300, scan_threads=<optimized out>) at scan.c:1353
#3  0x000000000041b68e in phase2 (mp=0x7fff62659300, scan_threads=32) at phase2.c:142
#4  0x0000000000402bd6 in main (argc=<optimized out>, argv=<optimized out>) at xfs_repair.c:747
(gdb) info threads
  Id   Target Id         Frame
  2    Thread 0x7fdf93a73700 (LWP 21462) "xfs_repair_tcma" 0x00007fdf9c2c78e4 in __lll_lock_wait () from /lib64/libpthread.so.0
* 1    Thread 0x7fdf9cbba760 (LWP 21440) "xfs_repair_tcma" 0x00007fdf9c2c21bf in pthread_join () from /lib64/libpthread.so.0
(gdb) thread 2
[Switching to thread 2 (Thread 0x7fdf93a73700 (LWP 21462))]
#0  0x00007fdf9c2c78e4 in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00007fdf9c2c78e4 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007fdf9c2c31b5 in _L_lock_883 () from /lib64/libpthread.so.0
#2  0x00007fdf9c2c300a in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00000000004334ba in libxfs_getbuf_flags (device=<optimized out>, blkno=<optimized out>, len=<optimized out>, flags=<optimized out>) at rdwr.c:423
#4  0x000000000043370e in libxfs_readbuf (dev=65024, blkno=4294967344, len=8, flags=0) at rdwr.c:530
#5  0x000000000042b44f in scan_sbtree (root=8, nlevels=25160588, agno=2, suspect=1, func=0x42c5d0 <scanfunc_ino>, isroot=<optimized out>, priv=0x7143f0) at scan.c:90
#6  0x000000000042ccdd in scanfunc_ino (block=<optimized out>, level=25160588, bno=<optimized out>, agno=2, suspect=1, isroot=1, priv=0x7143f0) at scan.c:1037
#7  0x000000000042b476 in scan_sbtree (root=8, nlevels=25160589, agno=2, suspect=0, func=0x42c5d0 <scanfunc_ino>, isroot=<optimized out>, priv=0x7143f0) at scan.c:96
#8  0x000000000042c3a8 in validate_agi (agcnts=0x7143f0, agno=2, agi=0x783a00) at scan.c:1151
#9  scan_ag (wq=<optimized out>, agno=2, arg=0x7143f0) at scan.c:1293
#10 0x000000000042da4a in worker_thread (arg=0x7fff62659180) at threads.c:46
#11 0x00007fdf9c2c0ed5 in start_thread () from /lib64/libpthread.so.0
#12 0x00007fdf9ba7de5d in clone () from /lib64/libc.so.6
#13 0x0000000000000000 in ?? ()
(gdb)
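
One detail in the session above may be worth noting: `info threads` lists only two threads, the main thread in pthread_join and the worker blocked in pthread_mutex_lock. If no third thread exists to release that mutex, one possible explanation (an assumption, not something the trace proves) is that the worker is trying to lock a mutex it already holds, for example because both scan_sbtree frames were called with root=8 and so may be reading the same block. The sketch below shows, in Python rather than C, how a non-recursive lock behaves when the same thread takes it twice, compared with a reentrant one; `second_acquire_succeeds` is a hypothetical helper.

```python
import threading


def second_acquire_succeeds(lock, timeout=0.1):
    """Take `lock`, then try to take it again from the same thread.

    Returns True if the second acquire succeeds (reentrant lock) and
    False if it times out (a plain lock would block forever without
    the timeout, which is the self-deadlock case).
    """
    lock.acquire()
    try:
        got = lock.acquire(timeout=timeout)
        if got:
            lock.release()  # undo the second, nested acquire
        return got
    finally:
        lock.release()  # undo the first acquire


print(second_acquire_succeeds(threading.Lock()))   # plain lock: False
print(second_acquire_succeeds(threading.RLock()))  # reentrant lock: True
```

With pthreads, the comparable diagnostic would be a mutex of type PTHREAD_MUTEX_ERRORCHECK, for which pthread_mutex_lock returns EDEADLK instead of blocking when the calling thread already owns the mutex.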


-- 
Arkadiusz Miśkiewicz        PLD/Linux Team
arekm / maven.pl            http://ftp.pld-linux.org/
