xfs
[Top] [All Lists]

Re: long hangs when deleting large directories (3.0-rc3)

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: long hangs when deleting large directories (3.0-rc3)
From: Markus Trippelsdorf <markus@xxxxxxxxxxxxxxx>
Date: Mon, 20 Jun 2011 13:13:59 +0200
Cc: xfs@xxxxxxxxxxx
Dkim-signature: v=1; a=rsa-sha256; c=simple; d=mail.ud10.udmedia.de; h= date:from:to:cc:subject:message-id:references:mime-version: content-type:content-transfer-encoding:in-reply-to; q=dns/txt; s= beta; bh=Ofjz8MeTIfKgMWSKK7tvQ9BF/w5kt6kYR7lQb3ZbrqI=; b=K8At5j+ SjhDG3DboinK58ZkTTDUprDXjhJ+XeZsJB80kC+tfo5crHYtMzUfncvsPegH2vq5 +PBrFD3HKvzvgqr1NWyxmFzeU3nftTNSf0paaY/sO8xNwkNhhhlebU4R5ewiZHSZ 9h84YWXw441nBwjFE2EzB3j1kGrwv9NCJAZo=
In-reply-to: <20110620060351.GC1730@xxxxxxxxxxxxxx>
References: <20110618141950.GA1685@xxxxxxxxxxxxxx> <20110619222447.GI561@dastard> <20110620005415.GA1730@xxxxxxxxxxxxxx> <20110620013449.GO561@dastard> <20110620020236.GB1730@xxxxxxxxxxxxxx> <20110620023625.GP561@dastard> <20110620060351.GC1730@xxxxxxxxxxxxxx>
On 2011.06.20 at 08:03 +0200, Markus Trippelsdorf wrote:
> On 2011.06.20 at 12:36 +1000, Dave Chinner wrote:
> > On Mon, Jun 20, 2011 at 04:02:36AM +0200, Markus Trippelsdorf wrote:
> > > On 2011.06.20 at 11:34 +1000, Dave Chinner wrote:
> > > > On Mon, Jun 20, 2011 at 02:54:15AM +0200, Markus Trippelsdorf wrote:
> > > > > On 2011.06.20 at 08:24 +1000, Dave Chinner wrote:
> > > > > > On Sat, Jun 18, 2011 at 04:19:50PM +0200, Markus Trippelsdorf wrote:
> > > > > > > Running the latest git kernel (3.0-rc3) my machine hangs for long
> > > > > > > periods (1-2 sec) whenever I delete a large directory recursively 
> > > > > > > on my
> > > > > > > xfs partition. During the hang I cannot move the mouse pointer or 
> > > > > > > use
> > > > > > > the keyboard (but the music keeps playing without stuttering). A 
> > > > > > > quick
> > > > > > > way to reproduce is to "rm -fr" a kernel tree. 
> > > > > > 
> > > > > > So what is the system doing when it "hangs"? Is it CPU bound (e.g.
> > > > > > cpu scheduler issue)? Is the system running out of memory and
> > > > > > stalling everything in memory reclaim? What IO is occurring?
> > > > > 
> > > > > It's totally idle otherwise; just a desktop with a single xterm. The
> > > > > machine has four cores (and also runs with "CONFIG_PREEMPT=y"), so I
> > > > > don't think it is CPU bound at all. It has 8GB of memory (and the
> > > > > "hangs" even occur after reboot when most of it is free). No other IO
> > > > > activity is occurring.
> > > > 
> > > > Sure, the system might be otherwise idle, but what I was asking is
> > > > what load does the "rm -rf" cause. What IO does it cause? is it cpu
> > > > bound? etc.
> > > 
> > > I have not measured this, so I cannot tell.
> > 
> > And so you are speculating as to the cause of the problem. What I'm
> > trying to do is work from the bottom up to ensure that the layers
> > below the fs are not the cause of the problem.
> > 
> > > > > > Is your partition correctly sector aligned for however your drive
> > > > > > maps it's 4k sectors?
> > > > > 
> > > > > Yes, it's a GPT partition that is aligned to 1MB.
> > > > 
> > > > Ok, that is fine, but the big question now is how does the drive
> > > > align sector 0? Is that 4k aligned, or is it one of those drives
> > > > that aligns an odd 512 byte logical sector to the physical 4k sector
> > > > boundary (i.e. sector 63 is 4k aligned to work with msdos
> > > > partitions). FYI, some drives have jumpers on them to change this
> > > > odd/even sector alignment configuration.....
> > > 
> > > No, it's none of those (it's a Seagate Barracuda Green ST1500). Sector 0
> > > is 4k aligned for sure. The odd 512 byte offset was present only on some
> > > first generation drives. 
> > > But I think the whole alignment issue is a red herring, because I cannot
> > > reproduce the "hangs" on the next partition on the same drive. This
> > > partition is larger and contains my music and film collection (so mostly
> > > static content and no traffic).
> > 
> > Which also means you might have one unaligned and one aligned
> > partition.  i.e. the test results you have presented does not
> > necessarily point at a filesystem problem. We always ask for exact
> > details of your storage subsystem for these reasons - so we can
> > understand if there's something that you missed or didn't think was
> > important enough to tell us. You may have already checked those
> > things, but we don't know that if you don't tell us....
> 
> Understood.
> 
> > So, is the sector alignment of the second partition the same as the
> > first partition?
> 
> Yes.
> 
> > > And as I wrote in my other reply to this
> > > thread: »it appears that the observed "hangs" are the result of a
> > > strongly aged file-system.«
> > 
> > There is no evidence that points to any cause. Hell, I don't even
> > know what you consider a "strongly aged filesystem" looks like....
> > 
> > If the alignment is the cause of the problem, you should be able to
> > see a difference in performance when doing random 4k synchronous
> > writes to a large file on differently aligned partitions. Can you
> > run the same random 4k sync write test on both partitions (make sure
> > barriers are enabled) and determine if they perform the same?
> > 
> > If the filesystem layout is the cause of the problem, you should be
> > able to take a metadump of the problematic filesystem, restore it to
> > a normal 512 sector drive and reproduce the "rm -rf" problem. Can
> > you try this as well?
> 
> OK. I was able to reproduce the same hang on a conventional 512 sector drive.
> The partition that I've used was the predecessor to the one on the 4k drive. 
> So
> it saw roughly the same usage pattern.
> This is the output of "dstat -cdyl -C 0,1,2,3 -D sdc --disk-tps" during the
> hang:
> 
> -------cpu0-usage--------------cpu1-usage--------------cpu2-usage--------------cpu3-usage------
>  --dsk/sdc-- ---system-- ---load-avg--- --dsk/sdc--
> usr sys idl wai hiq siq:usr sys idl wai hiq siq:usr sys idl wai hiq siq:usr 
> sys idl wai hiq siq| read  writ| int   csw | 1m   5m  15m |reads writs
> 
>   0   0 100   0   0   0:  1   0  99   0   0   0:  0   1  99   0   0   0:  1   
> 0  99   0   0   0|   0     0 | 249   354 |0.33 0.58 0.38|   0     0 
>   0   0 100   0   0   0:  0   0 100   0   0   0:  0   0 100   0   0   0:  0   
> 0 100   0   0   0|   0     0 | 244   228 |0.33 0.58 0.38|   0     0 
>   1   2  97   0   0   0:  0   1  99   0   0   0:  0   1  99   0   0   0:  0   
> 1  99   0   0   0|   0     0 | 559   614 |0.33 0.58 0.38|   0     0 
>   0   0 100   0   0   0:  1   0  99   0   0   0:  1   0  99   0   0   0:  1   
> 0  99   0   0   0|   0     0 | 341   426 |0.33 0.58 0.38|   0     0 
>   1   0  99   0   0   0:  1   4  95   0   0   0:  0   1  99   0   0   0:  1  
> 16  83   0   0   0|   0     0 | 874   796 |0.33 0.58 0.38|   0     0 
>   2  50  49   0   0   0:  1   9  90   0   0   0:  1   9  90   0   0   0:  1  
> 23  76   0   0   0|   0  6400k|2803  2073 |0.46 0.60 0.39|   0    25 
>   1  29  70   0   0   0:  1   1  98   0   0   0:  1   9  90   0   0   0:  1  
> 53  46   0   0   0|   0  6400k|2047  1414 |0.46 0.60 0.39|   0    25 
>   0   4  96   0   0   0:  0   0 100   0   0   0:  1  19  80   0   0   0:  0  
> 80  20   0   0   0|   0  2048k|1425   685 |0.46 0.60 0.39|   0     8 
>   2   1  97   0   0   0:  1   6  93   0   0   0:  0   5  95   0   0   0:  0  
> 83  17   0   0   0|   0  4608k|1624   849 |0.46 0.60 0.39|   0    18 
>   2  45  53   0   0   0:  1  16  83   0   0   0:  3  20  77   0   0   0:  1  
> 15  84   0   0   0|   0  6400k|2420  1984 |0.46 0.60 0.39|   0    26 
>   1  19  80   0   0   0:  2   8  90   0   0   0:  0  33  67   0   0   0:  0  
> 33  67   0   0   0|   0  6400k|2694  2134 |0.59 0.63 0.40|   0    25 
>   2   7  91   0   0   0:  2   1  97   0   0   0:  1   0  99   0   0   0:  0  
> 49  10  41   0   0|   0  8269k|1865  1571 |0.59 0.63 0.40|   0   363 
>   1   1  98   0   0   0:  1   1  98   0   0   0:  1   1  98   0   0   0:  0   
> 1   0  99   0   0|   0  4778k|1509  1639 |0.59 0.63 0.40|   0   410 
>   2   0  98   0   0   0:  2   1  97   0   0   0:  1   1  98   0   0   0:  2   
> 0   0  98   0   0|   0  5318k|1663  1809 |0.59 0.63 0.40|   0   426 
>   1   1  98   0   0   0:  2   7  91   0   0   0:  1   0  99   0   0   0:  1   
> 0   0  99   0   0|   0  5446k|1659  1806 |0.59 0.63 0.40|   0   432 
>   0   1  99   0   0   0:  1   0  99   0   0   0:  2   0  98   0   0   0:  0   
> 1  17  82   0   0|   0  5472k|1572  1837 |0.62 0.63 0.40|   0   439 
>   2   0  98   0   0   0:  2   2  96   0   0   0:  0   1  99   0   0   0:  0   
> 1  99   0   0   0|   0   397k|1058  1049 |0.62 0.63 0.40|   0    36 
>   1   1  98   0   0   0:  1   1  98   0   0   0:  1   1  98   0   0   0:  0   
> 0 100   0   0   0|   0     0 | 617   689 |0.62 0.63 0.40|   0     0 
>   9   4  87   0   0   0:  4   0  96   0   0   0:  1   1  98   0   0   0:  8   
> 6  87   0   0   0|   0     0 |1234  1961 |0.62 0.63 0.40|   0     0 
>   0   1  99   0   0   0:  1   1  98   0   0   0:  0   1  99   0   0   0:  0   
> 1  99   0   0   0|   0     0 | 391   403 |0.62 0.63 0.40|   0     0 
>   1   0  99   0   0   0:  1   1  98   0   0   0:  0   0 100   0   0   0:  0   
> 0 100   0   0   0|   0     0 | 366   375 |0.57 0.62 0.40|   0     0 
> 

Here are two more examples. The time when the hang occurs is marked with
"=>":
------cpu0-usage--------------cpu1-usage--------------cpu2-usage--------------cpu3-usage------
 --dsk/sdb-- ---system-- ---load-avg--- --dsk/sdb--
usr sys idl wai hiq siq:usr sys idl wai hiq siq:usr sys idl wai hiq siq:usr sys 
idl wai hiq siq| read  writ| int   csw | 1m   5m  15m |reads writs
  0   0 100   0   0   0:  1   1  98   0   0   0:  1   0  99   0   0   0:  0   1 
 99   0   0   0|   0     0 | 411   498 |1.14 0.58 0.27|   0     0 
  1   1  98   0   0   0:  0   1  99   0   0   0:  0   1  99   0   0   0:  0   0 
100   0   0   0|   0     0 | 580   649 |1.05 0.57 0.27|   0     0 
  0   1  99   0   0   0:  0   0 100   0   0   0:  0   0 100   0   0   0:  1   0 
 99   0   0   0|   0     0 | 297   353 |1.05 0.57 0.27|   0     0 
  1   0  99   0   0   0:  0   1  99   0   0   0:  0   1  99   0   0   0:  0   1 
 99   0   0   0|   0     0 | 322   393 |1.05 0.57 0.27|   0     0 
  0   0 100   0   0   0:  1   0  99   0   0   0:  1   0  99   0   0   0:  0   1 
 99   0   0   0|   0     0 | 301   382 |1.05 0.57 0.27|   0     0 
  1  16  83   0   0   0:  1  21  78   0   0   0:  0  13  87   0   0   0:  1  13 
 87   0   0   0|   0  2016k|2169  1495 |1.05 0.57 0.27|   0    57 
  1  10  89   0   0   0:  2  38  61   0   0   0:  1   8  81  10   0   0:  0  64 
 36   0   0   0|   0  4476k|2378  2010 |1.12 0.59 0.28|   0   146 
=>0   0 100   0   0   0:  0   1  99   0   0   0:  0   1   0  99   0   0:  0 100 
  0   0   0   0|   0     0 |1237   283 |1.12 0.59 0.28|   0     0 
  0  27  73   0   0   0:  1   0  99   0   0   0:  0   0  47  53   0   0:  0  56 
 44   0   0   0|   0  5812k|1596   857 |1.12 0.59 0.28|   0   182 
  1   5  35  59   0   0:  2   0  85  13   0   0:  1   0  99   0   0   0:  0   7 
 28  65   0   0|   0  6263k|1789  1835 |1.12 0.59 0.28|   0   555 
  1   0  99   0   0   0:  0   1  99   0   0   0:  1   0  99   0   0   0:  0   4 
  6  90   0   0|   0  2664k|1074  1155 |1.12 0.59 0.28|   0   286 
  4   1  95   0   0   0:  4   2  94   0   0   0:  2   1  97   0   0   0:  6   4 
 91   0   0   0|   0  1189k|1628  2106 |1.11 0.60 0.28|   0   137 
  1   0  99   0   0   0:  0   1  99   0   0   0:  0   1  99   0   0   0:  0   1 
 99   0   0   0|   0  1024B| 491   438 |1.11 0.60 0.28|   0     1 
  1   1  98   0   0   0:  2   0  98   0   0   0:  1   0  99   0   0   0:  0   0 
100   0   0   0|   0     0 | 583   524 |1.11 0.60 0.28|   0     0 
  5   1  94   0   0   0:  1   1  98   0   0   0:  0   1  99   0   0   0:  7   7 
 87   0   0   0|   0     0 |1148  1823 |1.11 0.60 0.28|   0     0

  5   3  92   0   0   0:  4   1  95   0   0   0:  2   0  98   0   0   0:  5   2 
 93   0   0   0|   0     0 |1374  1678 |0.70 0.65 0.40|   0     0 
  2   2  96   0   0   0:  1   1  98   0   0   0:  0   0 100   0   0   0:  1   0 
 99   0   0   0|   0     0 | 913   635 |0.70 0.65 0.40|   0     0 
  0   3  97   0   0   0:  1   0  99   0   0   0:  1   1  98   0   0   0:  1   5 
 94   0   0   0|   0   512B| 898   697 |0.70 0.65 0.40|   0     1 
  7  41  52   0   0   0:  2  11  87   0   0   0:  5   6  89   0   0   0:  3   1 
 96   0   0   0|   0  5856k|2771  2751 |0.70 0.65 0.40|   0   177 
  7  26  67   0   0   0:  4   1  95   0   0   0:  1  37  62   0   0   0:  3  38 
 59   0   0   0|   0   864k|2904  2806 |0.70 0.65 0.40|   0    33 
=>2   1  97   0   0   0:  1   0  99   0   0   0:  0   1  99   0   0   0:  0 100 
  0   0   0   0|   0     0 |1725   639 |0.89 0.69 0.41|   0     0 
  1   1  98   0   0   0:  2   1  97   0   0   0:  2   1  97   0   0   0:  0  71 
 29   0   0   0|   0  3968k|1793   852 |0.89 0.69 0.41|   0   123 
  4   1  95   0   0   0:  3  48  49   0   0   0:  4  18  78   0   0   0:  5   6 
 89   0   0   0|   0  5120k|2920  3050 |0.89 0.69 0.41|   0   159 
  3  54  43   0   0   0:  1   8  91   0   0   0:  1   4  95   0   0   0:  1   5 
 94   0   0   0|4096B 4160k|2705  1936 |0.89 0.69 0.41|   1   132 
  1  40  59   0   0   0:  1   3  96   0   0   0:  1  13  86   0   0   0:  1   1 
 98   0   0   0|   0  5984k|2312  1568 |0.89 0.69 0.41|   0   187 
  1  23  76   0   0   0:  0   5  95   0   0   0:  1  16  83   0   0   0:  1  13 
 86   0   0   0|   0  6976k|2310  1751 |0.90 0.70 0.41|   0   213 
  0   1  31  68   0   0:  0  20  80   0   0   0:  0  19  36  45   0   0:  1  22 
 76   0   0   1|   0  8645k|2364  1773 |0.90 0.70 0.41|   0   578 
  1   0  98   1   0   0:  1   3  96   0   0   0:  1   1  28  70   0   0:  1   3 
 96   0   0   0|   0  4506k|1520  1358 |0.90 0.70 0.41|   0   374 
  2   2  96   0   0   0:  2   1  97   0   0   0:  0   0 100   0   0   0:  0   6 
 94   0   0   0|   0   504k| 877   726 |0.90 0.70 0.41|   0    92 
  1   1  98   0   0   0:  1   1  98   0   0   0:  1   0  99   0   0   0:  2   0 
 98   0   0   0|   0    78k| 907   753 |0.90 0.70 0.41|   0    17 
  1   3  96   0   0   0:  2   1  97   0   0   0:  0   1  99   0   0   0:  1   1 
 98   0   0   0|   0     0 | 894   710 |0.83 0.69 0.41|   0     0 
  2   0  98   0   0   0:  2   0  98   0   0   0:  1   0  99   0   0   0:  1   0 
 99   0   0   0|   0     0 | 901   682 |0.83 0.69 0.41|   0     0 

-- 
Markus

<Prev in Thread] Current Thread [Next in Thread>