xfs
[Top] [All Lists]

Re: writeout stalls in current -git

To: "David Chinner" <dgc@xxxxxxx>
Subject: Re: writeout stalls in current -git
From: "Torsten Kaiser" <just.for.lkml@xxxxxxxxxxxxxx>
Date: Sun, 4 Nov 2007 12:19:19 +0100
Cc: "Peter Zijlstra" <peterz@xxxxxxxxxxxxx>, "Fengguang Wu" <wfg@xxxxxxxxxxxxxxxx>, "Maxim Levitsky" <maximlevitsky@xxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx, "Andrew Morton" <akpm@xxxxxxxxxxxxxxxxxxxx>, linux-fsdevel@xxxxxxxxxxxxxxx, xfs@xxxxxxxxxxx
Dkim-signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:references; bh=NZykAKLjtcuHzPXcDO65poQqxlxE3dMCg+zJZrvejS0=; b=rHYLZ8SaYwJqQoncetNIx4YSijxxofphfL00YRNTU6i8LdS+9uyBgii42Z3bKKVnU+HjrWVmlJNlh/y9QlQaiGMWg7DPQKBZUoPDvGN1i4z9m3uZnMkl9VyJ+4xg0Lyw6Ytv6geFqh101j+WUNj4by2zyop9UDyBYJIfeVP/SGM=
Domainkey-signature: a=rsa-sha1; c=nofws; d=googlemail.com; s=beta; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:references; b=BpWxF4kZt/GKotSn5NDhf1si6zuKIbTtlVRxenmCytnfcbkc6qTnpQsbJ441z90lqX/O5kgv671cKUG5JKECJOPEA0BEmGz/eOPnJKGtGkSvb4CGwjrkNlQ0fPSFeDPRJ+I72+0JaTCS1PtW6lvl3GjsJFBpF2P56ExPJLD6m00=
In-reply-to: <20071102204258.GR995458@xxxxxxx>
References: <200710221505.35397.maximlevitsky@xxxxxxxxx> <393060478.03650@xxxxxxxxxxx> <64bb37e0710310822r5ca6b793p8fd97db2f72a8655@xxxxxxxxxxxxxx> <393903856.06449@xxxxxxxxxxx> <64bb37e0711011120i63cdfe3ci18995d57b6649a8@xxxxxxxxxxxxxx> <64bb37e0711011200n228e708eg255640388f83da22@xxxxxxxxxxxxxx> <E1InmAI-0003ME-2i@localhost> <1193998532.27652.343.camel@twins> <64bb37e0711021222q7d12c825mc62d433c4fe19e8@xxxxxxxxxxxxxx> <20071102204258.GR995458@xxxxxxx>
Sender: xfs-bounce@xxxxxxxxxxx
On 11/2/07, David Chinner <dgc@xxxxxxx> wrote:
> That's stalled waiting on the inode cluster buffer lock. That implies
> that the inode lcuser is already being written out and the inode has
> been redirtied during writeout.
>
> Does the kernel you are testing have the "flush inodes in ascending
> inode number order" patches applied? If so, can you remove that
> patch and see if the problem goes away?

I can now confirm, that I see this also with the current mainline-git-version
I used 2.6.24-rc1-git-b4f555081fdd27d13e6ff39d455d5aefae9d2c0c
plus the fix for the sg changes in ieee1394.
Bisecting would be troublesome, as the sg changes prevent mainline to
boot with my normal config / kill my network.

treogen ~ # vmstat 10
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
-> starting emerge
 1  0      0 3627072    332 157724    0    0    97    13   41  189  2  2 94  2
 0  0      0 3607240    332 163736    0    0   599    10  332  951  2  1 93  4
 0  0      0 3601920    332 167592    0    0   380     2  218  870  1  1 98  0
 0  0      0 3596356    332 171648    0    0   404    21  182  818  0  0 99  0
 0  0      0 3579328    332 180436    0    0   878    12  147  912  1  1 97  2
 0  0      0 3575376    332 182776    0    0   236     4  244  953  1  1 95  3
 2  1      0 3571792    332 185084    0    0   232     7  256 1003  2  1 95  2
 0  0      0 3564844    332 187364    0    0   228   605  246 1167  2  1 93  4
 0  0      0 3562128    332 189784    0    0   230     4  527 1238  2  1 93  4
 0  1      0 3558764    332 191964    0    0   216    24  438 1059  1  1 93  6
 0  0      0 3555120    332 193868    0    0   199    36  406  959  0  0 92  8
 0  0      0 3552008    332 195928    0    0   197    11  458 1023  1  1 90  8
 0  0      0 3548728    332 197660    0    0   183     7  496 1086  1  1 90  8
 0  0      0 3545560    332 199372    0    0   170     8  483 1017  1  1 90  9
 0  1      0 3542124    332 201256    0    0   190     1  544 1137  1  1 88 10
 1  0      0 3536924    332 203296    0    0   195     7  637 1209  2  1 89  8
 1  1      0 3485096    332 249184    0    0   101    16 10372 4537 13  3 76  8
 2  0      0 3442004    332 279728    0    0  1086    40  219 1349  7  3 87  4
-> emerge is done reading its package database
 1  0      0 3254796    332 448636    0    0     0    27  128 8360 24  6 70  0
 2  0      0 3143304    332 554016    0    0    47    33  213 4480 16 11 72  1
-> kernel unpacked
 1  0      0 3125700    332 560416    0    0     1    20  122 1675 24  1 75  0
 1  0      0 3117356    332 567968    0    0     0   674  157 2975 24  2 73  1
 2  0      0 3111636    332 573736    0    0     0  1143  151 1924 23  1 75  1
 2  0      0 3102836    332 581332    0    0     0   890  153 1330 24  1 75  0
 1  0      0 3097236    332 587360    0    0     0   656  194 1593 24  1 74  0
 1  0      0 3086824    332 595480    0    0     0   812  235 2657 25  1 74  0
-> tar.bz2 created, installing starts now
 0  0      0 3091612    332 601024    0    0    82   708  499 2397 17  4 78  1
 0  0      0 3086088    332 602180    0    0    69  2459  769 2237  3  4 88  6
 0  0      0 3085916    332 602236    0    0     2  1752  693  949  1  2 96  1
 0  0      0 3084544    332 603564    0    0    66  4057 1176 2850  3  6 91  0
 0  0      0 3078780    332 605572    0    0    98  3194 1169 3288  5  6 89  0
 0  0      0 3077940    332 605924    0    0    17  1139  823 1547  1  2 97  0
 0  0      0 3078268    332 605924    0    0     0   888  807 1329  0  1 99  0
-> first short stall
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0      0 3077040    332 605924    0    0     0  1950  785 1495  0  2 89  8
 0  0      0 3076588    332 605896    0    0     2  3807  925 2046  1  4 95  0
 0  0      0 3076900    332 606052    0    0    11  2564  768 1471  1  3 95  1
 0  0      0 3071584    332 607928    0    0    87  2499 1108 3433  4  6 90  0
-> second longer stall
(emerge was not able to complete a single filemove until the 'resume' line)
 0  0      0 3071592    332 607928    0    0     0   693  692 1289  0  0 99  0
 0  0      0 3072584    332 607928    0    0     0   792  731 1507  0  1 99  0
 0  0      0 3072840    332 607928    0    0     0   806  707 1521  0  1 99  0
 0  0      0 3072724    332 607928    0    0     0   782  695 1372  0  0 99  0
 0  0      0 3072972    332 607928    0    0     0   677  612 1301  0  0 99  0
 0  0      0 3072772    332 607928    0    0     0   738  681 1352  1  1 99  0
 0  0      0 3073020    332 607928    0    0     0   785  708 1328  0  1 99  0
 0  0      0 3072896    332 607928    0    0     0   833  722 1383  0  0 99  0
-> emerge resumed
 0  0      0 3069476    332 607972    0    0     2  4885  812 2062  1  4 90  5
 1  0      0 3069648    332 608068    0    0     4  4658  833 2158  1  4 93  2
 0  0      0 3064972    332 610364    0    0   106  2494 1095 3620  5  7 88  0
 0  0      0 3057536    332 612444    0    0    86  2023 1012 3440  4  6 90  0
 1  0      0 3054572    332 612368    0    0   102  1526 1024 2277  6  5 87  2
-> emerge finished, but still >100Mb of dirty data accoring to /proc/meminfo
 0  0      0 3048548    332 615764    0    0   337   659  796 1000  3  1 96  0
 0  0      0 3092100    332 615860    0    0    15   616  606 1040  1  0 99  0
 0  0      0 3092148    332 615860    0    0     0   641  622 1085  0  0 99  0
 0  0      0 3092528    332 615860    0    0     0   766  654 1055  1  1 99  0
-> slow writeout until here, might be fixed with Peters patch to scale
the background threshold
 2  0      0 3090828    332 615860    0    0     0  1804  707 1215  0  2 98  0
 0  0      0 3091056    332 615864    0    0     0  3877  831 2047  1  4 94  1
 3  0      0 3090780    332 615864    0    0     0  2048  784 1154  1  2 97  1
 0  0      0 3091096    332 615864    0    0     0  2690  751 1538  0  3 96  1
 0  1      0 3091056    332 615864    0    0     0  2018  748  866  0  2 95  2
 2  0      0 3092960    332 615864    0    0     0  2076  719 1118  0  2 97  0
-> writeout "done", /proc/meminfo showed 0kb of dirty data remaining
 0  0      0 3093072    332 615864    0    0     0   645  646 1104  0  0 99  0
 0  0      0 3093532    332 615864    0    0     0   726  658 1223  0  1 99  0
 0  0      0 3093540    332 615864    0    0     0   801  699 1314  0  1 99  0
 0  0      0 3093580    332 615864    0    0     0   783  738 1350  0  1 99  0
 0  0      0 3093284    332 615920    0    0     6   746  655 1381  1  1 98  0
 0  0      0 3092872    332 615920    0    0     0   862  703 1391  1  1 98  0
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0      0 3093224    332 615920    0    0     0   799  676 1394  0  0 99  0
 0  0      0 3093304    332 615920    0    0     0   835  672 1514  1  1 98  0
 0  0      0 3093476    332 615920    0    0     0   784  641 1404  1  1 98  0
 0  0      0 3093264    332 615920    0    0     0   722  626 1483  1  1 99  0
 0  0      0 3093476    332 615920    0    0     0     7  328  350  0  0 99  0
 0  0      0 3093628    332 615920    0    0     0    11  332  407  0  0 99  0
-> disks finally go idle

Torsten

.config for 2.6.24-rc1+git attached

Attachment: config.gz
Description: GNU Zip compressed data

<Prev in Thread] Current Thread [Next in Thread>