XFS dying when many processes copy many files/directories

To: linux-xfs@xxxxxxxxxxx
Subject: XFS dying when many processes copy many files/directories
From: Adrian Head <ahead@xxxxxxxxxxxxxx>
Date: Mon, 17 Dec 2001 10:21:29 +1000
Sender: owner-linux-xfs@xxxxxxxxxxx
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Over the last week I have again been building a couple of file servers for 
various purposes, and thought that I'd take another serious look at using XFS 
on the new file servers.

Base system is Red Hat 7.1 with a custom kernel, 2.4.16+xfs.

My standard process before putting servers into production is to run a few 
tests to make sure that I can trust the hardware/software combination.  One 
of my standard tests is to simulate many users simultaneously copying many 
files across the filesystem.  The volume is a software RAID5 over 4 IDE 
drives.

It is during this test that the machine hangs, when the copies are almost 95% 
complete.  I have tried running this test using XFS, ext2, ext3 and reiserfs, 
and only XFS fails to complete.  The hang has been reproducible every time I 
have run this test to date.

I'm at a loss as to what to do next to troubleshoot this problem or even what 
info people need.


Attached is some info:


- -- 
Adrian Head

(Public Key available on request.)

#==============

I do this test with the following simple script which just starts many cp 
processes in the background (Directories in this test are ~ 266M each):

#=========
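#!/bin/bash
# The C-style for (( )) loop below is a bash extension, so ask for bash.

# One foreground copy first, then 78 background copies (80 down to 3).
cp -fr 01 2

for (( i=80; i>2; i-- )) ; do
  cp -fr 01 "$i" &
#  echo $i
done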
#=========
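
A variant that waits for every background copy and counts the ones that exit
non-zero could look like the sketch below. It is not the script used for the
results in this report; run_copies and its three arguments (source directory,
first and last target number) are hypothetical names chosen for illustration.
It cannot help once the machine has hung, but it would separate a failing cp
from a hang.

```shell
#!/bin/bash
# Sketch: start background copies of a source directory, then wait for
# each one and count any cp that exits non-zero.
# run_copies is a hypothetical helper, not part of the original test.
run_copies() {
  local src=$1 first=$2 last=$3
  local pids=() i pid failed=0

  # Same pattern as the test script: one background cp per target name.
  for (( i = last; i >= first; i-- )); do
    cp -fr "$src" "$i" &
    pids+=("$!")
  done

  # wait returns the exit status of the given child process.
  for pid in "${pids[@]}"; do
    wait "$pid" || failed=$(( failed + 1 ))
  done

  echo "failed copies: $failed"
  return "$(( failed > 0 ))"
}
```

Calling it as "run_copies 01 3 80" would start the same 78 background copies
as the loop above.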

A du shows that the copies almost completed, but not quite (each directory 
should be ~266M):

du -sh *
266M    01
209M    10
212M    11
217M    12
217M    13
211M    14
209M    15
218M    16
217M    17
212M    18
219M    19
266M    2
210M    20
etc...
.....................................
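
To pick out the incomplete copies without reading the whole listing, something
like the following sketch could be used. list_short_copies is a hypothetical
helper name, and it assumes that a complete copy uses at least as many
kilobytes as the source according to du, which may not hold exactly on every
filesystem.

```shell
#!/bin/bash
# Sketch: list directories in the current directory whose disk usage
# (in kilobytes, per du -sk) is smaller than that of the source directory.
# list_short_copies is a hypothetical helper name.
list_short_copies() {
  local src=$1 want have d
  want=$(du -sk "$src" | cut -f1)
  for d in */; do
    d=${d%/}                      # strip the trailing slash from the glob
    [ -d "$d" ] || continue       # glob matched nothing
    [ "$d" = "$src" ] && continue # skip the source itself
    have=$(du -sk "$d" | cut -f1)
    if [ "$have" -lt "$want" ]; then
      echo "$d: ${have}K of ${want}K"
    fi
  done
}
```

Run as "list_short_copies 01" from the directory that holds the copies.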

This is the process tree (pstree) just before the machine hangs.

init-+-atd
     |-78*[cp]
     |-crond
     |-keventd
     |-klogd
     |-mdrecoveryd
     |-6*[mingetty]
     |-pagebuf_daemon
     |-portmap
     |-raid5d
     |-raid5syncd
     |-rpc.statd
     |-sshd-+-sshd---bash---pstree
     |      `-sshd---bash---top
     `-syslogd

This is the top screen after the server has hung. 

  5:44am  up 1 day,  5:47,  2 users,  load average: 82.11, 82.11, 81.84
111 processes: 87 sleeping, 24 running, 0 zombie, 0 stopped
CPU states:  0.1% user, 99.8% system,  0.0% nice,  0.0% idle
Mem:   384208K av,  382032K used,    2176K free,       0K shrd,     204K buff
Swap:  524624K av,   29156K used,  495468K free                  302292K cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM  CTIME COMMAND
    6 root      19   0     0    0     0 RW   12.7  0.0  14:43 kupdated
 5213 root      17   0   704  388   324 R    12.6  0.1   0:12 cp
 5205 root      19   0   600  304   236 R    12.4  0.0   0:12 cp
 5214 root      15   0   740  388   312 D    12.3  0.1   0:11 cp
 5178 root      17   0   560  284   136 R     9.1  0.0   0:11 cp
    4 root      17   0     0    0     0 RW    9.0  0.0   0:53 kswapd
 5173 root      16   0   540  276   140 R     9.0  0.0   0:15 cp
 5196 root      20   0   484  220    84 R     8.8  0.0   0:11 cp
 5200 root      16   0   536  264   180 R     8.7  0.0   0:10 cp
 5199 root      17   0   544  264   160 R     8.6  0.0   0:15 cp
 5189 root      16   0   600  376   232 R     8.5  0.0   0:16 cp
 5226 root      16   0   760  476   368 R     8.5  0.1   0:17 cp
 5182 root      18   0   524  248   136 R     8.4  0.0   0:11 cp
 5191 root      16   0   496  252   128 R     8.4  0.0   0:14 cp
 5206 root      16   0   616  332   220 R     8.3  0.0   0:17 cp
 5167 root      10   0   560  260   140 D     8.1  0.0   0:12 cp
 5225 root       9   0   744  432   348 D     8.1  0.1   0:11 cp
 5210 root      17   0   640  328   228 R     8.0  0.0   0:11 cp
 5219 root      16   0   692  404   312 R     8.0  0.1   0:16 cp
 5168 root      12   0   552  252   132 D     7.9  0.0   0:10 cp
 4984 root      15   0   640  580   396 R     7.8  0.1   0:31 top
 5176 root       9   0   536  296   168 D     7.0  0.0   0:13 cp
 5192 root      16   0   524  232    96 R     6.8  0.0   0:10 cp
 5175 root      20   0   564  264   128 R     6.6  0.0   0:10 cp
 5190 root      18   0   616  372   216 R     6.4  0.0   0:11 cp
 5174 root       9   0   520  296   180 D     6.0  0.0   0:12 cp
 5179 root      20   0   520  304   196 R     6.0  0.0   0:13 cp
 5165 root       9   0   524  236   136 D     3.7  0.0   0:11 cp
    1 root      11   0   212  156   144 S     2.8  0.0  25:44 init
 5229 root       9   0   744  452   364 D     2.7  0.1   0:10 cp
 5238 root      17   0   788  484   420 R     2.3  0.1   0:11 cp
  771 root       9   0   260  204   160 S     2.1  0.0   0:36 crond
 5231 root      18   0   804  520   400 R     1.9  0.1   0:17 cp
  571 root       9   0   252  200   176 S     1.3  0.0   0:00 syslogd
 5362 root      19  19  1220  452   452 R N   0.6  0.1   0:02 updatedb
 5186 root       9   0   596  352   248 D     0.2  0.0   0:13 cp
 5239 root       9   0   824  556   444 D     0.1  0.1   0:12 cp
    2 root       9   0     0    0     0 SW    0.0  0.0   0:08 keventd
    3 root      19  19     0    0     0 SWN   0.0  0.0   0:00 ksoftirqd_CPU0
    5 root       9   0     0    0     0 SW    0.0  0.0   5:24 bdflush
    7 root       9   0     0    0     0 SW    0.0  0.0   0:00 pagebuf_daemon
    8 root      -1 -20     0    0     0 SW<   0.0  0.0   0:00 mdrecoveryd
  136 root      -1 -20     0    0     0 SW<   0.0  0.0  13:30 raid5d
  137 root      19  19     0    0     0 SWN   0.0  0.0   1:07 raid5syncd
  576 root       9   0   708   52    52 S     0.0  0.0   0:00 klogd
  590 rpc        9   0   144   52    52 S     0.0  0.0   0:00 portmap
  605 rpcuser    9   0   164   60    60 S     0.0  0.0   0:00 rpc.statd
  691 daemon     9   0   144   72    72 S     0.0  0.0   0:00 atd
  703 root       9   0   332  148   148 S     0.0  0.0   0:00 sshd
  798 root       9   0   152   92    92 S     0.0  0.0   0:00 mingetty
  799 root       9   0   152   92    92 S     0.0  0.0   0:00 mingetty
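
For a quick tally of how many cp processes are runnable (R) versus in
uninterruptible sleep (D), a sketch along these lines may be easier to read
than the full top screen. count_cp_states is a hypothetical helper name; it
expects "STAT COMMAND" lines on standard input, such as those printed by
ps -e -o stat= -o comm=, and looks only at the first letter of the state.

```shell
#!/bin/bash
# Sketch: count cp processes by the first letter of their process state.
# Input: lines of "STAT COMMAND" on stdin.
# count_cp_states is a hypothetical helper name.
count_cp_states() {
  awk '$2 == "cp" { n[substr($1, 1, 1)]++ }
       END { for (s in n) print s, n[s] }' | sort
}
```

Typical use would be: ps -e -o stat= -o comm= | count_cp_states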

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE8HTqN8ZJI8OvSkAcRAqH9AJ9hxflaTLYvpoj+BGq8//iXhN2YNwCfb7Fq
edVPugYfBaRjRoxJDyLkHkc=
=18g5
-----END PGP SIGNATURE-----

