
Re: Xfs Access to block zero exception and system crash

To: Sagar Borikar <Sagar_Borikar@xxxxxxxxxxxxxx>
Subject: Re: Xfs Access to block zero exception and system crash
From: Eric Sandeen <sandeen@xxxxxxxxxxx>
Date: Sun, 06 Jul 2008 14:07:40 -0500
Cc: Dave Chinner <david@xxxxxxxxxxxxx>, Nathan Scott <nscott@xxxxxxxxxx>, xfs@xxxxxxxxxxx
In-reply-to: <340C71CD25A7EB49BFA81AE8C839266702A084A6@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
References: <486B01A6.4030104@xxxxxxxxxxxxxx> <20080702051337.GX29319@disturbed> <486B13AD.2010500@xxxxxxxxxxxxxx> <1214979191.6025.22.camel@xxxxxxxxxxxxxxxxxx> <20080702065652.GS14251@xxxxxxxxxxxxxxxxxxxxx> <486B6062.6040201@xxxxxxxxxxxxxx> <486C4F89.9030009@xxxxxxxxxxx> <486C6053.7010503@xxxxxxxxxxxxxx> <486CE9EA.90502@xxxxxxxxxxx> <486DF8F0.5010700@xxxxxxxxxxxxxx> <20080704122726.GG29319@disturbed> <340C71CD25A7EB49BFA81AE8C839266702997641@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx> <486E5F4D.1010009@xxxxxxxxxxx> <340C71CD25A7EB49BFA81AE8C839266702997658@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx> <486FA095.1050106@xxxxxxxxxxx> <340C71CD25A7EB49BFA81AE8C839266702A084A6@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
Sender: xfs-bounce@xxxxxxxxxxx
User-agent: Thunderbird (Macintosh/20080421)
Sagar Borikar wrote:
> Sagar Borikar wrote:
>> Copy is of the same file to 30 different directories and it is basically
>> overwrite.
>> Here is the setup:
>> It's a JBOD with Volume size 20 GB. The directories are empty and this
>> is basically continuous copy of the file on all thirty directories. But
>> surprisingly none of the copy succeeds. All the copy processes are in
>> Uninterruptible sleep state and xfs_repair log I have already attached
>> With the prep. As mentioned it is with 2.6.24 Fedora kernel.
> It would probably be best to try a 2.6.26 kernel from rawhide to be sure
> you're closest to the bleeding edge.
> <Sagar> Sure Eric, but I reran the test and I got similar errors with
> the 2.6.24 kernel on x86. I am still confused by the results that I see
> with the 2.6.24 kernel on the x86 machine. I see that the used size shown
> by ls is far larger than the actual size. Here is the log of the system:
> [root@lab00 ~/test_partition]# ls -lSah
> total 202M
> -rw-r--r--  1 root root 202M Jul  4 14:06 original ---> this is the file
> which I copy.
> drwxr-x--- 65 root root  12K Jul  6 21:57 ..
> -rwxr-xr-x  1 root root  189 Jul  4 16:31 runall
> -rwxr-xr-x  1 root root   50 Jul  4 16:32 copy
> drwxr-xr-x  2 root root   45 Jul  6 22:07 .

It'd be great if you provided these actual scripts so we don't have to
guess at what you're doing or work backwards from the repair output :)
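
For illustration only, here is a sketch of what a reproducer might look
like going by the description above (same file, 30 directories, continuous
overwrite). The real "copy" and "runall" scripts have not been posted, so
the function name copy_to_dirs, the dirN layout and the pass count are all
assumptions, not what was actually run:

```shell
#!/bin/sh
# Hypothetical reconstruction; the actual "copy" and "runall" scripts
# were not posted to the list.
#
# copy_to_dirs SRC BASE NDIRS PASSES
#   Start NDIRS background workers. Worker i overwrite-copies SRC into
#   BASE/dir<i>, PASSES times in a row. Returns once all workers finish.
copy_to_dirs() {
    src=$1
    base=$2
    ndirs=$3
    passes=$4

    i=1
    while [ "$i" -le "$ndirs" ]; do
        mkdir -p "$base/dir$i"
        (
            # Each worker runs in its own subshell with its own copy of $i.
            p=0
            while [ "$p" -lt "$passes" ]; do
                cp -f "$src" "$base/dir$i/" || exit 1
                p=$((p + 1))
            done
        ) &
        i=$((i + 1))
    done

    # Block until every background copy loop has exited.
    wait
}
```

Something like `copy_to_dirs original /path/to/xfs/mount 30 1000` would then
approximate the described load; the mount path and the 1000 passes are
placeholders.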

> dmesg log doesn't give any information. Here is XFS related
> info:
> XFS mounting filesystem loop0
> Ending clean XFS mount for filesystem: loop0
> Which is basically for mounting XFS cleanly. But there is no exception
> in XFS. 

And nothing else of interest either?

> Filesystem has become completely sluggish and response time is increased
> to 3-4 minutes for every command.  Not a single copy is complete and all
> the copy processes are sleeping continuously.

And how did you recover from this; did you power-cycle the box?
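
If it happens again, a list of the tasks stuck in uninterruptible sleep
and where they are waiting would help. A minimal sketch, assuming a
procps-style ps; the helper name filter_d_state is made up for this
example:

```shell
#!/bin/sh
# filter_d_state: read ps-style output on stdin and keep the header line
# plus every row whose second column (STAT) starts with "D", i.e. tasks
# in uninterruptible sleep.
#
# Usage (assumes a procps-style ps):
#   ps -eo pid,stat,wchan,comm | filter_d_state
filter_d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}
```

The wchan column shows the kernel function each task is sleeping in, where
the kernel makes that available.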

