
Re: Problems with many processes copying large directories across an XFS volume.

To: Adrian Head <adrian.head@xxxxxxxxxxxxxxx>
Subject: Re: Problems with many processes copying large directories across an XFS volume.
From: Simon Matter <simon.matter@xxxxxxxxxxxxxxxx>
Date: Mon, 10 Sep 2001 16:28:16 +0200
Cc: linux-xfs@xxxxxxxxxxx
Organization: Sauter AG, Basel
References: <D1F276F384A0D311A7C500A0C9E89B381980E1@xxxxxxxxxxxx>
Sender: owner-linux-xfs@xxxxxxxxxxx
Adrian Head wrote:
> 
> Thanks for your reply Simon
> 
> Yes the softraid was fully synced before I started any test.
> 
> The XFS patch I used to obtain these errors was
> patch-2.4.9-xfs-2001-08-19 and the errors were:
> Sep 9 05:13:46 ATLAS kernel: 02:86: rw=0, want=156092516, limit=360
> Sep 9 05:13:46 ATLAS kernel: attempt to access beyond end of device
> 
> When I used a later version of the XFS patch I had more descriptive
> errors written to /var/log/messages:
> Sep 10 10:14:57 ATLAS kernel: I/O error in filesystem ("md(9,0)")
> meta-data dev 0x900 block 0x9802bdc
> Sep 10 10:14:57 ATLAS kernel: ("xlog_iodone") error 5 buf count 32768
> Sep 10 10:14:57 ATLAS kernel: xfs_force_shutdown(md(9,0),0x2) called
> from line 940 of file xfs_log.c. Return address = 0xd8cb66f8
> Sep 10 10:14:57 ATLAS kernel: Log I/O Error Detected. Shutting down
> filesystem: md(9,0)
> Sep 10 10:14:57 ATLAS kernel: Please umount the filesystem, and rectify
> the problem(s)
> Sep 10 10:14:57 ATLAS kernel: xfs_force_shutdown(md(9,0),0x2) called
> from line 714 of file xfs_log.c. Return address = 0xd8cb65d3
> Sep 10 10:14:57 ATLAS kernel: attempt to access beyond end of device
> Sep 10 10:14:57 ATLAS kernel: 02:82: rw=0, want=1602235696, limit=4
> 
> I did think at the time that it may have been XFS stomping all over the
> raid code, or the raid code stomping all over XFS. Although I'm not
> sure now, as the 2.4.10-pre2-xfs-2001-09-02 patch never wrote any
> errors out at all. (please see my 2nd post for more info)
> 
> Thanks for taking the time to test this on your own machine.

I tried 20, 40 and 80 simultaneous cp jobs with no crash. Then I changed
the file tree; the new tree has ~280M of small files, 100b-50kb in size.
With 60 cp jobs the machine died. I could ping it, but nothing more: no
ssh, no console, no shutdown. I'll try some more tests tonight, and I'll
try the same with ext2 as well to make sure it's XFS and not Softraid. A
rough sketch of the kind of copy loop I'm running is below.
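
(For anyone who wants to reproduce this: a minimal Python sketch of such
a parallel-copy stress test. The paths and the job count below are
placeholders, not the exact setup from this test.)

#!/usr/bin/env python
# Rough sketch of a parallel-cp stress test: start JOBS recursive
# copies of the same source tree and wait for all of them to finish.
# SRC, DST and JOBS are placeholders, not the exact values used above.
import os
import subprocess

SRC = "/mnt/xfs/tree"     # hypothetical source tree of 100b-50kb files
DST = "/mnt/xfs/copies"   # hypothetical destination on the same volume
JOBS = 60                 # the job count at which the machine hung

os.makedirs(DST, exist_ok=True)

procs = []
for i in range(JOBS):
    dst = os.path.join(DST, "copy%02d" % i)
    # Each job copies the whole tree recursively; all run in parallel.
    procs.append(subprocess.Popen(["cp", "-a", SRC, dst]))

# Wait for every copy and report non-zero exit statuses.
for i, p in enumerate(procs):
    rc = p.wait()
    if rc != 0:
        print("cp job %d exited with status %d" % (i, rc))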

-Simon

> 
> Adrian Head
> Bytecomm P/L
> 
> > -----Original Message-----
> > From: Simon Matter [SMTP:simon.matter@xxxxxxxxxxxxxxxx]
> > Sent: Monday, 10 September 2001 17:45
> > To:   adrian.head@xxxxxxxxxxxxxxx
> > Cc:   linux-xfs@xxxxxxxxxxx
> > Subject:      Re: Problems with many processes copying large
> > directories across an XFS volume.
> >
> > Hi Adrian
> >
> > I did similar tests two months ago. I was having problems as well,
> > but unfortunately I don't remember what it was exactly.
> > First question: you created a Softraid5 volume; was the raid synced
> > when you started the tests?
> >
> > > In the /var/log/messages log around the same time as the copy test I
> > get
> > > entries like:
> > > Sep 9 05:13:46 ATLAS kernel: 02:86: rw=0, want=156092516, limit=360
> > > Sep 9 05:13:46 ATLAS kernel: attempt to access beyond end of device
> >
> > This looks interesting. I don't know exactly what it means, but it
> > looks to me as if you managed to create a filesystem bigger than the
> > raid volume. [See the size check sketched after this quoted message.]
> > I got the very same error when I tried to restore data with
> > xfsrestore from DAT (xfsrestore from DLT was fine). The issue is
> > still open.
> >
> > I have a test system here with SoftRAID5 on 4 U160 SCSI disks. I'll
> > try to kill it today with cp jobs.
> >
> > -Simon
> >
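
(On the "filesystem bigger than the raid volume" theory quoted above:
one way to check it is to compare what statvfs() reports for the mounted
filesystem against the size of the block device underneath. A
hypothetical Python sketch; the device and mountpoint names are examples
only, not the setup from this thread.)

#!/usr/bin/env python
# Hypothetical check: does the mounted filesystem claim more space than
# the block device underneath actually has? If so, reads will eventually
# run past the device's end, which is what "attempt to access beyond end
# of device" reports. DEVICE and MOUNTPOINT are example names only.
import os

DEVICE = "/dev/md0"
MOUNTPOINT = "/mnt/xfs"

# Size of the block device in bytes: seek to the end of the device node.
fd = os.open(DEVICE, os.O_RDONLY)
try:
    device_bytes = os.lseek(fd, 0, os.SEEK_END)
finally:
    os.close(fd)

# Size the filesystem believes it has, from statvfs(2).
st = os.statvfs(MOUNTPOINT)
fs_bytes = st.f_blocks * st.f_frsize

print("device:     %d bytes" % device_bytes)
print("filesystem: %d bytes" % fs_bytes)
if fs_bytes > device_bytes:
    print("WARNING: filesystem is larger than its underlying device")

(mkfs.xfs normally takes the filesystem size from the device itself, so
if the two numbers disagree, the raid layer reporting a wrong size would
be the first suspect.)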

-- 
Simon Matter              Tel:  +41 61 695 57 35
Fr.Sauter AG / CIT        Fax:  +41 61 695 53 30
Im Surinam 55
CH-4016 Basel             [mailto:simon.matter@xxxxxxxxxxxxxxxx]


