Re: xfs_create looping, missing dirs/files, corrupt inode tables, etc.
yocum@fnal.gov wrote:
>Hi all,
>
>I've been having a problem that I thought might be related to Tridge's, but
>maybe not:
>
>When copying directories over NFS (v2) from OSF1 clients to a Linux server
>with XFS, files and directories will mysteriously "vanish" after the cp
>completes, however, enough of the file/dir remains (i.e., the name of the
>file in the inode table) that an ls of the dir will yield this:
>
>
>[root@sdssdp9 rawdata]# ls CANNOT_RM.51940/
>ls: CANNOT_RM.51940/guider: No such file or directory
>
>
>As the name of the directory suggests, not enough information remains in the
>inode table to rm it, either.
>
>One clue that's in /var/log/messages is this:
>
>
>Aug 18 00:14:10 sdssdp9 kernel: xfs_create looping, dir ino 0xa25e000, ino
>0x101000800, md(9,0)
>Aug 18 00:14:10 sdssdp9 kernel:
>Aug 18 00:14:10 sdssdp9 kernel: nfsd: non-standard errno: -990
>
Are you seeing any other error messages on the Linux server?
Error 990 is a file system shutdown, which is usually the result of a
hardware error.
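
If it helps with digging through the log, here is a rough sketch of a filter
for /var/log/messages. The keyword list (xfs, nfsd, "I/O error") and the
function name are my own guesses at what is worth looking for, not an official
list of XFS or driver messages, so adjust them for your setup.

```python
import re

# Assumed keywords: taken from the messages quoted above plus a generic
# "I/O error" string. This is not an exhaustive or official list.
PATTERN = re.compile(r"kernel:.*(xfs|nfsd|I/O error)", re.IGNORECASE)


def suspicious_lines(lines):
    """Return kernel log lines that match PATTERN, in their original order."""
    return [line.rstrip("\n") for line in lines if PATTERN.search(line)]
```

Something like `suspicious_lines(open("/var/log/messages"))` should then list
the candidate lines; anything logged shortly before the xfs_create message is
the interesting part.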
>
>
>
>I have also been able to reproduce this problem copying files over local
>NFSv2 (i.e., the NFS exported volume is mounted locally) so, this is not
>only an OSF1 -> Linux problem.
>
>At the threat of being beaten with a sharp pointy stick by mkp (since I know
>he *loves* IDE disks ;-), here's the hw and sw configuration of these
>machines:
>
>Intel STL2 mobo
>2 866MHz P-III
>512MB RAM
>Netgear GA620 Gig-e NIC
>2 3ware 7810 8 port RAID controller cards (configured as RAID5)
>1 3ware 6200 2 port RAID controller card (configured as RAID1 for system
>disk)
>16 Maxtor 81.9GB disks (for data)
>
I remember somebody mentioning that the 3ware RAID controllers are known to
have problems... I think it was Martin Peterson.
>
>
>SGI modified Red Hat 7.1 distribution
>linux-2.4.8-xfs from CVS (Thursday, Aug 16)
>
>I've configured each 3ware 7810 card as RAID5, then I'm using sw RAID0 to
>stripe across the devices yielding a 1.12TB filesystem.
>
>I'd try the same tests to a system that is not using IDE disks, but we don't
>have any of those. (Sorry, Martin)
>
>I've performed an xfs_repair on the device and everything was fixed up,
>files were saved off into lost+found, inodes cleared, etc. and I've attached
>the output if anyone is interested.
>
>Thanks,
>Dan
>
>