
Re: linux-xfs@xxxxxxxxxxx

To: sooo lame <kszysiu@xxxxxxxxxxxx>
Subject: Re: linux-xfs@xxxxxxxxxxx
From: Steve Lord <lord@xxxxxxx>
Date: Mon, 12 Feb 2001 09:19:44 -0600
Cc: linux-xfs@xxxxxxxxxxx
In-reply-to: Message from sooo lame <kszysiu@xxxxxxxxxxxx> of "Sun, 11 Feb 2001 19:12:58 +0100." <20010211191258.A19510@xxxxxxxxxxxxxxxxx>
Sender: owner-linux-xfs@xxxxxxxxxxx
> 
> I, like Sean and Utz, had lockups and data loss on 2 of my 4 XFS
> hard drives (one machine). After one lockup, one partition refused
> to be mounted.
> 
> These were non-root partitions.
> What may be important: the XFS disks were NOT heavily used; moreover,
> there were practically no writes to those disks...
> 
> Feb  7 12:39:08 main kernel: Start mounting filesystem: ide2(33,1) 
> Feb  7 12:39:08 main kernel: Starting XFS recovery on filesystem: ide2(33,1) (dev: 33/1) 
> Feb  7 12:39:08 main kernel: cmn_err level 1 Filesystem "ide2(33,1)": xfs_inode_recover: Bad inode magic number, dino ptr = 0xc2fc6600, dino bp = 0xc78d8c60, ino = 65456518 
> Feb  7 12:39:08 main kernel: XFS: log mount/recovery failed 
> Feb  7 12:39:08 main kernel: XFS: log mount failed 
> 
> (I update my kernel as soon as CVS changes, so the "vital" change must have
> occurred about Feb 6; unfortunately I haven't kept the previous tree)

I did change how delayed allocate writes are processed on Feb 5th, and
I suspect this is related to these problems, especially the data loss after
an oops. We used to scan the complete page table, converting pages to
delayed allocate; now we scan the inactive dirty list. Of course, if a page
never gets to the inactive dirty list, we could be in trouble....
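
To make the difference concrete, here is a toy model of the two scanning
strategies. It is only a sketch and not the actual XFS or kernel code: the
Page class, its fields, and all function names are hypothetical, and a real
page cache is far more involved. It shows how a dirty page that never reaches
the inactive dirty list can be missed by the list-based scan.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical stand-in for a page; not a real kernel structure."""
    index: int
    dirty: bool = False
    on_inactive_dirty_list: bool = False
    delalloc: bool = False  # True once converted to delayed allocate


def scan_all_pages(pages):
    """Old approach: visit every page and convert each dirty one."""
    for page in pages:
        if page.dirty:
            page.delalloc = True


def scan_inactive_dirty_list(pages):
    """New approach: only visit pages that are on the inactive dirty list."""
    for page in pages:
        if page.on_inactive_dirty_list and page.dirty:
            page.delalloc = True


def missed_pages(pages):
    """Indexes of dirty pages that were never converted."""
    return [p.index for p in pages if p.dirty and not p.delalloc]


def make_pages():
    """Three pages: one dirty and listed, one dirty but still active, one clean."""
    return [
        Page(0, dirty=True, on_inactive_dirty_list=True),
        Page(1, dirty=True, on_inactive_dirty_list=False),
        Page(2, dirty=False),
    ]
```

In this model the full scan leaves no dirty page unconverted, while the
list-based scan skips page 1, which is dirty but never made it onto the list.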

It is possible I fixed the oops on Friday. I spent most of last week
chasing a page table corruption problem which was introduced in the 2.4.1
merge.  For me this came out as a panic freeing a page, but for others it
could well look different, since a page table entry was being filled in
with an uninitialized stack variable.

Steve


> 
> After xfs_recovery a few files/dirs were _lost_ (as far as I noticed, these
> were _NOT_ lost inodes ... the lost+found directory was still empty), but,
> interestingly, the free disk space reported by "df" did NOT
> change (the files were about 500MB each, so a change would be noticeable).
> 
> What's also interesting: another machine (a friend's) with the _same_
> (buggy?) kernel had no lockups at all, and consequently no data loss...
> 
> my machine: Mendocino 366 , 128MB, VT82C596A & PROMISE PDC20262
>             Samsung Spinpoint Series Hard drives (SVxxxxD , 20,30 and 40G)
>             Davicom DM9102 Fast Ethernet
> 
> friend's machine: Mendocino 400, 256MB, VT82C596B
>               Seagate ST313021A, IBM-DTLA-307030
>               3Com 3C509C Tornado
> 
> Can it be related to "serious IDE multimode write bug" fixed in 2.4.2-pre2 ?

I am not an expert on the ins and outs of the IDE system, but I seem to recall
people reporting problems with Promise IDE controllers on the list. I
see the word Promise in your machine spec, or is this grasping at straws?

Steve


