Re: Xfs Access to block zero exception and system crash

To: Eric Sandeen <sandeen@xxxxxxxxxxx>
Subject: Re: Xfs Access to block zero exception and system crash
From: Sagar Borikar <sagar_borikar@xxxxxxxxxxxxxx>
Date: Mon, 07 Jul 2008 09:28:53 +0530
Cc: Dave Chinner <david@xxxxxxxxxxxxx>, Nathan Scott <nscott@xxxxxxxxxx>, xfs@xxxxxxxxxxx
In-reply-to: <487191C2.6090803@xxxxxxxxxxx>
Organization: PMC Sierra Inc
Sender: xfs-bounce@xxxxxxxxxxx
User-agent: Thunderbird (X11/20080421)

Eric Sandeen wrote:
Sagar Borikar wrote:

Ok. So initially our multi client iozone stress test used to fail.

Are these multiple nfs clients?
Actually, a mix of them: 15 CIFS clients and 4 NFS clients (19 iozone clients), 2 FTP clients,
and 4 HTTP transfers (25 simultaneous transactions in total).
But as it took 2-3 days to replicate the issue, I tried the test, standalone on MIPS and

the iozone test again?
The iozone test continuously gives the access to block zero exception and xfs shutdown errors with transaction cancel exceptions, plus the alloc btree corruption exception which I reported earlier. My test gives the transaction cancel exception and the block zero exception on MIPS, with the processes under test in a deadlock state. On x86 there are no exceptions,
only incomplete copies due to the uninterruptible sleep state and deadlock.
observed failures similar to those
I used to get in the multi-client test. The test is exactly the same as what I do in the multi-client iozone run over the network. Hence I came to the conclusion that if we fix the system to pass my test case, then we can try the iozone test with that fix. And now on x86 with 2.6.24, I am finding a similar deadlock, but the system is responsive and there are no lockups or exceptions. Do you observe similar failures on x86 on your setup?

So far I've not seen the deadlocks.
Could you kindly try with my test? I presume you should see the failure soon. I tried this on 2 different x86 systems, 2 times (after rebooting the system), and I saw it every time.
Also, do you think the issues which I am seeing on x86 and MIPS are coming from the
same source?

Hard to say at this point, I think.


