To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?)
From: Yann Dupont <Yann.Dupont@xxxxxxxxxxxxxx>
Date: Mon, 29 Oct 2012 09:07:28 +0100
Cc: xfs@xxxxxxxxxxx
In-reply-to: <20121028234802.GE4353@dastard>
References: <508554AF.5050005@xxxxxxxxxxxxxx> <50865453.5080708@xxxxxxxxxxxxxx> <508958FF.4000007@xxxxxxxxxxxxxx> <20121025211047.GD29378@dastard> <508A600C.1020109@xxxxxxxxxxxxxx> <508B092E.6070209@xxxxxxxxxxxxxx> <20121028234802.GE4353@dastard>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:16.0) Gecko/20121017 Thunderbird/16.0.1
On 29/10/2012 00:48, Dave Chinner wrote:

This is reproducible. Here is how to do it:

- Started a 3.6.2 kernel.

- I created a fresh 20 GB LVM volume on the local disk.
Can you reproduce the problem without LVM?

Hello Dave. That is THE question. My intent was to test with and without LVM. But right now I can't, because all my disks are consumed by LVM. In fact, my test setup doesn't even have enough space to locally clone the volume where I have errors. I only have 146 SAS disks on this machine.

I have to set up another test platform, and as I'm currently traveling, that won't be easy before next week.

What I want to try this week is to:
- recrash a volume, possibly a smaller one,
- download the image to another machine running KVM,
- see whether I get the mount problems inside that KVM guest,
- begin to bisect the kernel.

...

- After a few tries, I was finally unable to mount the XFS
volume, with the error reported in previous mails. So far this is
normal.
So it doesn't happen every time, and it may be power cycle related.

Yes, during my tests, I had to power cycle 3 or 4 times before hitting the actual problem.

What is your "local disk"?

A RAID 1 array (2 disks), using the mptsas driver.

xfs_logprint doesn't say much:

xfs_logprint:
     data device: 0xfe02
     log device: 0xfe02 daddr: 10485792 length: 20480

Header 0x7c wanted 0xfeedbabe
**********************************************************************
* ERROR: header cycle=124         block=5414 *
**********************************************************************
You didn't look past the initial error, did you? The file is only
482280 lines long, and 482200 lines of that are decoded log data....
:)
Well, I had tried with -c, but sorry, I had no experience with xfs_logprint until now.

I tried xfs_logprint -c; it gave a 22M file. You can grab it here:
http://filex.univ-nantes.fr/get?k=QnBXivz2J3LmzJ18uBV
I really need the raw log data, not the parsed output. The logprint
command to do that is "-C <file>", not "-c".

Ok ... I should have read the man page more carefully. Time to restart a crash session
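
For reference, the difference between the two options can be sketched as follows. This is a minimal illustration only: the helper name and the output path /tmp/crashdisk.log are hypothetical, the device name is the one shown in the xfs_info output in this mail, and the command is only built and printed, not executed.

```python
import shlex


def logprint_copy_cmd(device, outfile):
    """Build an xfs_logprint command that copies the raw log to outfile.

    "-C <file>" copies the raw log to a file; lowercase "-c" instead
    asks logprint to continue past errors and prints decoded output.
    """
    return ["xfs_logprint", "-C", outfile, device]


# Hypothetical output path; device name taken from the xfs_info output.
cmd = logprint_copy_cmd("/dev/mapper/LocalDisk-crashdisk", "/tmp/crashdisk.log")
print(shlex.join(cmd))
```

The resulting file is the raw log that was asked for, as opposed to the 22M decoded text produced by -c.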

- Rebooted 3.4.15
- xfs_logprint gives exactly the same result as with 3.6.2 (diff
reports no differences)
Given that it's generated by the logprint application, I'd expect it
to be identical.
Me too, but I'd also expect the log replay to be identical between the 2 kernels.


but on 3.4.15, I can mount the volume without problem, log is
replayed.
For information, here is the xfs_info output for the volume:

root@label5:/mnt/debug# xfs_info /mnt/tempo
meta-data=/dev/mapper/LocalDisk-crashdisk isize=256    agcount=8, agsize=655360 blks
How did you get a default of 8 AGs? That seems wrong.  What version
of mkfs.xfs are you using?
root@label5:~# mkfs.xfs -V
mkfs.xfs version 3.1.7

The volume was freshly formatted, with default options. Absolutely nothing special on my side.
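
As a sanity check on the geometry reported above, here is a minimal sketch of the arithmetic. It assumes the default 4096-byte block size, which is an assumption on my part since the bsize field is not visible in the truncated xfs_info output; the function name is only illustrative.

```python
# Geometry values copied from the xfs_info output in this mail.
AGCOUNT = 8
AGSIZE_BLOCKS = 655360

# ASSUMPTION: 4096-byte filesystem blocks (bsize is not shown above).
BLOCK_SIZE = 4096


def fs_size_bytes(agcount, agsize_blocks, block_size):
    """Total size covered by agcount allocation groups of agsize_blocks each."""
    return agcount * agsize_blocks * block_size


total = fs_size_bytes(AGCOUNT, AGSIZE_BLOCKS, BLOCK_SIZE)
per_ag = AGSIZE_BLOCKS * BLOCK_SIZE

print(f"per AG: {per_ag / 2**30} GiB, total: {total / 2**30} GiB")
```

Under that assumption, the 8 AGs of 655360 blocks cover exactly 20 GiB, which matches the size of the test volume.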
Cheers,

--
Yann Dupont - Service IRTS, DSI Université de Nantes
Tel : 02.53.48.49.20 - Mail/Jabber : Yann.Dupont@xxxxxxxxxxxxxx
