

To: xfs@xxxxxxxxxxx
Subject: Re: Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?)
From: Yann Dupont <Yann.Dupont@xxxxxxxxxxxxxx>
Date: Thu, 25 Oct 2012 22:55:05 +0200
In-reply-to: <508958FF.4000007@xxxxxxxxxxxxxx>
References: <508554AF.5050005@xxxxxxxxxxxxxx> <50865453.5080708@xxxxxxxxxxxxxx> <508958FF.4000007@xxxxxxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:16.0) Gecko/20121017 Thunderbird/16.0.1
On 25/10/2012 17:21, Yann Dupont wrote:
Hello.
There is definitely something wrong with XFS in 3.6.x, in particular after an abrupt stop of the machine:

I now have corruption on a 3rd machine (not involved with ceph).
The machine was just rebooting from 3.6.2 kernel to 3.6.3 kernel.

This machine isn't under heavy load, but we use it for tests and compilations, and we often crash it. In 2 years we never had problems: XFS was always reliable, even in hard conditions (hard reset, loss of power, etc.).

This time, after booting 3.6.3, one of my XFS volumes refuses to mount:

mount: /dev/mapper/LocalDisk-debug--git: can't read superblock

[276596.189363] XFS (dm-1): Mounting Filesystem
[276596.270614] XFS (dm-1): Starting recovery (logdev: internal)
[276596.711295] XFS (dm-1): xlog_recover_process_data: bad clientid 0x0
[276596.711329] XFS (dm-1): log mount/recovery failed: error 5
[276596.711516] XFS (dm-1): log mount failed


Just found something interesting:

I rebooted into 3.4.15 to make a backup of this volume. As I said in my previous message, I had not run xfs_repair on it.
Before rebooting, I forgot to edit fstab to prevent the mount.
To my surprise, under 3.4.15 the volume mounts like a charm!

[   37.958374] XFS (dm-1): Mounting Filesystem
[   38.050374] XFS (dm-1): Starting recovery (logdev: internal)
[   69.596892] XFS (dm-1): Ending recovery (logdev: internal)

As far as I can tell, there is no corruption, no problem, and all my files are there!
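
The difference between the two boots is visible in the kernel log alone, so it can be checked before anyone reaches for xfs_repair. Below is a minimal sketch in Python, assuming the message formats are exactly those in the two excerpts above. The function name `recovery_status`, the status labels and the sample constants are hypothetical, made up for illustration only.

```python
import re

# Matches kernel log lines such as:
#   [276596.711329] XFS (dm-1): log mount/recovery failed: error 5
# The format is assumed from the excerpts quoted in this message.
XFS_LINE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+XFS \((?P<dev>[^)]+)\):\s+(?P<msg>.*)$"
)

def recovery_status(log_text, device):
    """Classify log recovery for one device.

    Returns 'ok', 'failed', 'incomplete' (started, no end seen)
    or 'none' (no recovery message for that device).
    """
    status = "none"
    for line in log_text.splitlines():
        m = XFS_LINE.match(line.strip())
        if not m or m.group("dev") != device:
            continue
        msg = m.group("msg")
        if msg.startswith("Starting recovery"):
            status = "incomplete"
        elif msg.startswith("Ending recovery") and status == "incomplete":
            status = "ok"
        elif "recovery failed" in msg or msg.startswith("log mount failed"):
            status = "failed"
    return status

# Sample input copied from the 3.6.3 boot.
LOG_3_6 = """\
[276596.189363] XFS (dm-1): Mounting Filesystem
[276596.270614] XFS (dm-1): Starting recovery (logdev: internal)
[276596.711295] XFS (dm-1): xlog_recover_process_data: bad clientid 0x0
[276596.711329] XFS (dm-1): log mount/recovery failed: error 5
[276596.711516] XFS (dm-1): log mount failed
"""

# Sample input copied from the 3.4.15 boot.
LOG_3_4 = """\
[   37.958374] XFS (dm-1): Mounting Filesystem
[   38.050374] XFS (dm-1): Starting recovery (logdev: internal)
[   69.596892] XFS (dm-1): Ending recovery (logdev: internal)
"""

if __name__ == "__main__":
    print("3.6.3 :", recovery_status(LOG_3_6, "dm-1"))
    print("3.4.15:", recovery_status(LOG_3_4, "dm-1"))
```

On the two excerpts this reports "failed" for the 3.6.3 boot and "ok" for the 3.4.15 boot. It only reads log text and says nothing about the actual on-disk state.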

So far, here is the scenario:

1) You have to hard reset your machine while running 3.6 (maybe the kernel version isn't important here). As I encountered other 3.6 bugs (exit_mm and rwsem_down_failed_common), I had to do that.

So XFS is not clean.

2) You boot with 3.6.x.
Mounting the volume fails, because log replay fails for an unknown reason.

3) You think your FS is broken, so you start an xfs_repair, which is somehow fooled and breaks your filesystem for good.

I hope it's reproducible. I will try tomorrow morning.

Cheers,

--
Yann Dupont - Service IRTS, DSI Université de Nantes
Tel : 02.53.48.49.20 - Mail/Jabber : Yann.Dupont@xxxxxxxxxxxxxx
