Re: unable to mount filesystem

To: Marcel de Riedmatten <mdr@xxxxxxxxxxx>
Subject: Re: unable to mount filesystem
From: Tim Shimmin <tes@xxxxxxx>
Date: Mon, 6 Oct 2003 16:54:23 +1000
Cc: Linux xfs list <linux-xfs@xxxxxxxxxxx>
In-reply-to: <1065286302.20458.44.camel@galadriel>; from mdr@xxxxxxxxxxx on Sat, Oct 04, 2003 at 06:51:42PM +0200
References: <1065286302.20458.44.camel@galadriel>
Sender: linux-xfs-bounce@xxxxxxxxxxx

Hi Marcel,

On Sat, Oct 04, 2003 at 06:51:42PM +0200, Marcel de Riedmatten wrote:
> 
> I was making write tests on a 1.5TB xfs filesystem and after unmounting
> it i can't mount it again. It looks like a log corruption or something.
> This is what i found in syslog: i try to mount multiple times
> 
> 
> 
> Oct  4 17:15:35 grobelix kernel: XFS mounting filesystem sd(8,1)
> Oct  4 17:15:36 grobelix kernel: Starting XFS recovery on filesystem:
> sd(8,1) (dev: 8/1)
> Oct  4 17:15:36 grobelix kernel: XFS: log mount/recovery failed
> Oct  4 17:15:36 grobelix kernel: XFS: log mount failed
> Oct  4 17:15:50 grobelix kernel: XFS mounting filesystem sd(8,1)
> Oct  4 17:15:50 grobelix kernel: Filesystem "sd(8,1)": XFS internal
> error
> xlog_clear_stale_blocks(2) at line 1253 of file xfs_log_recover.c. 
> Caller
...
> (0xe8539c54))
> Oct  4 17:15:50 grobelix kernel: [<f898e900>] xlog_find_tail [xfs] 0x3c0
> (0xe8539c5c))
...
> This is xfs-1.3 whith   Axel Thimm smp i686 package at 
> http://atrpms.physik.fu-berlin.de/dist/rh73/kernel/
> 
> 
> Hardware is supermicro dual xeon board (x5dp8-g2) with 2 GB of ram and a
> 3ware 8506-8 with 7 disks raid5. 
> 
> 
> Then try xfs_logprint (without -t ) but it segfault:
> 

xfs_logprint without -t prior to xfsprogs-2.5.10 couldn't handle v2 logs.
From xfsprogs/doc/CHANGES:
    xfsprogs-2.5.10 (30 September 2003)
            - Fix up xfs_logprint to handle version 2 logs for its
              operation output (previously core dumped on it).

> # xfs_info data
> meta-data=/exports/data          isize=256    agcount=351,
> agsize=1048576 blks
>          =                       sectsz=512  
> data     =                       bsize=4096   blocks=367671614,
> imaxpct=25
>          =                       sunit=16     swidth=96 blks,
> unwritten=1
> naming   =version 2              bsize=4096  
> log      =internal               bsize=4096   blocks=32736, version=2
>          =                       sectsz=512   sunit=48 blks
> realtime =none                   extsz=65536  blocks=0, rtextents=0
> 
I noticed that you are using version 2 logs with a log stripe unit of
48 4K-blocks, i.e. 8*48=384 BBs (512-byte basic blocks).
I believe there is a bug in the code when the log sunit is not
a power of 2 in BBs, and 384 is not:
xfs_log.c:
    log->l_stripemask = 1 << xfs_highbit32(mp->m_sb.sb_logsunit >> BBSHIFT);
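
As a rough illustration of the arithmetic (a standalone sketch, not the
kernel code): highbit32() below is a hypothetical stand-in for
xfs_highbit32(), assumed to return the zero-based index of the highest set
bit, and stripemask_from_sunit() is an invented name for the expression
quoted above. How l_stripemask is used afterwards is not shown here.

```c
#include <stdint.h>

#define BBSHIFT 9  /* one basic block (BB) is 512 bytes */

/* Hypothetical stand-in for xfs_highbit32():
 * zero-based index of the highest set bit, or -1 if v is 0. */
static int highbit32(uint32_t v)
{
    int i = -1;

    while (v) {
        v >>= 1;
        i++;
    }
    return i;
}

/* Value the quoted expression would produce for a log stripe unit
 * given in bytes. Returns 0 if the unit is smaller than one BB. */
static uint32_t stripemask_from_sunit(uint32_t logsunit_bytes)
{
    uint32_t bbs = logsunit_bytes >> BBSHIFT;

    if (bbs == 0)
        return 0;
    return (uint32_t)1 << highbit32(bbs);
}

/* True if x is a nonzero power of 2. */
static int is_pow2(uint32_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}
```

For the 48-block stripe unit in this report, stripemask_from_sunit(48 * 4096)
works out to 256 BBs rather than 384, because 384 is not a power of 2 and
only its highest bit survives. A power-of-2 unit such as 32 blocks (256 BBs)
comes through unchanged.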

I have a fix in my own code but it is mixed with some other v2 log changes. 
I'm currently writing v2 log QA to test this stuff out.
I'll let you know when I check this stuff in.

> xlog_clear_stale_blocks(2) at line 1253 of file xfs_log_recover.c. 
This error (not a great message, I admit :) happens if the tail of the log
is greater than the head. By greater I mean in terms of <cycle#, block#>:
 if the cycle #s of the head/tail are the same, then
 the tail blk < head blk; otherwise
 the tail cycle# should be one less than the head cycle#, since
 the tail must be on the previous cycle.
The tail should never be greater than the head.
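
The rule above can be sketched as a small check. This is an illustration
only, not the code in xfs_log_recover.c: struct log_pos and
head_tail_consistent() are made-up names, and two details are assumptions
that go beyond the text: tail == head within one cycle is accepted (an empty
log), and when the tail is one cycle behind, its block number is required to
be larger than the head's.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical position in the log: which pass over the log (cycle)
 * and which basic block within it. */
struct log_pos {
    uint32_t cycle;
    uint32_t block;
};

/* Returns true if the tail is not ahead of the head under the rule
 * described in the message. */
static bool head_tail_consistent(struct log_pos head, struct log_pos tail)
{
    if (head.cycle == tail.cycle) {
        /* Same cycle: the tail must not be past the head.
         * Allowing equality is an assumption (empty log). */
        return tail.block <= head.block;
    }

    /* Different cycles: the tail must be exactly one cycle behind. */
    if (tail.cycle + 1 != head.cycle)
        return false;

    /* Assumption: the head has wrapped, so it must still be
     * behind the tail's block number. */
    return head.block < tail.block;
}
```

With the block numbers from the xfs_logprint output (tail 60416, head 65279),
this check passes only if both positions carry the same cycle number; if the
head were one cycle ahead of the tail, the same block numbers would be
rejected.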

> xfs_logprint:
>     data device: 0x801
>     log device: 0x801 daddr: 1468006656 length: 261888
> 
>     log tail: 60416 head: 65279 state: <DIRTY>
> 
> 
> LOG REC AT LSN cycle 7 block 60416 (0x7, 0xec00)
> ============================================================================
> TRANS: tid:0xc45c87bc  type:DIOSTRAT  #items:5  trans:0x0  q:0x808cf70
>
It looks like the tail blk# (60416) is less than the head blk# (65279),
which is only valid if the head and tail are in the same cycle. Since the
check still failed, it is likely that the cycle numbers differ when they
should be the same.


--Tim

