Hi Marcel,
On Sat, Oct 04, 2003 at 06:51:42PM +0200, Marcel de Riedmatten wrote:
>
> I was making write tests on a 1.5TB xfs filesystem and after unmounting
> it i can't mount it again. It looks like a log corruption or something.
> This is what i found in syslog: i try to mount multiple times
>
>
>
> Oct 4 17:15:35 grobelix kernel: XFS mounting filesystem sd(8,1)
> Oct 4 17:15:36 grobelix kernel: Starting XFS recovery on filesystem:
> sd(8,1) (dev: 8/1)
> Oct 4 17:15:36 grobelix kernel: XFS: log mount/recovery failed
> Oct 4 17:15:36 grobelix kernel: XFS: log mount failed
> Oct 4 17:15:50 grobelix kernel: XFS mounting filesystem sd(8,1)
> Oct 4 17:15:50 grobelix kernel: Filesystem "sd(8,1)": XFS internal
> error
> xlog_clear_stale_blocks(2) at line 1253 of file xfs_log_recover.c.
> Caller
...
> (0xe8539c54))
> Oct 4 17:15:50 grobelix kernel: [<f898e900>] xlog_find_tail [xfs] 0x3c0
> (0xe8539c5c))
...
> This is xfs-1.3 whith Axel Thimm smp i686 package at
> http://atrpms.physik.fu-berlin.de/dist/rh73/kernel/
>
>
> Hardware is supermicro dual xeon board (x5dp8-g2) with 2 GB of ram and a
> 3ware 8506-8 with 7 disks raid5.
>
>
> Then try xfs_logprint (without -t ) but it segfault:
>
xfs_logprint without -t prior to xfsprogs-2.5.10 couldn't handle v2 logs.
From xfsprogs/doc/CHANGES:

  xfsprogs-2.5.10 (30 September 2003)
        - Fix up xfs_logprint to handle version 2 logs for its
          operation output (previously core dumped on it).

> # xfs_info data
> meta-data=/exports/data isize=256 agcount=351,
> agsize=1048576 blks
> = sectsz=512
> data = bsize=4096 blocks=367671614,
> imaxpct=25
> = sunit=16 swidth=96 blks,
> unwritten=1
> naming =version 2 bsize=4096
> log =internal bsize=4096 blocks=32736, version=2
> = sectsz=512 sunit=48 blks
> realtime =none extsz=65536 blocks=0, rtextents=0
>
I noticed that you are using version 2 logs with a stripe unit of 48 4K blocks,
i.e. 8*48 = 384 BBs (512-byte basic blocks).
I believe there is a bug in the code that handles a log sunit which is not
a power of 2 in BBs.
xfs_log.c:
  log->l_stripemask = 1 << xfs_highbit32(mp->m_sb.sb_logsunit >> BBSHIFT);
I have a fix in my own code but it is mixed with some other v2 log changes.
I'm currently writing v2 log QA to test this stuff out.
I'll let you know when I check this stuff in.
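
To make the arithmetic concrete, here is a small standalone sketch of what that expression yields for a 48-block stripe unit. It is not the kernel code: highbit32() is a stand-in I wrote for xfs_highbit32() (assumed to return the index of the highest set bit), BBSHIFT is assumed to be 9, and the function names are made up for illustration. How l_stripemask is used afterwards is not shown here, so this only illustrates the value that gets computed.

```c
#include <assert.h>
#include <stdint.h>

#define BBSHIFT 9  /* assumed: log2 of the 512-byte basic block (BB) size */

/* Index of the highest set bit, e.g. 384 -> 8.
 * Stand-in for xfs_highbit32(); v must be non-zero here. */
int highbit32(uint32_t v)
{
    int bit = -1;

    assert(v != 0);
    while (v != 0) {
        v >>= 1;
        bit++;
    }
    return bit;
}

/* Same expression as the quoted xfs_log.c line, stripe unit given in bytes. */
uint32_t stripe_mask_from_sunit(uint32_t logsunit_bytes)
{
    return (uint32_t)1 << highbit32(logsunit_bytes >> BBSHIFT);
}

/* Stripe unit expressed in BBs, for comparison with the value above. */
uint32_t sunit_in_bbs(uint32_t logsunit_bytes)
{
    return logsunit_bytes >> BBSHIFT;
}
```

With the values from your xfs_info output, sunit_in_bbs(48 * 4096) is 384 while stripe_mask_from_sunit(48 * 4096) is 256. The two only agree when the stripe unit is a power of 2 in BBs.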
> xlog_clear_stale_blocks(2) at line 1253 of file xfs_log_recover.c.
This error happens (a great msg, I think not :) if the tail of the log
is greater than the head. By greater I mean comparing <cycle#, block#> pairs.
So if the cycle #s of the head and tail are the same, then the tail blk
should be < the head blk; otherwise the tail cycle# should be one less
than the head cycle#, since the tail must be on the previous cycle.
The tail should never be greater than the head.
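
The rule above can be sketched as a small check. This is only an illustration, not the code in xfs_log_recover.c: the function name is made up, accepting tail == head in the same cycle is my assumption, and so is the requirement that the head block be below the tail block when the head is one cycle ahead.

```c
#include <stdint.h>

/* Returns 1 if the head/tail pair follows the rule described above, else 0. */
int log_head_tail_consistent(uint32_t head_cycle, uint32_t head_blk,
                             uint32_t tail_cycle, uint32_t tail_blk)
{
    if (head_cycle == tail_cycle) {
        /* Same cycle: the tail must not be ahead of the head.
         * Assumption: tail == head is accepted as consistent. */
        return tail_blk <= head_blk;
    }

    /* Different cycles: the tail must be exactly one cycle behind.
     * Assumption: the head has wrapped, so its block number must be
     * below the tail's block number. */
    return (head_cycle - tail_cycle == 1) && (head_blk < tail_blk);
}
```

For example, with tail blk 60416 and head blk 65279 the check passes when both are on cycle 7, and fails when the head is on cycle 8, because the head block is then not below the tail block.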
> xfs_logprint:
> data device: 0x801
> log device: 0x801 daddr: 1468006656 length: 261888
>
> log tail: 60416 head: 65279 state: <DIRTY>
>
>
> LOG REC AT LSN cycle 7 block 60416 (0x7, 0xec00)
> ============================================================================
> TRANS: tid:0xc45c87bc type:DIOSTRAT #items:5 trans:0x0 q:0x808cf70
>
It looks like the tail blk# (60416) is less than the head blk# (65279).
In that case it is likely that the cycle numbers differ when they
should be the same.
--Tim