ADD 801063 - mkfs.xfs after having ext2 mounted on a device can fail

To: nathans@xxxxxxxxxxxx
Subject: ADD 801063 - mkfs.xfs after having ext2 mounted on a device can fail
From: pv@xxxxxxxxxxxxx (nathans@xxxxxxxxxxxx)
Date: Sun, 10 Sep 2000 22:39:16 -0700 (PDT)
Cc: linux-xfs@xxxxxxxxxxx
Reply-to: sgi.bugs.xfs@xxxxxxxxxxxxxxxxx
Sender: owner-linux-xfs@xxxxxxxxxxx
Webexec: webpvupdate,pvincident
Webpv: wobbly.melbourne.sgi.com
View Incident: 
http://co-op.engr.sgi.com/BugWorks/code/bwxquery.cgi?search=Search&wlong=1&view_type=Bug&wi=801063

 Status : open                         Priority : 3                         
 Assigned Engineer : nathans           Submitter : lord                     
*Modified User : nathans              *Modified User Domain : engr          
*Description :
Running mkfs to build an xfs filesystem after a partition has
been mounted as ext2 has periodically failed for me. The failure
is usually this:

[root@lord /]# mkfs -t xfs -f -l size=16000b /dev/sda4
meta-data=/dev/sda4              isize=256    agcount=8, agsize=149104 blks
data     =                       bsize=4096   blocks=1192826, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=0
naming   =version 2              bsize=4096  
log      =internal log           bsize=4096   blocks=16000

.....


==========================
ADDITIONAL INFORMATION (ADD)
From: nathans@engr (BugWorks)
Date: Sep 10 2000 10:39:15PM
==========================

> of the filesystem blocksize. When ext2 is mounted on a device
> it changes the block size to be whatever its mkfs block size 
> was. 4K in my case.

Ah - this is why I couldn't reproduce it before - I was using
the default ext2 block size of 1K.  With 4K I can now reproduce it
every time too - this is how to do it:

mkfs.ext2 -b 4096 /dev/hda10
mount /dev/hda10 /mnt/tmp
umount /mnt/tmp
mkfs.xfs /dev/hda10

With the old mkfs.xfs, i.e. with the EFS SB write, this always fails.
And if I remove the "-b 4096" from the ext2 mkfs line, it always
passes.

It seems that the ext2 mount calls set_blocksize(), and
it's never reset (even on umount).  So when we come along and
do a 512 byte write, 1K back from the end of the device (where
the end of the device is the value we get back from BLKGETSIZE,
minus one), we end up issuing a 4K IO to the device driver which
goes beyond the end of the device.

Is that how you understand things, Steve?

Going back and redoing the calculations for your sda4, Steve, and
my hda10, neither of these devices has a size which is 4K aligned,
which is consistent with this theory.
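
To make the arithmetic behind this theory concrete, here is a rough
sketch. It assumes a simplified model in which the kernel rounds every
write out to whole blocks of the current device blocksize and rejects
any block that extends past the end of the device. The device size and
the function names are hypothetical, chosen only for illustration; they
are not the real sizes of sda4 or hda10.

```python
def io_span(offset, length, blocksize):
    """Return (start, end) of the block-aligned region covering a write."""
    start = (offset // blocksize) * blocksize
    # Round the end of the write up to the next block boundary.
    end = -(-(offset + length) // blocksize) * blocksize
    return start, end


def write_overruns(device_bytes, offset, length, blocksize):
    """True if the block-aligned IO would extend past the device end."""
    _, end = io_span(offset, length, blocksize)
    return end > device_bytes


# Hypothetical device: a whole number of 4K blocks plus a 1K tail,
# i.e. a size that is 1K aligned but not 4K aligned.
device_bytes = 1000 * 4096 + 1024

# The 512 byte write, 1K back from the end of the device.
offset = device_bytes - 1024

for blocksize in (512, 1024, 4096):
    print(blocksize, write_overruns(device_bytes, offset, 512, blocksize))
```

Under this model the write fits with a 512 byte or 1K blocksize, but
with a 4K blocksize it lands in the trailing partial block and the IO
runs past the end of the device. On a device whose size is a multiple
of 4K the same write would not overrun.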

Unfortunately, as you say in your solution #1, there seems to
be no way to get the current blocksize for a block device, and
if that's true, then it's difficult to guard against this in
mkfs.xfs (could be bigger than 4K too, right?).  And we ideally
want to use as much of the device as possible - the amount of
the device mkfs can access shouldn't depend on the state that a
previous occupant has left the device in.

I think what we're really after is a way to call set_blocksize from
userspace, so that we can call that before any mkfs writes and know
where we stand for an arbitrary device (from a look through the kernel,
most filesystems don't reset the device blocksize on unmount).
Is this possible?  dunno - I can't see any ioctl doing this.

> 
> 1. repeat the same calculation in libxfs to reduce the size of
> the partition we tell mkfs about - problem is there is no call
> to get this block size value out of the kernel.
> 
> 2. reset the block size to the hardware sector size - the only way
> I can see to do this is to use the raw device interface.
> ...
> 2 would be best if we could figure out how to do it.

Yup, I agree.


<quick diversion>
I noticed xfs seems to set the device blocksize to 512 explicitly
(in linvfs_readsuper) ... that seems odd to me - why aren't we
driving this either from the superblock blocksize value or to
the value from get_hardblocksize()? - does that code look right
to you?

The xfs put_super code also looks like it should be calling
get_hardblocksize() rather than accessing hardsect_size[] directly,
perhaps?

Having spent all of 10 minutes looking at this code, I could well
be way off on these above two statements though ;-)
</end diversion>


OK, so given we have removed the last-potential-efs-sb overwrite in
mkfs.xfs, the remaining question is whether writing the external log
device (i.e. zeroing & footering it) is affected as well.

This experiment seems to suggest that we are not affected here
either... (hda10 and hda11 both have sizes which are _not_ multiples
of 4K):

[nathans@troppo mkfs]# mkfs -t ext2 -b 4096 -q /dev/hda11
mke2fs 1.18, 11-Nov-1999 for EXT2 FS 0.5b, 95/08/09
[nathans@troppo mkfs]# mkfs -t ext2 -b 4096 -q /dev/hda10
mke2fs 1.18, 11-Nov-1999 for EXT2 FS 0.5b, 95/08/09
[nathans@troppo mkfs]# mount /dev/hda11 /mnt/tmp
[nathans@troppo mkfs]# umount /mnt/tmp
[nathans@troppo mkfs]# mount /dev/hda10 /mnt/tmp
[nathans@troppo mkfs]# umount /mnt/tmp

...ext2 has now called set_blocksize(4096) on both devices.

[nathans@troppo mkfs]# ./mkfs.xfs -f -q /dev/hda10 
mkfs.xfs: write failed: No space left on device
[nathans@troppo mkfs]# ./mkfs.xfs -f -q /dev/hda11
mkfs.xfs: write failed: No space left on device

...as expected, these both fail with the old mkfs binary.

[nathans@troppo mkfs]# mkfs -t ext2 -b 1024 -q /dev/hda10
mke2fs 1.18, 11-Nov-1999 for EXT2 FS 0.5b, 95/08/09
[nathans@troppo mkfs]# mkfs -t ext2 -b 4096 -q /dev/hda11
mke2fs 1.18, 11-Nov-1999 for EXT2 FS 0.5b, 95/08/09
[nathans@troppo mkfs]# mount /dev/hda11 /mnt/tmp
[nathans@troppo mkfs]# umount /mnt/tmp
[nathans@troppo mkfs]# mount /dev/hda10 /mnt/tmp
[nathans@troppo mkfs]# umount /mnt/tmp

we can now expect to succeed on hda10, but not hda11...

[nathans@troppo mkfs]# ./mkfs.xfs -f -q -l logdev=/dev/hda11 /dev/hda10
[nathans@troppo mkfs]# ./mkfs.xfs -f -q /dev/hda11
mkfs.xfs: write failed: No space left on device
[nathans@troppo mkfs]# ./mkfs.xfs -f -q /dev/hda10
[nathans@troppo mkfs]# 

... eureka - no error for the external log case!

For bonus points, someone gets to explain why this is... cos I
thought it would fail.  Anyway, my head hurts ;) - I'll have to
figure this one out tomorrow.

The maximum blocksize which can be set in the kernel is PAGE_SIZE,
so there may be issues at 8192 too, for some architectures ...
mkfs.ext2 won't take us there though (disallows 8192 as a block
size).
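
Since PAGE_SIZE is the upper bound, a conservative form of solution #1
could round the device size down to a multiple of the largest blocksize
the kernel might have been left with. The following is only a sketch
under that assumption; usable_bytes() and the device size are
hypothetical names and values, not anything taken from libxfs.

```python
def usable_bytes(device_bytes, max_blocksize=4096):
    """Round the device size down to a whole number of max_blocksize blocks."""
    return (device_bytes // max_blocksize) * max_blocksize


# Hypothetical device: 1K aligned but not 4K aligned.
device_bytes = 1000 * 4096 + 1024

for page_size in (4096, 8192):
    usable = usable_bytes(device_bytes, page_size)
    # Print the assumed PAGE_SIZE, the usable size, and the tail given up.
    print(page_size, usable, device_bytes - usable)
```

The cost is that the tail of the device is given up whether or not a
previous occupant actually changed the blocksize, which is exactly the
dependence we would ideally avoid.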

cheers.
