mkfs.xfs pagefault when removed storage during operation
Ajeet Yadav
ajeet.yadav.77 at gmail.com
Tue Feb 1 05:06:29 CST 2011
We are testing the stability of mkfs.xfs and xfs_repair, looking for
crashes and other issues, especially with removable devices.
Unfortunately, crashes do occur.
Code inspection shows that in most cases the caller does not handle
the libxfs_readbuf() error case, i.e. a return value of NULL.
I would like your suggestions on how to proceed.
Should we fix all such callers, or take the simplest route and exit if
read() or write() fails with errno EIO in libxfs_readbufr() and
libxfs_writebufr()?
Fortunately these functions already support exiting on failure through
the LIBXFS_EXIT_ON_FAILURE and LIBXFS_B_EXIT flags, but those flags are
only used selectively.
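To make the "exit on EIO" option concrete, here is a minimal sketch of
such a policy. It is not the libxfs code: sketch_io_is_fatal(),
sketch_check_io() and SKETCH_EXIT_ON_FAILURE are hypothetical names
made up for illustration, and the flag only stands in for the real
exit-on-failure flags mentioned above.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag; stands in for an exit-on-failure request. */
#define SKETCH_EXIT_ON_FAILURE	0x1

/*
 * Decide whether an I/O error should terminate the tool.
 * EIO is always fatal; any other error is fatal only when the
 * caller asked for exit-on-failure.
 */
static int
sketch_io_is_fatal(int err, int flags)
{
	if (err == EIO)
		return 1;
	return err != 0 && (flags & SKETCH_EXIT_ON_FAILURE) != 0;
}

/*
 * Report and exit on a fatal error; otherwise return the error
 * unchanged so the caller can still handle it.
 */
static int
sketch_check_io(const char *what, int err, int flags)
{
	if (sketch_io_is_fatal(err, flags)) {
		fprintf(stderr, "%s failed: %s\n", what, strerror(err));
		exit(1);
	}
	return err;
}
```

With a check like this in the low-level read and write paths, a removed
device would end the run with an error message instead of a page fault,
at the cost of never giving callers a chance to recover from EIO.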
The current problem is in the function libxfs_trans_read_buf():
	bp = libxfs_readbuf(dev, blkno, len, flags);
#ifdef XACT_DEBUG
	fprintf(stderr, "trans_read_buf buffer %p, transaction %p\n", bp, tp);
#endif
	xfs_buf_item_init(bp, tp->t_mountp);
	bip = XFS_BUF_FSPRIVATE(bp, xfs_buf_log_item_t *);
	bip->bli_recur = 0;
	xfs_trans_add_item(tp, (xfs_log_item_t *)bip);
	/* initialise b_fsprivate2 so we can find it incore */
	XFS_BUF_SET_FSPRIVATE2(bp, tp);
	*bpp = bp;
	return 0;
If libxfs_readbuf() fails due to device removal or another error, bp is NULL.
In xfs_buf_item_init(bp, tp->t_mountp), as soon as bp is dereferenced,
the following page fault occurs:
mkfs.xfs: unhandled page fault (11) at 0x00000070, code 0x017
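For comparison, the "fix the caller" option would look roughly like the
sketch below. It uses hypothetical stand-ins (struct sketch_buf,
struct sketch_trans, sketch_readbuf(), sketch_trans_read_buf()) rather
than the real libxfs types and functions, and it assumes the caller
wants a positive errno value back.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the libxfs types; not the real definitions. */
struct sketch_buf { int b_flags; };
struct sketch_trans { int t_items; };

/* Stand-in for libxfs_readbuf(): returns NULL when the device read fails. */
static struct sketch_buf *
sketch_readbuf(int device_present)
{
	static struct sketch_buf buf;

	return device_present ? &buf : NULL;
}

/*
 * Stand-in for libxfs_trans_read_buf(): check the buffer before
 * anything dereferences it, and hand the failure back to the caller.
 */
static int
sketch_trans_read_buf(struct sketch_trans *tp, int device_present,
		      struct sketch_buf **bpp)
{
	struct sketch_buf *bp = sketch_readbuf(device_present);

	if (bp == NULL) {
		*bpp = NULL;
		return EIO;	/* read failed, e.g. device removed */
	}

	tp->t_items++;		/* stands in for the buf item setup */
	*bpp = bp;
	return 0;
}
```

This only helps if every caller then checks the return value, which is
why the number of call sites matters when choosing between the two
approaches.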