On 2/1/11 5:06 AM, Ajeet Yadav wrote:
> We are testing mkfs.xfs and xfs_repair stability to look for crashes
> and other issues, especially with removable devices.
> Unfortunately, crashes do occur.
> Code inspection shows that in most cases the caller does not handle
> the libxfs_readbuf() error case, i.e. when the return value is NULL.
>
> Now I need your suggestion.
> Should we fix all such cases, or is the simplest way to exit... if
> read() or write() fails with an EIO errno in libxfs_readbufr() and
> libxfs_writebufr()?
I see very little reason to gracefully handle all error cases
during mkfs. It would be prettier, yes, but if mkfs fails, with
or without an error, with or without a segfault, you have to
just start it over anyway, right?
I think there are better places to focus effort.
-Eric
> Fortunately these functions already support exiting if we pass the flags
> LIBXFS_EXIT_ON_FAILURE or LIBXFS_B_EXIT, but those are used only selectively.
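For illustration, the exit-on-failure idea can be sketched roughly as below. This is not the libxfs code: sketch_pread() and SKETCH_EXIT_ON_FAILURE are hypothetical names invented for this example, and treating a short read as EIO is an assumption. Without the flag the caller gets an errno value back; with it, the process exits cleanly instead of carrying on with a bad buffer.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical flag for this sketch; not the real libxfs flag. */
#define SKETCH_EXIT_ON_FAILURE 0x1

/*
 * Hypothetical wrapper around pread(): returns 0 on a full read,
 * otherwise an errno value, or exits if the caller asked for that.
 */
static int sketch_pread(int fd, void *buf, size_t len, off_t off, int flags)
{
    assert(buf != NULL);

    ssize_t n = pread(fd, buf, len, off);
    if (n == (ssize_t)len)
        return 0;

    /* Assumption: a short read is reported as EIO. */
    int err = (n < 0) ? errno : EIO;

    if (flags & SKETCH_EXIT_ON_FAILURE) {
        fprintf(stderr, "read failed: %s\n", strerror(err));
        exit(1);
    }
    return err;
}
```

A caller that cannot recover would pass SKETCH_EXIT_ON_FAILURE; a caller that can handle the error would pass 0 and check the return value.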
>
> The current problem is in the function libxfs_trans_read_buf():
>
> bp = libxfs_readbuf(dev, blkno, len, flags);
> #ifdef XACT_DEBUG
> fprintf(stderr, "trans_read_buf buffer %p, transaction %p\n", bp, tp);
> #endif
> xfs_buf_item_init(bp, tp->t_mountp);
> bip = XFS_BUF_FSPRIVATE(bp, xfs_buf_log_item_t *);
> bip->bli_recur = 0;
> xfs_trans_add_item(tp, (xfs_log_item_t *)bip);
>
> /* initialise b_fsprivate2 so we can find it incore */
> XFS_BUF_SET_FSPRIVATE2(bp, tp);
> *bpp = bp;
> return 0;
>
> If libxfs_readbuf() fails due to device removal or another error, bp = NULL.
> Then, as soon as bp is dereferenced in xfs_buf_item_init(bp, tp->t_mountp),
> this fault occurs:
>
> mkfs.xfs: unhandled page fault (11) at 0x00000070, code 0x017
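The alternative to exiting, checking for NULL before the buffer is touched, might look roughly like the sketch below. It is a stand-alone mock, not a patch: struct sketch_buf, sketch_readbuf(), sketch_trans_read_buf() and sketch_device_present are all hypothetical stand-ins for the libxfs pieces quoted above, and returning EIO is an assumption about what the caller would want.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the real buffer structure. */
struct sketch_buf {
    int   blkno;
    void *owner;   /* transaction that holds the buffer */
};

static struct sketch_buf sketch_storage;
static int sketch_device_present = 1;   /* set to 0 to simulate removal */

/* Mock read: returns NULL when the simulated device is gone. */
static struct sketch_buf *sketch_readbuf(int blkno)
{
    if (!sketch_device_present)
        return NULL;
    sketch_storage.blkno = blkno;
    sketch_storage.owner = NULL;
    return &sketch_storage;
}

/* Checks the read result before any dereference of bp. */
static int sketch_trans_read_buf(void *tp, int blkno, struct sketch_buf **bpp)
{
    assert(bpp != NULL);
    *bpp = NULL;

    struct sketch_buf *bp = sketch_readbuf(blkno);
    if (bp == NULL)
        return EIO;    /* assumption: report the failure as EIO */

    bp->owner = tp;    /* only reached with a valid buffer */
    *bpp = bp;
    return 0;
}
```

With the simulated device present the call returns 0 and sets *bpp; with sketch_device_present set to 0 it returns EIO and leaves *bpp NULL instead of faulting.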