xfs
[Top] [All Lists]

Re: TAKE - make "osyncisdsync" the default

To: Andrew Morton <akpm@xxxxxxxxxx>
Subject: Re: TAKE - make "osyncisdsync" the default
From: Stephen Lord <lord@xxxxxxx>
Date: Fri, 15 Feb 2002 17:35:38 -0600
Cc: Eric Sandeen <sandeen@xxxxxxx>, "linux-xfs@xxxxxxxxxxx" <linux-xfs@xxxxxxxxxxx>
References: <200202152250.g1FMokw05049@xxxxxxxxxxxxxxxxxxxxxx> <3C6D9686.5444468A@xxxxxxxxxx>
Sender: owner-linux-xfs@xxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.7) Gecko/20011226
Andrew Morton wrote:

Eric Sandeen wrote:

"osyncisdsync" makes us go faster, and it's how other Linux
filesystems are set up, anyway - so make it the default.


The comment lies :)

       /* For now, when the user asks for O_SYNC, we'll actually
        * provide O_DSYNC. */
       if (status >= 0) {
               if ((file->f_flags & O_SYNC) || IS_SYNC(inode))
                       status = generic_osync_inode(inode, 
OSYNC_METADATA|OSYNC_DATA);
       }

On other filesystems, O_SYNC writes actually sync both metadata
and data, as well as the inode, if i_size changed.

For default behaviour I suggest the best semantics are for
an O_SYNC write to guarantee that the written data will be
available after a crash.


Ah, but we do not go there:

       /* Handle various SYNC-type writes */
       if (ioflags & O_SYNC) {

               /* Flush all inode data buffers */

               error = -fsync_inode_buffers(ip);
               if (error)
                       goto out;

               /*
* If we're treating this as O_DSYNC and we have not updated the
                * size, force the log.
                */

               if (!(mp->m_flags & XFS_MOUNT_OSYNCISOSYNC)
                       && !(xip->i_update_size)) {
                       /*
                        * If an allocation transaction occurred
                        * without extending the size, then we have to force
                        * the log up the proper point to ensure that the
                        * allocation is permanent.  We can't count on
                        * the fact that buffered writes lock out direct I/O
                        * writes - the direct I/O write could have extended
                        * the size nontransactionally, then finished before
* we started. xfs_write_file will think that the file
                        * didn't grow but the update isn't safe unless the
                        * size change is logged.
                        *
                        * Force the log if we've committed a transaction
                        * against the inode or if someone else has and
                        * the commit record hasn't gone to disk (e.g.
                        * the inode is pinned).  This guarantees that
                        * all changes affecting the inode are permanent
                        * when we return.
                        */

                       xfs_inode_log_item_t *iip;
                       xfs_lsn_t lsn;

                       iip = xip->i_itemp;
                       if (iip && iip->ili_last_lsn) {
                               lsn = iip->ili_last_lsn;
                               xfs_log_force(mp, lsn,
XFS_LOG_FORCE | XFS_LOG_SYNC);
                       } else if (xfs_ipincount(xip) > 0) {
                               xfs_log_force(mp, (xfs_lsn_t)0,
XFS_LOG_FORCE | XFS_LOG_SYNC);
                       }

               } else {
                       /*
* O_SYNC or O_DSYNC _with_ a size update are handled
                        * the same way.
                        *
                        * If the write was synchronous then we need to make
* sure that the inode modification time is permanent.
                        * We'll have updated the timestamp above, so here
* we use a synchronous transaction to log the inode.
                        * It's not fast, but it's necessary.
                        *
                        * If this a dsync write and the size got changed
                        * non-transactionally, then we need to ensure that
                        * the size change gets logged in a synchronous
                        * transaction.
                        */

                       tp = xfs_trans_alloc(mp, XFS_TRANS_WRITE_SYNC);
                       if ((error = xfs_trans_reserve(tp, 0,
XFS_SWRITE_LOG_RES(mp),
                                                     0, 0, 0))) {
                               /* Transaction reserve failed */
                               xfs_trans_cancel(tp, 0);
                       } else {
                               /* Transaction reserve successful */
                               xfs_ilock(xip, XFS_ILOCK_EXCL);
                               xfs_trans_ijoin(tp, xip, XFS_ILOCK_EXCL);
                               xfs_trans_ihold(tp, xip);
                               xfs_trans_log_inode(tp, xip, XFS_ILOG_CORE);
                               xfs_trans_set_sync(tp);
error = xfs_trans_commit(tp, 0, (xfs_lsn_t)0);
                               xfs_iunlock(xip, XFS_ILOCK_EXCL);
                       }
               }



<Prev in Thread] Current Thread [Next in Thread>