On Fri, 2002-10-25 at 06:36, Luben Tuikov wrote:
> This is a dump of the stack for mount after it
> goes into the D state. This happens on the 2nd
> mount attempt; strangely, the 1st time it succeeds.
>
> Trace; c0107e52 <__down+82/d0>
> Trace; c0107fec <__down_failed+8/c>
> Trace; c02b83bc <.text.lock.page_buf+ce/172>
> Trace; c02c0bbe <xfs_bdstrat_cb+3e/50>
> Trace; c02b4c98 <xfs_bwrite+c8/110>
> Trace; c029c034 <xlog_bwrite+64/a0>
> Trace; c029d45c <xlog_write_log_records+18c/1c0>
> Trace; c029d563 <xlog_clear_stale_blocks+d3/160>
> Trace; c029ce65 <xlog_find_tail+285/460>
> Trace; c02a0a07 <xlog_recover+37/100>
> Trace; c02984c4 <xfs_log_mount+b4/100>
> Trace; c02a1fb3 <xfs_mountfs+583/11c0>
> Trace; c02b83bc <.text.lock.page_buf+ce/172>
> Trace; c02a16c9 <xfs_readsb+99/100>
> Trace; c02aaf4e <xfs_mount+20e/2e0>
> Trace; c02c1d92 <linvfs_read_super+162/2c0>
> Trace; c014cf6a <alloc_super+3a/1b0>
> Trace; c014d942 <get_sb_bdev+1b2/2a0>
> Trace; c014cebb <get_fs_type+3b/b0>
> Trace; c014dce4 <do_kern_mount+124/140>
> Trace; c0162b63 <do_add_mount+93/1a0>
> Trace; c0162ea0 <do_mount+160/1d0>
> Trace; c0162ce9 <copy_mount_options+79/d0>
> Trace; c01633af <sys_mount+df/140>
> Trace; c01094ab <system_call+33/38>
>
> The setup is as per BUG 182, except that the
> array is 3 disks and 1 spare, chunk-size 4 and
> has finished syncing. lvm 1.1-rc2 on top.
>
> My question is: why is the lock taken twice...?
> As I said, this happens only the second time
> mount is attempted.
These stack dumps are confusing: all the output does is
report where it found a code address on the stack; it does
not mean that address belongs to a currently active stack frame.
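Without frame pointers, all the 2.4 trace code can do is walk the
stack and print any word that falls inside the kernel text segment.
Roughly like this (a simplified user-space sketch; the address
bounds and names are made up):

#include <stdio.h>

/* Made-up bounds for the kernel text segment. */
#define TEXT_START 0xc0100000UL
#define TEXT_END   0xc0300000UL

static int looks_like_code(unsigned long addr)
{
        return addr >= TEXT_START && addr < TEXT_END;
}

/*
 * Print every stack word that falls inside the text segment.
 * Stale return addresses from earlier, finished calls pass the
 * test too, so the dump can show frames that are not live.
 */
static void show_trace(unsigned long *stack, int nwords)
{
        int i;

        for (i = 0; i < nwords; i++)
                if (looks_like_code(stack[i]))
                        printf("Trace; %08lx\n", stack[i]);
}

int main(void)
{
        /* A fake stack: one live return address, one stale one. */
        unsigned long stack[] = { 0x1234, 0xc0107e52UL, 0, 0xc02b83bcUL };

        show_trace(stack, (int)(sizeof(stack) / sizeof(stack[0])));
        return 0;
}

That is how unrelated functions can show up as if they were on one
call chain.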
You will get a different I/O pattern on the second mount. The first
one is mounting something freshly built by mkfs; the second one is
doing some housekeeping on the log. The xlog_clear_stale_blocks code
is there to ensure we do not have any old partial writes of log data
from a failed previous mount scattered ahead of the log, so it writes
zeros down into this space. This happens on every XFS mount after
the first one.
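The shape of it is something like this (a rough standalone sketch
with made-up names, not the real xlog_clear_stale_blocks):

#include <stdio.h>
#include <string.h>

#define BLOCK_SIZE 512
#define BUF_BLOCKS 8            /* the reused buffer covers 8 blocks */

/* Stand-in for pushing one buffer's worth of data to disk. */
static int write_blocks(const char *data, long blkno, long nblocks)
{
        (void)data;
        printf("zeroing %ld blocks at block %ld\n", nblocks, blkno);
        return 0;
}

/*
 * Write zeros over the range ahead of the log head that may
 * still hold partial log writes from a previous failed mount,
 * one buffer-sized chunk at a time, reusing a single buffer.
 */
static int zero_stale_blocks(long start, long nblocks)
{
        static char buf[BUF_BLOCKS * BLOCK_SIZE];
        long done, chunk;

        memset(buf, 0, sizeof(buf));
        for (done = 0; done < nblocks; done += chunk) {
                chunk = nblocks - done;
                if (chunk > BUF_BLOCKS)
                        chunk = BUF_BLOCKS;
                if (write_blocks(buf, start + done, chunk))
                        return -1;
        }
        return 0;
}

int main(void)
{
        return zero_stale_blocks(100, 20);
}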
Your stack also appears to have lost a couple of levels of calls;
there should be a pagebuf_iorequest in there, which is locking the
pages in the I/O and sending them down to the lower layers.
So xfs_mount does call xfs_readsb, which does locking on
the superblock buffer, but xfs_readsb does not call
xfs_mountfs; xfs_mount calls xfs_mountfs.
In this case we are reusing the same buffer to write out
several chunks of log data. In theory each write is completed
before the next one is started, and we obtain the lock on
the pagebuf before we go and reuse it. Hanging on a page
lock like this suggests that something went wrong at
I/O completion time for the previous write.
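The reuse protocol is basically a semaphore handshake, so the hang
would look like this user-space analogue (my own sketch, not the
pagebuf code):

#include <stdio.h>
#include <semaphore.h>

/* A pagebuf's lock is a semaphore; I/O completion releases it. */
struct pagebuf {
        sem_t lock;
};

/* Pretend to send the buffer down; 'complete' simulates whether
 * the I/O completion handler ever runs and unlocks the buffer. */
static void submit_write(struct pagebuf *pb, int complete)
{
        if (complete)
                sem_post(&pb->lock);
}

int main(void)
{
        struct pagebuf pb;

        sem_init(&pb.lock, 0, 1);

        sem_wait(&pb.lock);     /* first use: acquired immediately  */
        submit_write(&pb, 1);   /* completion unlocks the buffer    */

        sem_wait(&pb.lock);     /* reuse: waits for that completion */
        submit_write(&pb, 0);   /* now simulate a lost completion   */

        if (sem_trywait(&pb.lock))      /* in the kernel this would  */
                printf("stuck in __down\n");  /* block forever, the  */
                                              /* __down in your dump */
        return 0;
}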
>
> Second: has anyone looked at _pagebuf_page_io()?
>
> Is this fn expected to always unlock the page?
> Maybe this is why we are getting bug 182 when
> using chunk-size > page_size?
>
> That is, shouldn't this locking be like (not real C code):
>
> do_io_page(pg)
> {
>         if (locking)
>                 lock_page(pg);
>         do_io(pg);
>         if (locking)
>                 unlock_page(pg);
> }
>
> And in this way the locking would be kept out of the
> low-level I/O fn.
>
> Also, what is the purpose of pb_locked? If this is _any_ kind
> of lock, should it be atomic, bitops, or at least volatile?
pb_locked just means the pages are already locked. The reason the
code looks so complex is that it is attempting to map variable-sized
units of I/O (pagebufs) onto fixed-sized units: pages and buffer
heads. There is one common case in the code, where we are creating
a pagebuf for the first time, in which the locking of the pages has
already been done and we keep them locked until we have read them in.
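So pb_locked only steers the I/O path around locking that has
already been done; it is a state flag set before the I/O is issued,
not a lock in its own right, which is why (as far as I can see) it
does not need to be atomic or volatile. Something like this (a
simplified sketch, not the real _pagebuf_page_io):

#include <stdio.h>

#define PB_PAGES 4

struct page { int locked; };

struct pagebuf {
        int          pb_locked;         /* pages came to us locked */
        struct page *pages[PB_PAGES];
        int          page_count;
};

static void lock_page(struct page *pg) { pg->locked = 1; }

/* Queue one page of I/O; completion will unlock it later. */
static void start_io(struct page *pg)
{
        printf("io on page %p (locked=%d)\n", (void *)pg, pg->locked);
}

/*
 * Issue I/O on every page of the pagebuf.  When pb_locked is
 * set the pages were locked while the pagebuf was being built
 * and are still held, so locking them again here would deadlock.
 */
static void pagebuf_page_io(struct pagebuf *pb)
{
        int i;

        for (i = 0; i < pb->page_count; i++) {
                if (!pb->pb_locked)
                        lock_page(pb->pages[i]);
                start_io(pb->pages[i]);
        }
}

int main(void)
{
        struct page pg = { 1 };
        struct pagebuf pb = { .pb_locked = 1, .pages = { &pg },
                              .page_count = 1 };

        pagebuf_page_io(&pb);
        return 0;
}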
>
> I'm seeing this locking scattered all over pagebuf/ and have been
> wondering if it should be pulled out like the example above?
>
Trust me, lots of eyes have spent lots of time in this code. I'm
not sure what is going on here yet, though.
Steve
> Sorry for the dumb questions, but I have only been looking at
> the code for a couple of days and am trying to get this working...
>
> --
> Luben
>