Sorry for the silence on this end - I got dragged into a conference call
on a totally different topic and sat listening for over an hour.
Anyway, it looks like you are willing to place the blame elsewhere.
I have not seen anything similar, but my NFS usage between Linux boxes is
not huge. You mentioned switching to the user-space nfsd - the kernel
version should be OK from the XFS point of view now that we have the speed
problem worked out. I still plan on going back and cleaning up what I did
there; it sounds like there may be unmount problems later.
I'm not sure I can help with the NFS problem itself, though.
Steve
> On 09 Mar 2001 11:45:51 -0500, Vladimir Vukicevic wrote:
> >
> >
> > Hmm. I now have a repeatable case of this I/O error, but as the other
> > end is running the kernel nfsd server, I'm not exactly sure how to debug
> > it or turn on debugging on the other end... A
> > 'cat foo.ogg > /dev/null' on the mounted partition repeatably gives
> > an I/O error.
> >
> > Any thoughts on how to diagnose this? I'll keep poking..
>
>
> Doh. Forgot about tcpdump. :-) So, this is what I'm seeing:
>
> 11:57:46.159530 rain.ximian.priv.4040915765 > ogg.nfs: 140 lookup fh
> Unknown/1 "01-letters_from_the_wasteland.ogg" (DF)
> 11:57:46.163988 ogg.nfs > rain.ximian.priv.4040915765: reply ok 128
> lookup fh Unknown/1 (DF)
>
> So, the lookup goes okay. Then the weirdness starts.
>
> 11:57:46.173585 rain.ximian.priv.4057692981 > ogg.nfs: 112 read fh
> Unknown/1 4096 bytes @ 2408448 (DF)
> 11:57:46.217986 ogg > rain.ximian.priv: (frag 20288:1480@1480+)
> 11:57:46.220115 ogg.nfs > rain.ximian.priv.4057692981: reply ok 1472
> read (frag 20288:1480@0+)
>
> Looking at this more in ethereal, this call/reply sequence has XID
> 0xf0db7b35. It appears to succeed (although I'm confused why it's
> reading @ offset 2408448). However, the next 3 calls have the exact same
> XID (marked as dups in ethereal), and they read the same size/offset.
>
> 11:57:46.865276 rain.ximian.priv.4057692981 > ogg.nfs: 112 read fh
> Unknown/1 4096 bytes @ 2408448 (DF)
> 11:57:46.907584 ogg > rain.ximian.priv: (frag 20544:1480@1480+)
> 11:57:46.909674 ogg.nfs > rain.ximian.priv.4057692981: reply ok 1472
> read (frag 20544:1480@0+)
>
> 11:57:48.265276 rain.ximian.priv.4057692981 > ogg.nfs: 112 read fh
> Unknown/1 4096 bytes @ 2408448 (DF)
> 11:57:48.319183 ogg > rain.ximian.priv: (frag 20800:1480@1480+)
> 11:57:48.321456 ogg.nfs > rain.ximian.priv.4057692981: reply ok 1472
> read (frag 20800:1480@0+)
>
> 11:57:51.065342 rain.ximian.priv.4057692981 > ogg.nfs: 112 read fh
> Unknown/1 4096 bytes @ 2408448 (DF)
> 11:57:51.114002 ogg > rain.ximian.priv: (frag 21056:1480@1480+)
> 11:57:51.115317 ogg.nfs > rain.ximian.priv.4057692981: reply ok 1472
> read (frag 21056:1480@0+)
>
> Then, it switches to a new XID, and the same set repeats itself.
>
> 11:57:56.666020 rain.ximian.priv.4074470197 > ogg.nfs: 112 read fh
> Unknown/1 4096 bytes @ 2408448 (DF)
> 11:57:56.720428 ogg > rain.ximian.priv: (frag 21312:1480@1480+)
> 11:57:56.722696 ogg.nfs > rain.ximian.priv.4074470197: reply ok 1472
> read (frag 21312:1480@0+)
>
> 11:57:57.365349 rain.ximian.priv.4074470197 > ogg.nfs: 112 read fh
> Unknown/1 4096 bytes @ 2408448 (DF)
> 11:57:57.411418 ogg > rain.ximian.priv: (frag 21568:1480@1480+)
> 11:57:57.413422 ogg.nfs > rain.ximian.priv.4074470197: reply ok 1472
> read (frag 21568:1480@0+)
>
> 11:57:58.765279 rain.ximian.priv.4074470197 > ogg.nfs: 112 read fh
> Unknown/1 4096 bytes @ 2408448 (DF)
> 11:57:58.814856 ogg > rain.ximian.priv: (frag 21824:1480@1480+)
> 11:57:58.817140 ogg.nfs > rain.ximian.priv.4074470197: reply ok 1472
> read (frag 21824:1480@0+)
>
> 11:58:01.565281 rain.ximian.priv.4074470197 > ogg.nfs: 112 read fh
> Unknown/1 4096 bytes @ 2408448 (DF)
> 11:58:01.617302 ogg > rain.ximian.priv: (frag 22080:1480@1480+)
> 11:58:01.619220 ogg.nfs > rain.ximian.priv.4074470197: reply ok 1472
> read (frag 22080:1480@0+)
>
>
> ... and then cp dies with an I/O error.
>
> So, from looking at this, I'm going to blame the client side NFS stuff
> here -- especially since the file is perfectly fine on the server
> itself. Alan Cox said that he wasn't aware of any nfs client-side
> patches that have gone in since 2.4.2 came out.
>
> Note that this is actually a similar error to what I was seeing while
> running my iPAQ with an NFS'd root filesystem -- certain files just give
> I/O errors. This is one of them; I recreated it by copying the file to an
> ext2 filesystem and got the same I/O error.
>
> So, this isn't XFS related (whew!), but nfs on linux does indeed suck.
> :-P
>
>
> - Vlad
>
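For anyone following along, here is a minimal sketch (Python, not part of
the original exchange) of how the read calls in the capture above could be
grouped by XID to measure the retransmit spacing. The regular expression
and the function names are my own assumptions about the tcpdump line
layout shown here, with the wrapped lines joined back together; it is not
a general tcpdump parser.

```python
import re
from collections import defaultdict
from datetime import datetime

# Read-call lines copied from the capture above, continuation lines joined.
CAPTURE = """\
11:57:46.173585 rain.ximian.priv.4057692981 > ogg.nfs: 112 read fh Unknown/1 4096 bytes @ 2408448 (DF)
11:57:46.865276 rain.ximian.priv.4057692981 > ogg.nfs: 112 read fh Unknown/1 4096 bytes @ 2408448 (DF)
11:57:48.265276 rain.ximian.priv.4057692981 > ogg.nfs: 112 read fh Unknown/1 4096 bytes @ 2408448 (DF)
11:57:51.065342 rain.ximian.priv.4057692981 > ogg.nfs: 112 read fh Unknown/1 4096 bytes @ 2408448 (DF)
11:57:56.666020 rain.ximian.priv.4074470197 > ogg.nfs: 112 read fh Unknown/1 4096 bytes @ 2408448 (DF)
11:57:57.365349 rain.ximian.priv.4074470197 > ogg.nfs: 112 read fh Unknown/1 4096 bytes @ 2408448 (DF)
11:57:58.765279 rain.ximian.priv.4074470197 > ogg.nfs: 112 read fh Unknown/1 4096 bytes @ 2408448 (DF)
11:58:01.565281 rain.ximian.priv.4074470197 > ogg.nfs: 112 read fh Unknown/1 4096 bytes @ 2408448 (DF)
"""

# Assumed layout: "<time> <client>.<xid> > <server>.nfs: <len> read fh ..."
READ_CALL = re.compile(r"^(\d\d:\d\d:\d\d\.\d+) \S+\.(\d+) > \S+\.nfs: \d+ read fh")


def group_reads_by_xid(text):
    """Map each XID to the send times (seconds since midnight) of its read calls."""
    sends = defaultdict(list)
    for line in text.splitlines():
        match = READ_CALL.match(line)
        if match is None:
            continue  # replies, fragments and anything else are skipped
        stamp = datetime.strptime(match.group(1), "%H:%M:%S.%f")
        seconds = (stamp.hour * 3600 + stamp.minute * 60
                   + stamp.second + stamp.microsecond / 1e6)
        sends[int(match.group(2))].append(seconds)
    return dict(sends)


def retransmit_gaps(times):
    """Gaps between consecutive sends, rounded to a tenth of a second."""
    return [round(later - earlier, 1) for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    for xid, times in group_reads_by_xid(CAPTURE).items():
        print(f"xid {xid} (0x{xid:08x}): {len(times)} sends, "
              f"gaps {retransmit_gaps(times)}")
```

On the eight read calls above this reports two XIDs with four sends each
and gaps of roughly 0.7, 1.4 and 2.8 seconds. That doubling would be
consistent with a UDP mount using timeo=7 and retrans=3, though the mount
options are not shown in the thread, so treat that as a guess. In hex the
two read XIDs come out as 0xf1db7b35 and 0xf2db7b35; 0xf0db7b35 corresponds
to 4040915765, the number on the lookup call.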