At Wed, 26 Apr 2000 00:43:16 -0700 (PDT),
Ananth Ananthanarayanan <ananth@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> Here are my findings in an effort to track down the corruption.
>
> I backed up all the way to a tree as of 3/30/2000.6:00.
> You can do that sort of thing with '-t' option to p_tupdate.
> This is the time I know kernel compilation was working fine
> when PAGEBUF_META was off.
>
> However, today I tried 3/30/2000 with PAGEBUF_META turned on,
> and the corruption showed up. With the config turned off as
> I tried about 4 weeks back, the corruption went away.
>
> Switching to my tot workarea with page-cleaner stuff compiled in,
> but run-time turned off, and with PAGEBUF_META off,
> I can now compile the kernel again.
>
> So there is a strong correlation between PAGEBUF_META and corruption.
> My own hunch is that either the 'block no' calculation in
> pagebuf is wrong, or the 'rele/hold problem' is doing an I/O
> when it shouldn't be.
>
> Suggestion #1 is to ensure that the kernel compiles cleanly
> in several tries with a tot kernel and PAGEBUF_META off.
>
> I'm likely going back to working on delalloc tomorrow,
>
> ananth.
Hmm, very odd.
If the block number were wrong on a metadata write, so that
metadata was possibly being dropped into file data, the bad data in
the file should look like metadata... which it doesn't seem to.
A hold/rele problem shouldn't be dropping data into a file either;
it might corrupt the metadata side of the file system, maybe?
Hmm, very strange...
BTW, a 2-thread version of doio ran all night!
I also ran 5 compiles at the same time on a different file system
(same system), and they all ran to completion.
I also haven't seen any pb_hold count too low messages.
-Russell