----- Original Message -----
> From: "Joe Landman" <joe.landman@xxxxxxxxx>
> Ok. I've had power supplies take down memory in the past. You might be
> hitting a bad memory cell courtesy of the PS.
Possibly, though see below.
> >> Do you have EDAC (or mcelog) on? Any errors from this?
> >
> > I don't have mcelog on, and no, the memory isn't registered, but a
> > 4-pass run of Memtest+ came up clean, so I'm speculating that the
>
> Not registered (which is just buffered), but ECC. ECC does a parity
> computation on some number of bits, and provides you a rough "good/bad"
> binary state of a particular area of memory. If the parity bits stored
> don't match what is computed on read, then odds are that something is
> wrong. It's not foolproof, but it's a good mechanism to catch potential
> errors.
Sure. In my experience, all ECC is registered/buffered, and no non-ECC
is, so I use it as shorthand. No possible chance this northbridge would
do ECC, no. :-)
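
For anyone following along, the store-and-compare idea Joe describes can be sketched roughly as below. This is only an illustration, assuming a single even-parity bit over one word; real ECC memory typically uses a SECDED code (commonly 8 check bits per 64 data bits) that can also correct single-bit errors. The names parity, store, and check are made up for this sketch.

```python
def parity(word: int) -> int:
    """Return the even-parity bit (0 or 1) of a non-negative integer."""
    return bin(word).count("1") % 2


def store(word: int) -> tuple[int, int]:
    """Simulate a write: keep the data word together with its parity bit."""
    return word, parity(word)


def check(word: int, stored_parity: int) -> bool:
    """Simulate a read: recompute parity and compare with the stored bit."""
    return parity(word) == stored_parity


data, p = store(0xDEADBEEF)

one_flip = data ^ (1 << 7)   # a single bit flipped in "memory"
two_flips = data ^ 0b11      # two bits flipped in "memory"

print(check(data, p))        # clean read, parity matches
print(check(one_flip, p))    # single flip changes the parity, so it is caught
print(check(two_flips, p))   # double flip cancels out and goes unnoticed
```

The double-flip case is the "not foolproof" part: a lone parity bit only notices an odd number of flipped bits, which is one reason real ECC uses more check bits per word.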
> We've had cases where Memtest(*) reported everything fine, yet I was
> able to generate ECC errors in a few minutes by running a memory
> intensive app. Memtest does do some hardware exercise, but it's not
> usually hitting memory the way apps do. That difference can be
> significant. This is in part why the day job stopped using memtest for
> testing a number of years ago. We now run heavy duty electronic
> structure codes, and pi/e/... computations for burn in.
Fair point. I did also run the non-+ version of Memtest, which I
understand uses a different algorithm, and a couple other things
I found on the UBCD, so I'm *relatively* confident I don't have a
running RAM problem, though as you say, not 100%.
> > *continuing* problem isn't hardware; I'm pretty sure it was just the
> > failing 12V rail on the dying PS. I just have to clean up after it
> > enough to get *one* of these 2 drives cleaned off, then I can make a
> > new FS, and play musical files.
>
> Ahhh ...
>
> I was running a Plex server on an old machine for a while. I had to
> shift over to a beefier box with ECC RAM and more CPUs. Right now my
> Plex server has 8 CPUs, 24 GB RAM, and about 1TB of disk (old). Once
> you start doing recoding on the fly (multi-resolution output), you
> need the RAM and processor power.
>
> >
> > Or, I may just go grab a 3TB external after all. :-)
>
> If you do that, and you still hit the error, chances are you might
> need to swap out your MB and CPU/RAM to something newer (not to mention the
> PS). I'd recommend ECC-based systems if at all possible. XFS can and
> will get very unhappy if bits are flipped on its data structures while
> you are making changes to the file system.
As it happens, Dave helped me clean up a mess 4 or 5 years ago, where
a *wire opened up* on the PATA cable, and all my data structures had
a missing bit. Ghod was that a mess.
We did end up getting the drive. So assuming I can reliably read the
big drive (I have a 3T, a 2T, and a 1T all with different problems),
I'm going to move all the files from it to the new 3T I just bought,
and then play musical files down the chain one at a time.
Thank ghod the new season hasn't started yet. ;-)
Thanks for the help, Joe.
Oh, and the script that Stan was so worried about? It's all
rm and mv commands. 5859 of them.
Cheers,
-- jra
--
Jay R. Ashworth Baylink jra@xxxxxxxxxxx
Designer The Things I Think RFC 2100
Ashworth & Associates http://baylink.pitas.com 2000 Land Rover DII
St Petersburg FL USA #natog +1 727 647 1274