On Thu, 2002-01-24 at 13:41, Austin Gonyou wrote:
> I'm having some pretty bad issues with memory being put into cache, and
> then not being released. This is causing me to have lots of networking
> problems because my eepro100 keeps having issues and reporting "no
> resources". When I get on the system locally, I can see that there is a
> large amount of swap in use, and almost all of the 8GB of ram I have is
> in cache.
>
> My DBAs have been setting up an oracle instance on that server, but
> they're running into problems because cached ram isn't being released.
>
> I seem to recall something about this being fixed, but I don't remember
> what kernel version, etc. Could someone lend a suggestion or two? I'd
> like to use 2.4.17, but I'm kind of put off by all the data corruption
> issues I've seen on the list lately. Is 2.4.17 ok to use with the
> patches on oss.sgi.com/projects/xfs/downloads/patches?
The patches were updated yesterday; hopefully we have shot all the
problems which emerged with 2.4.17. They were not actually new
problems, but something in 2.4.17 seemed to make them happen more often.
Off the top of my head, the major things which have been fixed are:
o a page leak which occurred under heavy memory load. Someone who had
reported this problem was afterwards able to run the dbench load which
originally triggered it for several days non-stop. The leak has been
around since the new VM in 2.4.10.
o overwriting the start of a partition with file data. This has also been
there forever. I think it used to oops on shutdown; something
changed to remove the oops, and we wrote data instead. A few people
who lost filesystems in the last year or so were probably seeing a
variant of this.
o corruption of the emacs binary during the build process. This was a
memory-mapped I/O issue under heavy memory pressure. It has also
been present for a long time; it probably just took the right
circumstances for someone to find it.
I cannot speak for people running Athlon processors with DRI in the
kernel; it looks like there is definitely a memory corruption issue on
that configuration, which may have eaten a couple of filesystems.
We cannot really offer backports of individual fixes to specific kernel
versions; it is just too much work. Hopefully 2.4.17 is beaten into shape
now and can be used without fear.
Steve
> --
> Austin Gonyou
> Systems Architect, CCNA
> Coremetrics, Inc.
> Phone: 512-698-7250
> email: austin@xxxxxxxxxxxxxxx
>
> "It is the part of a good shepherd to shear his flock, not to skin it."
> Latin Proverb
--
Steve Lord voice: +1-651-683-3511
Principal Engineer, Filesystem Software email: lord@xxxxxxx
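
For anyone wanting to put numbers on the symptom described in the
original report (most RAM in cache, a lot of swap in use), here is a
minimal sketch. It is not part of the XFS patches and not something from
this thread; it assumes a Linux /proc/meminfo that exposes the MemTotal,
Cached, SwapTotal and SwapFree fields in kB, and the function names are
made up for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> value.

    Only lines of the form "Name:   12345 kB" (or a bare number) are kept;
    anything else, such as a header row, is skipped.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def cache_summary(fields):
    """Summarise page cache and swap usage from parsed meminfo fields."""
    total = fields["MemTotal"]
    cached = fields.get("Cached", 0)
    swap_used = fields.get("SwapTotal", 0) - fields.get("SwapFree", 0)
    return {
        "cached_kb": cached,
        "cached_pct": 100.0 * cached / total,
        "swap_used_kb": swap_used,
    }


def read_summary(path="/proc/meminfo"):
    """Read the given meminfo file (assumed path) and summarise it."""
    with open(path) as f:
        return cache_summary(parse_meminfo(f.read()))
```

Calling read_summary() on the affected box and printing the result a few
times during the load shows whether the cached share keeps climbing
while swap use grows. Note that a large cache figure on its own is
normal; it only points at a problem when the memory is not given back
under pressure.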