Since switching to 2.4.19 and the XFS patch that was available shortly
thereafter, I've noticed a few things.
1) When I unmount an XFS file system, there's a lot of disk activity --
an unmount of a smallish partition, say 5-10G, on a 15kRPM SCSI/80Mb
interface can take > 30 seconds. Now this is *after* I've done a 'sync'
and/or after the machine has been sitting idle for a fair amount of
time, so there should be virtually no dirty buffers left to flush
(I would hope).
2) I'm running low on memory. I'm not used to running low on memory on
a 1G-mem machine that isn't even running a desktop. 'ps' wouldn't tell
me much and neither would vmstat, but /proc/slabinfo was instructive,
specifically:
xfs_acl                 0      0  304     0     0  1 : 124  62
xfs_chashlist       14858  17170   16    85    85  1 : 252 126
xfs_ili              7084   7084  136   253   253  1 : 252 126
xfs_ifork             605    670   56    10    10  1 : 252 126
xfs_efi_item           15     15  260     1     1  1 : 124  62
xfs_efd_item           15     15  260     1     1  1 : 124  62
xfs_buf_item          156    156  148     6     6  1 : 252 126
xfs_dabuf             202    202   16     1     1  1 : 252 126
xfs_da_state            0      0  340     0     0  1 : 124  62
xfs_trans             161    161  584    23    23  1 : 124  62
xfs_inode          431990 431990  400 43199 43199  1 : 124  62
xfs_btree_cur          58     58  132     2     2  1 : 252 126
xfs_bmap_free_item    252    253   12     1     1  1 : 252 126
page_buf_t            234    360  192    16    18  1 : 252 126
linvfs_icache      431886 431886  512 61698 61698  1 : 124  62
Now I wasn't able to find much on the fields of /proc/slabinfo and was
too lazy to read the source code to figure out the entries, but the
first column looks an awful lot like the amount of memory that is
'missing'. The figures for xfs_inode and linvfs_icache seem darn close
to identical (is that a coincidence?), and added together they would
also take up 80+% of my physical memory.
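For anyone who wants to do the same arithmetic without digging through
the slab code, here's a rough sketch of a script that tallies up
/proc/slabinfo. I'm assuming the 2.4-style column layout (name, active
objects, total objects, object size, active slabs, total slabs, pages
per slab) and 4K pages, so take it with a grain of salt:

#!/usr/bin/env python
# Rough per-cache memory tally from /proc/slabinfo. Assumes the
# 2.4-style layout: name active_objs total_objs objsize active_slabs
# total_slabs pages_per_slab [: limit batchcount on SMP].

PAGE_SIZE = 4096  # assumption: 4K pages on i386

def parse_slabinfo(path="/proc/slabinfo"):
    caches = []
    for line in open(path):
        if line.startswith("slabinfo") or line.startswith("#"):
            continue  # skip the version/header line
        fields = line.split()
        if len(fields) < 7:
            continue
        name = fields[0]
        total_objs = int(fields[2])
        objsize = int(fields[3])
        total_slabs = int(fields[5])
        pages_per_slab = int(fields[6])
        obj_bytes = total_objs * objsize                       # space in objects
        page_bytes = total_slabs * pages_per_slab * PAGE_SIZE  # pages pinned by the cache
        caches.append((page_bytes, obj_bytes, name))
    return caches

if __name__ == "__main__":
    caches = parse_slabinfo()
    caches.sort()
    caches.reverse()
    total = 0
    for page_bytes, obj_bytes, name in caches[:10]:
        total = total + page_bytes
        print("%-20s %7.1f MB in objects, %7.1f MB in pages" %
              (name, obj_bytes / 1048576.0, page_bytes / 1048576.0))
    print("top 10 caches pin roughly %.1f MB" % (total / 1048576.0))

The object-size column doesn't count slab bookkeeping overhead, so the
"pages" figure is probably the more honest number for how much memory a
cache is actually pinning.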
If it were just a cache that got released when there was a need for
memory, that would be one thing, but I was forced to reboot just
yesterday (after about 12 days of uptime) and I'm already using 9M of
swap space. When I looked at this a few days before the crash, I was
using, oh, maybe 30-40M of swap. Normally I don't use any swap.
Today I can see the Linux file cache using around 400M -- that wasn't
the case a few days ago, when none of the memory figures amounted to a
hill of beans.
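In case it helps anyone else spot the same pattern, something like the
following dropped into cron would log these figures over time instead
of relying on my memory of what free said last week (the field names
are the usual /proc/meminfo ones; the log path is just an example):

#!/usr/bin/env python
# Append a one-line snapshot of the interesting /proc/meminfo figures
# so a slow leak shows up as a trend rather than a surprise at 2am.
import time

WANTED = ("MemTotal", "MemFree", "Buffers", "Cached",
          "SwapTotal", "SwapFree")

def snapshot():
    info = {}
    for line in open("/proc/meminfo"):
        parts = line.split(":")
        if len(parts) == 2 and parts[0] in WANTED:
            info[parts[0]] = parts[1].split()[0] + "kB"  # e.g. "1029876kB"
    return info

if __name__ == "__main__":
    info = snapshot()
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    fields = " ".join(["%s=%s" % (key, info.get(key, "?")) for key in WANTED])
    # /var/log/meminfo.log is just an example destination
    log = open("/var/log/meminfo.log", "a")
    log.write(stamp + " " + fields + "\n")
    log.close()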
Early Saturday morning one of the XFS partitions stopped responding. I
tried unmounting it -- nada (the process hung). I tried 'sync' (also
hung). Went for 'reboot -f' (that was dumb, it also does a sync -- that
hung). Went for 'reboot -n' (still not what I wanted at 2 in the
morning)... that brought the system to a halt, and then it died trying
to unmount the file systems -- it couldn't unmount them. It just hung
there all night (I went to bed after issuing the reboot, figuring I'd
wake up to a happy system again).
Hit reset and things came back up, seemingly OK.
So I'm wondering -- it almost feels like there could be a memory leak
somewhere? Maybe I'm reporting this and it's already been fixed?
Anyway -- I just wanted to report the 'symptoms' so that if others see
something similar, a pattern might develop. It's not like it's an
emergency -- maybe I can just tell it to auto-reboot once a week...
it's more of an annoyance than anything.
-linda