Andrew Morton <akpm@xxxxxxxxx> writes:
> Zlatko Calusic wrote:
>>
>> Oh, yes, you're completely right, of course. It's the pinned part of
>> the page cache that puts a lot of pressure on memory. A whole lot of
>> inactive page cache pages (~700 MB in my case) is not really a good
>> indicator of recyclable memory when (probably) a big part of it is
>> pinned and can't be thrown out. So it is the VM, after all.
>
> Does xfs_dump actually pin 700 megs of memory??
I don't know, it's just a possibility.
>
> If someone could provide a detailed description of what xfs_dump
> is actually doing internally, that may help me shed some light.
> xfs_dump is actually using kernel support for coherency reasons,
> is that not so? How does it work?
>
> Does the machine have highmem?
No, LOWMEM only. I have attached the relevant files at the end of this
message.
> What was the backtrace into the page allocation failure?
Chris, unfortunately I'm seeing exactly the same problems here with the
code pulled from CVS today, so the bug is not really solved. I pressed
Alt-SysRq-T after xfsdump got stuck and got this:
xfsdump D C0440380 16 609 486 (NOTLB)
Call Trace:
[blk_run_queues+135/152] blk_run_queues+0x87/0x98
[io_schedule+41/56] io_schedule+0x29/0x38
[__lock_page+141/176] __lock_page+0x8d/0xb0
[autoremove_wake_function+0/60] autoremove_wake_function+0x0/0x3c
[autoremove_wake_function+0/60] autoremove_wake_function+0x0/0x3c
[find_lock_page+84/180] find_lock_page+0x54/0xb4
[find_or_create_page+24/168] find_or_create_page+0x18/0xa8
[_pagebuf_lookup_pages+381/916] _pagebuf_lookup_pages+0x17d/0x394
[pagebuf_get+138/256] pagebuf_get+0x8a/0x100
[xfs_trans_read_buf+53/808] xfs_trans_read_buf+0x35/0x328
[xfs_itobp+241/420] xfs_itobp+0xf1/0x1a4
[pagebuf_rele+169/192] pagebuf_rele+0xa9/0xc0
[xfs_bulkstat+2123/2824] xfs_bulkstat+0x84b/0xb08
[huft_build+1349/1852] huft_build+0x545/0x73c
[xfs_ioc_bulkstat+271/352] xfs_ioc_bulkstat+0x10f/0x160
[xfs_bulkstat_one+0/1196] xfs_bulkstat_one+0x0/0x4ac
[xfs_ioctl+589/1312] xfs_ioctl+0x24d/0x520
[huft_build+1349/1852] huft_build+0x545/0x73c
[xfs_inactive_free_eofblocks+176/548] xfs_inactive_free_eofblocks+0xb0/0x224
[xfs_release+133/196] xfs_release+0x85/0xc4
[dput+27/348] dput+0x1b/0x15c
[__fput+197/236] __fput+0xc5/0xec
[linvfs_ioctl+34/76] linvfs_ioctl+0x22/0x4c
[huft_build+1349/1852] huft_build+0x545/0x73c
[huft_build+1349/1852] huft_build+0x545/0x73c
[sys_ioctl+559/627] sys_ioctl+0x22f/0x273
[huft_build+1349/1852] huft_build+0x545/0x73c
[syscall_call+7/11] syscall_call+0x7/0xb
[huft_build+1349/1852] huft_build+0x545/0x73c
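For reference, the same per-task trace can also be requested without the
keyboard. This is just a minimal sketch, assuming CONFIG_MAGIC_SYSRQ is
enabled and /proc/sysrq-trigger is available (the dump lands in the
kernel log, not on stdout):

    # Ask the kernel for the SysRq-T "show task states" dump,
    # equivalent to pressing Alt-SysRq-T; read it back with dmesg.
    with open("/proc/sysrq-trigger", "w") as f:
        f.write("t")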
I have kdb compiled into the kernel, but I forgot how to drop into it.
Pressing the Pause key, as the documentation says, doesn't work.
Anyway, I think the problem is not connected to inodes/dentries or
anything related to the slab allocator; slabs are recycled quite fast
after xfsdump starts dumping to a file. Also, there's really a lot of
memory in the inactive state, so it's strange. To back all that up, I'm
sending the output of a few interesting files, taken when xfsdump
stopped doing its job (and the kernel spat allocation errors).
/proc/meminfo:
MemTotal: 772788 kB
MemFree: 3984 kB
MemShared: 0 kB
Buffers: 39600 kB
Cached: 585368 kB
SwapCached: 0 kB
Active: 167644 kB
Inactive: 545868 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 772788 kB
LowFree: 3984 kB
SwapTotal: 2097136 kB
SwapFree: 2097136 kB
Dirty: 488 kB
Writeback: 0 kB
Mapped: 129308 kB
Slab: 45808 kB
Committed_AS: 139228 kB
PageTables: 944 kB
ReverseMaps: 49432
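To put the "lots of inactive memory" point into numbers, here is a quick
back-of-the-envelope check (just a sketch, with the kB values hard-coded
from the snapshot above):

    # Rough arithmetic on the /proc/meminfo snapshot above (units: kB).
    meminfo = {"MemTotal": 772788, "LowFree": 3984, "Buffers": 39600,
               "Cached": 585368, "Inactive": 545868, "Slab": 45808}

    pagecache = meminfo["Buffers"] + meminfo["Cached"]
    print(pagecache, meminfo["Inactive"], meminfo["LowFree"])
    # ~610 MB of page cache, ~533 MB of it on the inactive list, yet only
    # ~4 MB actually free in the low zone when the allocations start failing.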
/proc/buddyinfo:
Node 0, zone DMA 42 4 1 1 1 0 1 0 0 0 0
Node 0, zone Normal 0 1 16 67 1 1 1 1 0 0 0
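Each column in /proc/buddyinfo is the number of free blocks of a given
order (an order-N block is 2^N pages, 4 kB pages on this box). Converting
the Normal zone counts above gives roughly 3.3 MB free, most of it in
blocks of 32 kB or smaller, with a single free block at each of orders
4-7 and nothing at order 8 or above, so multi-page allocations have very
little to grab. A small sketch of that conversion (counts copied from the
line above):

    # Convert buddyinfo free-block counts into free kB for one zone.
    PAGE_KB = 4  # i386, 4 kB pages

    def free_kb(counts):
        return sum(c * (2 ** order) * PAGE_KB for order, c in enumerate(counts))

    normal = [0, 1, 16, 67, 1, 1, 1, 1, 0, 0, 0]  # Normal zone, orders 0..10
    print(free_kb(normal))                         # ~3368 kB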
/proc/slabinfo:
slabinfo - version: 1.2
unix_sock 132 144 416 16 16 1 : 120 60
ip_conntrack 0 0 320 0 0 1 : 120 60
tcp_tw_bucket 0 0 96 0 0 1 : 248 124
tcp_bind_bucket 16 113 32 1 1 1 : 248 124
tcp_open_request 0 0 64 0 0 1 : 248 124
inet_peer_cache 0 0 64 0 0 1 : 248 124
secpath_cache 0 0 32 0 0 1 : 248 124
flow_cache 0 0 64 0 0 1 : 248 124
xfrm4_dst_cache 0 0 192 0 0 1 : 248 124
ip_fib_hash 6 113 32 1 1 1 : 248 124
ip_dst_cache 1 20 192 1 1 1 : 248 124
arp_cache 1 30 128 1 1 1 : 248 124
raw4_sock 0 0 448 0 0 1 : 120 60
udp_sock 1 9 448 1 1 1 : 120 60
tcp_sock 24 28 896 7 7 1 : 120 60
sgpool-MAX_PHYS_SEGMENTS 32 32 2048 16 16 1 : 54 27
sgpool-64 32 32 1024 8 8 1 : 120 60
sgpool-32 32 32 512 4 4 1 : 120 60
sgpool-16 32 45 256 3 3 1 : 248 124
sgpool-8 32 60 128 2 2 1 : 248 124
xfs_chashlist 3526 8080 16 40 40 1 : 248 124
xfs_ili 702 2604 140 93 93 1 : 248 124
xfs_ifork 0 0 56 0 0 1 : 248 124
xfs_efi_item 0 15 260 0 1 1 : 120 60
xfs_efd_item 0 15 260 0 1 1 : 120 60
xfs_buf_item 26 26 148 1 1 1 : 248 124
xfs_dabuf 10 202 16 1 1 1 : 248 124
xfs_da_state 0 11 336 0 1 1 : 120 60
xfs_trans 26 26 592 2 2 2 : 120 60
xfs_inode 26206 29640 392 2964 2964 1 : 120 60
xfs_btree_cur 18 29 132 1 1 1 : 248 124
xfs_bmap_free_item 0 253 12 0 1 1 : 248 124
page_buf_t 118 120 256 8 8 1 : 248 124
linvfs_icache 17157 26719 352 2429 2429 1 : 120 60
ntfs_big_inode_cache 669 2800 480 350 350 1 : 120 60
ntfs_inode_cache 2 20 192 1 1 1 : 248 124
ntfs_name_cache 0 0 512 0 0 1 : 120 60
ntfs_attr_ctx_cache 0 0 32 0 0 1 : 248 124
isofs_inode_cache 0 0 320 0 0 1 : 120 60
fat_inode_cache 18 165 352 15 15 1 : 120 60
eventpoll 0 0 96 0 0 1 : 248 124
kioctx 0 0 192 0 0 1 : 248 124
kiocb 0 0 160 0 0 1 : 248 124
dnotify_cache 16 169 20 1 1 1 : 248 124
file_lock_cache 12 40 96 1 1 1 : 248 124
fasync_cache 1 202 16 1 1 1 : 248 124
shmem_inode_cache 22 27 416 3 3 1 : 120 60
uid_cache 4 113 32 1 1 1 : 248 124
deadline_drq 2304 2373 32 21 21 1 : 248 124
blkdev_requests 2048 2064 160 86 86 1 : 248 124
biovec-BIO_MAX_PAGES 256 260 3072 52 52 4 : 54 27
biovec-128 256 260 1536 52 52 2 : 54 27
biovec-64 256 260 768 52 52 1 : 120 60
biovec-16 268 280 192 14 14 1 : 248 124
biovec-4 280 295 64 5 5 1 : 248 124
biovec-1 274 606 16 3 3 1 : 248 124
bio 295 472 64 8 8 1 : 248 124
sock_inode_cache 170 198 352 18 18 1 : 120 60
skbuff_head_cache 161 408 160 17 17 1 : 248 124
sock 4 11 352 1 1 1 : 120 60
proc_inode_cache 221 828 320 69 69 1 : 120 60
sigqueue 142 145 132 5 5 1 : 248 124
radix_tree_node 10702 13702 288 1054 1054 1 : 120 60
cdev_cache 16 118 64 2 2 1 : 248 124
bdev_cache 14 40 96 1 1 1 : 248 124
mnt_cache 24 59 64 1 1 1 : 248 124
inode_cache 420 420 320 35 35 1 : 120 60
dentry_cache 16274 40740 128 1358 1358 1 : 248 124
filp 1462 1470 128 49 49 1 : 248 124
names_cache 4 4 4096 4 4 1 : 54 27
buffer_head 66334 85956 48 1102 1102 1 : 248 124
mm_struct 69 70 384 7 7 1 : 120 60
vm_area_struct 2980 3080 96 77 77 1 : 248 124
fs_cache 68 177 64 3 3 1 : 248 124
files_cache 68 72 416 8 8 1 : 120 60
signal_act 66 66 1344 22 22 1 : 54 27
task_struct 110 110 1568 22 22 2 : 54 27
pte_chain 8607 14125 32 125 125 1 : 248 124
size-131072(DMA) 0 0 131072 0 0 32 : 8 4
size-131072 0 0 131072 0 0 32 : 8 4
size-65536(DMA) 0 0 65536 0 0 16 : 8 4
size-65536 0 0 65536 0 0 16 : 8 4
size-32768(DMA) 0 0 32768 0 0 8 : 8 4
size-32768 16 16 32768 16 16 8 : 8 4
size-16384(DMA) 0 0 16384 0 0 4 : 8 4
size-16384 2 4 16384 2 4 4 : 8 4
size-8192(DMA) 0 0 8192 0 0 2 : 8 4
size-8192 17 17 8192 17 17 2 : 8 4
size-4096(DMA) 0 0 4096 0 0 1 : 54 27
size-4096 370 378 4096 370 378 1 : 54 27
size-2048(DMA) 0 0 2048 0 0 1 : 54 27
size-2048 42 60 2048 22 30 1 : 54 27
size-1024(DMA) 0 0 1024 0 0 1 : 120 60
size-1024 252 252 1024 63 63 1 : 120 60
size-512(DMA) 0 0 512 0 0 1 : 120 60
size-512 204 264 512 32 33 1 : 120 60
size-256(DMA) 0 0 256 0 0 1 : 248 124
size-256 168 180 256 12 12 1 : 248 124
size-192(DMA) 0 0 192 0 0 1 : 248 124
size-192 892 1000 192 50 50 1 : 248 124
size-128(DMA) 0 0 128 0 0 1 : 248 124
size-128 1131 1470 128 49 49 1 : 248 124
size-96(DMA) 0 0 96 0 0 1 : 248 124
size-96 2492 3160 96 79 79 1 : 248 124
size-64(DMA) 0 0 64 0 0 1 : 248 124
size-64 2794 4720 64 80 80 1 : 248 124
size-32(DMA) 0 0 32 0 0 1 : 248 124
size-32 1499 1921 32 17 17 1 : 248 124
kmem_cache 128 128 120 4 4 1 : 248 124
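To see which caches the ~45 MB of Slab actually sits in, the 1.2-format
columns above can be summed: per cache, memory is roughly num_slabs *
pages_per_slab * 4 kB. A small sketch that does this over a saved copy of
the output (the filename slabinfo.txt is just a placeholder):

    # Sum slab memory from a saved copy of the 1.2-format /proc/slabinfo.
    # Columns: name, active_objs, num_objs, objsize,
    #          active_slabs, num_slabs, pages_per_slab, ':', limit, batchcount
    PAGE_KB = 4

    def slab_kb(fields):
        return int(fields[5]) * int(fields[6]) * PAGE_KB

    with open("slabinfo.txt") as f:  # placeholder: the dump shown above
        rows = [l.split() for l in f
                if ":" in l and not l.startswith("slabinfo")]
    top = sorted(((slab_kb(r), r[0]) for r in rows), reverse=True)[:5]
    for kb, name in top:
        print(name, kb, "kB")
    # Top caches here: xfs_inode (~11.6 MB), linvfs_icache (~9.5 MB),
    # dentry_cache (~5.3 MB), buffer_head (~4.3 MB), radix_tree_node (~4.1 MB).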
--
Zlatko