All,
In my testing, I can produce various problems with XFS and RAID1. I'll
reproduce three scenarios below.
First, my test configuration is: Intel Celeron @ 566 MHz, 64 MB RAM, 4
45 GB IDE drives, two of which form a RAID1 (mirror) of about 40 GB (I
don't believe partition size has any bearing on this). While the RAID
array is still resyncing, I build an XFS fs on it, mount it, and run
Bonnie (1.2) "./Bonnie -s 128 -d /share", in an infinite loop. After
some time (usually 5 to 20 minutes), an error results. I posted a patch
for one of those this AM. I don't know if the resync is necessary, but
it sure stresses things!
First, here is an edited typescript file, showing a deadlock situation:
================================================================================
Script started on Wed Feb 21 07:50:08 2001
[danny@dsc_proto_1 danny]$ ps -axlww
F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME
COMMAND
100 0 1 0 8 0 1120 68 do_sel S ? 0:03
init [3]
040 0 2 1 9 0 0 0 contex SW ? 0:00
[keventd]
040 0 3 1 9 0 0 0 apm_ma SW ? 0:00
[kapm-idled]
040 0 4 1 9 0 0 0 raid1_ DW ? 0:06
[kswapd]
040 0 5 1 9 0 0 0 krecla SW ? 0:00
[kreclaimd]
040 0 6 1 9 0 0 0 raid1_ DW ? 0:09
[bdflush]
040 0 7 1 9 0 0 0 kupdat SW ? 0:00
[kupdate]
040 0 8 1 -1 -20 0 0 md_thr SW< ? 0:00
[mdrecoveryd]
040 0 9 1 -1 -20 0 0 md_thr SW< ? 0:00
[raid1d]
140 0 510 1 8 0 2444 168 do_sel S ? 0:00
smbd -D
140 0 519 1 9 0 2024 656 do_sel S ? 0:00
nmbd -D
040 0 645 1 -1 -20 0 0 md_thr SW< tty1 0:11
[raid1d]
040 0 646 1 19 19 0 0 md_thr SWN tty1 0:09
[raid1syncd]
040 0 703 1 9 0 0 0 pagebu SW tty1 0:00
[pagebuf_daemon]
040 0 704 1 9 0 0 0 raid1_ DW tty1 0:28
[page_daemon]
000 0 723 623 9 0 1648 84 wait4 S tty1 0:00
/bin/sh ./torture
100 0 729 616 10 0 1740 552 read_c S tty2 0:00
-bash
000 0 1238 723 9 0 1340 328 pipe_w S tty1 1:20
Bonnie -s 256 -d /share -y
140 500 1264 510 9 0 3824 1148 raid1_ D ? 0:01
smbd -D
040 0 1265 1238 9 0 1404 60 lock_p D tty1 0:00
Bonnie -s 256 -d /share -y
040 0 1266 1238 9 0 1404 60 mracce D tty1 0:00
Bonnie -s 256 -d /share -y
040 0 1267 1238 9 0 1404 68 mracce D tty1 0:00
Bonnie -s 256 -d /share -y
000 500 1757 1737 14 0 2748 1100 - R pts/0 0:00 ps
-axlww
[danny@dsc_proto_1 danny]$ grep -A2 __alloc_page /var/log/messages.1 |
tail -3
Feb 20 12:51:38 dsc_proto_1 kernel: __alloc_pages: 0-order allocation
failed.
Feb 20 12:51:41 dsc_proto_1 last message repeated 21 times
Feb 20 13:51:52 dsc_proto_1 kernel: md: md1: sync done.
[danny@dsc_proto_1 danny]$ exit
Script done on Wed Feb 21 07:55:22 2001
=============================================================================
Note the Bonnie processes: #1238 is the parent. #s 1265, 1266, and 1267
are the seekers it creates as it's last measurement. #1265 is in
"lock_page", while the others are in "mraccessf". Just before the hang,
the kernel spouted the "0-order alocation failed." messages. Last week,
I changed the __alloc_pages function to panic() after that printk, and
traced the stack back to raid1_alloc_rlbh() in raid1.c. This function
does the Right Thing, checking the pointer for NULL, and if so, calls
"wait_event". Well, this causes a schedule(), while waiting for mem to
be freed, but obviously something was locked when that happened. I'd
suspect something in the XFS code isn't expecting down-stream drivers to
call schedule().
Item next: in page_buf.c, line 1369, the BUG() macro is called if psync
is NULL after the kmalloc() call. I've hit this several times with the
above setup.
Item final: I've seen the messages:
cluster_write: can't get pb (page_buf_io.c:1839)
cluster_write: pb_lookup_pages failed! (page_buf_io.c:1844)
and the process hung. The only out was a reboot, and that filesystem
refused to be unmounted. ps(1) showed the process status as 'D',
uninterruptable sleep :-(.
I'd love to help with this, but I'm at my limit of what I know. For
example, with the BUG() at page_buf.c:1369, can you set up for a
wait_event()? Can you simply return with a status of try_again? I
don't know what might need to be unrolled by this point, so am at a loss
of how to proceed.
Help!
Thanks in advance!
--
"Men occasionally stumble over the truth, but most of them pick
themselves up and hurry off as if nothing had happened."
-- Winston Churchill
Danny
|