| To: | linux-xfs@xxxxxxxxxxx |
|---|---|
| Subject: | Need some help with cause of Oops in XFS 1.2 |
| From: | Steven Dake <sdake@xxxxxxxxxx> |
| Date: | Tue, 25 Mar 2003 11:23:45 -0700 |
| Sender: | linux-xfs-bounce@xxxxxxxxxxx |
| User-agent: | Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021130 |
XFS Developers,

Ok, here is what I have done:

I have a tree that already had XFS 1.1 (with a bunch of other stuff) included. I hand-applied the patch that takes XFS 1.1 to XFS 1.2, and everything seems to work fine, except that when I run bonnie++ under load (during the "Writing intelligently" operation), I receive the following Oops:

Mar 24 15:20:13 192 kernel: invalid operand: 0000
Mar 24 15:20:13 192 kernel: CPU: 0
Mar 24 15:20:13 192 kernel: EIP: 0010:[<c0128331>] Not tainted
Mar 24 15:20:13 192 kernel: EFLAGS: 00010286
Mar 24 15:20:13 192 kernel: eax: 00000037 ebx: 000000f8 ecx: 00000008 edx: 00000000
Mar 24 15:20:13 192 kernel: esi: c30000bc edi: 00000000 ebp: c2a517c0 esp: f7465e80
Mar 24 15:20:13 192 kernel: ds: 0018 es: 0018 ss: 0018
Mar 24 15:20:13 192 kernel: Process bonnie++ (pid: 114, stackpage=f7465000)
Mar 24 15:20:13 192 kernel: Stack: c0327060 000000f8 00018541 00000000 f748ee80 c012abf6 f7465ec0 00001000
Mar 24 15:20:13 192 kernel: 00000000 00001000 00001000 00001000 1650f000 00000000 f740e320 f740e3d4
Mar 24 15:20:13 192 kernel: 00000000 3e7f849d 000e6b32 3e7f849d 3852bb50 0640d230 f7465f64 1650e000
Mar 24 15:20:13 192 kernel: Call Trace: [<c012abf6>] [<c024ad64>] [<c02467a6>] [<c0135836>] [<c0116f9b>]
Mar 24 15:20:13 192 kernel: [<c0106d03>]
Mar 24 15:20:13 192 kernel:
Mar 24 15:20:13 192 kernel: Code: 0f 0b 83 c4 0c 8d 46 04 39 46 04 74 12 5b 89 f0 31 c9 ba 03

Mar 24 15:20:13 192 kernel: invalid operand: 0000
Mar 24 15:20:13 192 kernel: CPU: 0
Mar 24 15:20:13 192 kernel: EIP: 0010:[<c0128331>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
Mar 24 15:20:13 192 kernel: EFLAGS: 00010286
Mar 24 15:20:13 192 kernel: eax: 00000037 ebx: 000000f8 ecx: 00000008 edx: 00000000
Mar 24 15:20:13 192 kernel: esi: c30000bc edi: 00000000 ebp: c2a517c0 esp: f7465e80
Mar 24 15:20:13 192 kernel: ds: 0018 es: 0018 ss: 0018
Mar 24 15:20:13 192 kernel: Process bonnie++ (pid: 114, stackpage=f7465000)
Mar 24 15:20:13 192 kernel: Stack: c0327060 000000f8 00018541 00000000 f748ee80 c012abf6 f7465ec0 00001000
Mar 24 15:20:13 192 kernel: 00000000 00001000 00001000 00001000 1650f000 00000000 f740e320 f740e3d4
Mar 24 15:20:13 192 kernel: 00000000 3e7f849d 000e6b32 3e7f849d 3852bb50 0640d230 f7465f64 1650e000
Mar 24 15:20:13 192 kernel: Call Trace: [<c012abf6>] [<c024ad64>] [<c02467a6>] [<c0135836>] [<c0116f9b>]
Mar 24 15:20:13 192 kernel: [<c0106d03>]
Mar 24 15:20:13 192 kernel: Code: 0f 0b 83 c4 0c 8d 46 04 39 46 04 74 12 5b 89 f0 31 c9 ba 03

>>EIP; c0128331 <unlock_page+61/90> <=====
Trace; c012abf6 <generic_file_write_nolock+556/720>
Trace; c024ad64 <xfs_write+384/580>
Trace; c02467a6 <linvfs_write+e6/120>
Trace; c0135836 <sys_write+96/f0>
Trace; c0116f9b <sys_gettimeofday+1b/90>
Trace; c0106d03 <system_call+33/38>
Code; c0128331 <unlock_page+61/90>
00000000 <_EIP>:
Code; c0128331 <unlock_page+61/90> <=====
0: 0f 0b ud2a <=====
Code; c0128333 <unlock_page+63/90>
2: 83 c4 0c add $0xc,%esp
Code; c0128336 <unlock_page+66/90>
5: 8d 46 04 lea 0x4(%esi),%eax
Code; c0128339 <unlock_page+69/90>
8: 39 46 04 cmp %eax,0x4(%esi)
Code; c012833c <unlock_page+6c/90>
b: 74 12 je 1f <_EIP+0x1f> c0128350 <unlock_page+80/90>
Code; c012833e <unlock_page+6e/90>
d: 5b pop %ebx
Code; c012833f <unlock_page+6f/90>
e: 89 f0 mov %esi,%eax
Code; c0128341 <unlock_page+71/90>
10: 31 c9 xor %ecx,%ecx
Code; c0128343 <unlock_page+73/90>
12: ba 03 00 00 00 mov $0x3,%edx

I tracked the Oops down to a BUG() call in unlock_page, which is triggered when something tries to unlock a page that is already unlocked (that can never be valid, which is why the BUG() is there). I read through the code in generic_file_write_nolock, and it looks to me as if the page is locked and then unlocked consistently.

If I add syslog() calls to bonnie++ to write out the chunk count (which slows down the I/Os), there is no Oops. Without them, bonnie++ consistently Oopses on the 45703rd write.
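To make the failing check concrete, here is a minimal user-space sketch of a test-and-clear style page unlock. It is not the kernel's source: the `demo_` names, the flag value, and the use of C11 atomics are all assumptions made for illustration. The point is only that an unlock which finds the lock bit already clear is the condition where the kernel calls BUG().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock bit; not the kernel's actual PG_locked value. */
#define DEMO_PG_LOCKED 0x1u

/* Hypothetical stand-in for struct page: only the flags word. */
struct demo_page {
    atomic_uint flags;
};

/* Atomically set the lock bit. Returns true if this caller took the lock. */
static bool demo_trylock_page(struct demo_page *page)
{
    unsigned int old = atomic_fetch_or(&page->flags, DEMO_PG_LOCKED);
    return (old & DEMO_PG_LOCKED) == 0;
}

/*
 * Atomically clear the lock bit. Returns false if the bit was already
 * clear, which is the case where the kernel would hit BUG() instead of
 * returning.
 */
static bool demo_unlock_page(struct demo_page *page)
{
    unsigned int old = atomic_fetch_and(&page->flags, ~DEMO_PG_LOCKED);
    return (old & DEMO_PG_LOCKED) != 0;
}

/* Runs lock, unlock, unlock; returns true if the second unlock is flagged. */
static bool demo_double_unlock_detected(void)
{
    struct demo_page page;

    atomic_init(&page.flags, 0);
    assert(demo_trylock_page(&page));   /* lock succeeds on a fresh page */
    assert(demo_unlock_page(&page));    /* first unlock is legitimate */
    return !demo_unlock_page(&page);    /* second unlock finds the bit clear */
}
```

In this sketch the second unlock only reports failure; it says nothing about which caller in the real write path issued the extra unlock.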
I checked every function in the Oops call path, and all the functions are identical to what you get by taking Linux 2.4.19 and applying XFS 1.2, except for 2.4.19-isms. I thought perhaps some changes to the mm layer in 2.4.19 would be the cause, so I modified my 2.4.18 to match the 2.4.19 implementation + XFS core patch, with the same results.

I am running on Linux 2.4.18 with significant modifications (although most of the memory manager and filesystem layer are the same as 2.4.18). I am running UP on a custom Pentium 4 board/processor (well tested before this to be operational). I have run the same bonnie++ benchmark on other filesystems without incident.

I was thinking I could take the xfs_write routine from 1.1 and meld it into the 1.2 source, but this looks to be very painful, and hey, I want 1.2 anyway ;)

Anyone know why I am receiving this Oops?

Thanks
-steve