Bug 410 - Kernel (2.6.11) deadlock in user mode when writing data through mmap on large files (64-bit systems on xfs)
Status: RESOLVED FIXED
Product: XFS
Component: XFS kernel code
Version: unspecified
Platform: PC Linux
Importance: P2 major
Target Milestone: ---
URL: http://groups.google.com/groups?selm=...
Reported: 2005-05-10 09:52 CST
Modified: 2006-04-09 17:55 CST


Attachments
The test program (1.09 KB, text/x-csrc)
2005-05-10 09:58 CST, Xavier Roche
The hang logs (SysRq output) (63.74 KB, text/plain)
2005-05-10 09:59 CST, Xavier Roche

Description From 2005-05-10 09:52:34 CST
[ Bug previously reported at
<http://groups.google.com/groups?selm=42guS-7u6-25%40gated-at.bofh.it&output=gplain>
as ext3/xfs specific, but it is actually an XFS-specific problem ]

We are experiencing deadlocks while writing data through mmap in a context 
where memory is highly stressed, on an XFS filesystem.

If you launch the program below with a size far greater than your RAM size 
(let's say 10 times), chances are that you will reproduce this deadlock. The 
deadlock only occurs with XFS (it never occurs on ext3 filesystems, for example).

This deadlock was reproduced several times on various machines, running Linux 
Kernel 2.6.11, on an EM64T system (64-bit).

$ uname -s -r -m -p -i -o
Linux 2.6.11-gentoo-r6 x86_64 Intel(R) Xeon(TM) CPU 3.20GHz GenuineIntel 
GNU/Linux


Crash test:
========================================================================
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <stdio.h>
#include <string.h>

#define MEGA (1024 * 1024)

/**
  * Usage : fname size (MB)
  **/
int main(int argc, char* argv[]) {
   int fd0;
   size_t size0;
   char* ptr0;

   if (argc != 3) {
     fprintf(stderr, "usage: %s fname size (MB)\n", argv[0]);
     return 1;
   }

   fd0 = open(argv[1], O_RDWR | O_CREAT, S_IRUSR | S_IWUSR);
   if (fd0 < 0) {
     perror("error opening file");
     return 1;
   }
   size0 = ((size_t) atoi(argv[2])) * MEGA;

   /* Extend the file to its final size before mapping it */
   if (ftruncate(fd0, size0) != 0) {
     perror("error truncating file");
     return 1;
   }

   /* Shared mapping: dirtied pages are written back to the file */
   ptr0 = mmap(NULL,
       size0,
       (PROT_READ | PROT_WRITE),
       (MAP_FILE | MAP_SHARED),
       fd0,
       0);
   if (ptr0 == MAP_FAILED) {
     perror("error mapping file");
     return 1;
   }

   /* Dirty the mapping 1 MB at a time, reporting progress every 100 MB */
   {
     size_t i;
     for(i = 0; i < size0; i += MEGA) {
       (void) memset(ptr0 + i, 0, MEGA);
       if (i % ((100 * MEGA)) == 0) {
         int m = (int) (i / MEGA);
         int t = (int) (size0 / MEGA);
         printf("Creating : %d/%d\n", m, t);
       }
     }
   }

   if (msync((void*) ptr0, size0, MS_SYNC) != 0) {
     perror("error syncing file");
     return 1;
   }

   if (munmap(ptr0, size0) != 0) {
     perror("error unmapping file");
     return 1;
   }

   close(fd0);
   return 0;
}
========================================================================


Below is the SysRq output. (For readability, in the Show State section I moved 
the output of the offending program, ngtest, to the top.)
========================================================================

telnet> send break
SysRq : Changing Loglevel
Loglevel set to 8
XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
[...]
XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)

telnet> send break
XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
SysRq : Show State




                                                       sibling
  task                 PC          pid father child younger older

ngtest        D ffff810078303468     0 20514  20505 
(NOTLB)
ffff810078302eb8 0000000000000086 00000001013820a7 0000000000000008
       0000000000000086 0000006e7c5cfe88 ffffffff805915c0 ffff81007fe095a0
       000000000000006e ffff810002c1a780
Call Trace:<ffffffff80139d89>{__mod_timer+318} 
<ffffffff803782c2>{schedule_timeout+165}
       <ffffffff8013a8d0>{process_timeout+0} 
<ffffffff803781ff>{io_schedule_timeout+49}
       <ffffffff802b94cf>{blk_congestion_wait+151} 
<ffffffff80145f95>{autoremove_wake_function+0}
       <ffffffff80145f95>{autoremove_wake_function+0} 
<ffffffff8022982e>{kmem_alloc+210}
       <ffffffff802298cd>{kmem_realloc+43} 
<ffffffff8020d181>{xfs_iext_realloc+245}
       <ffffffff801e62a8>{xfs_bmap_insert_exlist+53} 
<ffffffff801e7194>{xfs_bmap_add_extent_hole_real+949}
       <ffffffff801e84df>{xfs_bmap_add_extent+4759} 
<ffffffff8021f92e>{xfs_trans_brelse+84}
       <ffffffff8021faa8>{xfs_trans_log_buf+107} 
<ffffffff801dbb04>{xfs_alloc_ag_vextent+4013}
       <ffffffff8022995c>{kmem_zone_alloc+70} 
<ffffffff801f215f>{xfs_btree_init_cursor+72}
       <ffffffff801eb7b7>{xfs_bmapi+6221} 
<ffffffff801e8be8>{xfs_bmap_do_search_extents+778}
       <ffffffff80210443>{xfs_iomap_write_direct+646} 
<ffffffff803786c7>{__down_write+129}
       <ffffffff80145f95>{autoremove_wake_function+0} 
<ffffffff8020ff2d>{xfs_iomap+569}
       <ffffffff802ba0ad>{submit_bio+221} 
<ffffffff80229b83>{xfs_map_blocks+66}
       <ffffffff8022a844>{xfs_page_state_convert+996} 
<ffffffff8017281c>{__set_page_dirty_buffers+191}
       <ffffffff801653c5>{page_referenced_file+209} 
<ffffffff80173958>{alloc_buffer_head+50}
       <ffffffff80173fa5>{alloc_page_buffers+99} 
<ffffffff8022af95>{linvfs_writepage+179}
       <ffffffff8015a457>{shrink_zone+3000} 
<ffffffff8022ab3d>{__linvfs_get_block+136}
       <ffffffff80241fc2>{__memset+50} 
<ffffffff80192ed3>{do_mpage_readpage+949}
       <ffffffff80157148>{do_drain+0} <ffffffff8012b801>{try_to_wake_up+755}
       <ffffffff8023e937>{radix_tree_node_alloc+19} 
<ffffffff8023eb29>{radix_tree_insert+291}
       <ffffffff8015aa0c>{try_to_free_pages+278} 
<ffffffff801534d8>{__alloc_pages+531}
       <ffffffff8015575e>{__do_page_cache_readahead+215} 
<ffffffff8014fe07>{filemap_nopage+347}
       <ffffffff8015f511>{do_no_page+984} 
<ffffffff8015f8e4>{handle_mm_fault+419}
       <ffffffff80378992>{_spin_unlock_irqrestore+5} 
<ffffffff8011d090>{do_page_fault+1185}
       <ffffffff8010dd9d>{error_exit+0}
------- Comment #1 From 2005-05-10 09:58:34 CST -------
Created an attachment (id=156)
The test program

This program hangs the kernel on an XFS filesystem
------- Comment #2 From 2005-05-10 09:59:21 CST -------
Created an attachment (id=157)
The hang logs (SysRq output)

SysRq output after the kernel hang
------- Comment #3 From 2006-02-14 07:44:54 CST -------
Kernel 2.6.15.4 on a 32-bit system, with a lot of memory free (actually all of 
it is cache, but this should not be a problem) and a single program stressing 
the disk with lots of reads and writes, reports:

XFS: possible memory allocation deadlock in kmem_alloc (mode:0x2d0)

at random times. Apart from this warning, nothing seems to go wrong.
 
------- Comment #4 From 2006-02-14 22:54:00 CST -------
The problem only occurs on 64-bit systems with a lot of RAM (typically more
than 8GB).
------- Comment #5 From 2006-04-09 03:37:18 CST -------
I've experienced this problem also:
Apr  8 20:13:12 archon XFS: possible memory allocation deadlock in kmem_alloc
(mode:0x250)
Apr  8 20:13:14 archon XFS: possible memory allocation deadlock in kmem_alloc
(mode:0x250)
Apr  8 20:13:16 archon XFS: possible memory allocation deadlock in kmem_alloc
(mode:0x250)
Apr  8 20:13:18 archon XFS: possible memory allocation deadlock in kmem_alloc
(mode:0x250)
Apr  8 20:13:20 archon XFS: possible memory allocation deadlock in kmem_alloc
(mode:0x250)

and so on in my syslog... system was stalled until I rebooted.
# uname -a 
Linux archon 2.6.16 #3 SMP Mon Mar 27 19:03:27 CEST 2006 x86_64 AMD Athlon(tm)
64 X2 Dual Core Processor 4400+ AuthenticAMD GNU/Linux

with 2GB of RAM
the system was under heavy stress (tar -cvjf over about 200GB of data, thousands
of directories and files)
------- Comment #6 From 2006-04-09 15:55:34 CST -------
Please try the current 2.6.17 tree; its in-core extent management code has
been completely reworked to address this issue.

cheers.