Federico Sevilla III <jijo@xxxxxxxxxxx> writes:
[...]
> "/usr/share/texmf/fonts/tfm/cg/times/",
>> O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 3
>> fstat64(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
>> fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
>> brk(0x8059000) = 0x8059000
>> getdents64(0x3, 0x8056848, 0x1000, 0
>> ^-- end of output
>
> I'm glad you were able to get stuff like this. I wasn't. Does the XFS team
> have any suggestions for standard debugging tasks to do when things go wrong
> (without a kernel oops or anything like that, but these hung processes) to
> see what's goofing up and help you help us all?
>
It could be interesting to look at /proc/<process id>/ to get an overview of
what the process was doing. The fd directory shows all open file descriptors,
cwd points to the current working directory, etc. My next kernel will have
kdb, magic sysrq and xfs-debugging ;)
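
For what it's worth, here is a minimal sketch of that /proc inspection in
Python, assuming a Linux /proc layout with status, cwd and fd/ entries under
each process directory. The function name describe_process and the proc_root
parameter are made up for illustration; reading another user's process
usually needs root.

```python
import os

def describe_process(pid, proc_root="/proc"):
    """Collect state, cwd and open file descriptors for one process.

    describe_process and proc_root are hypothetical names for this sketch.
    """
    base = os.path.join(proc_root, str(pid))
    info = {"pid": pid, "state": None, "cwd": None, "fds": {}}

    # The "State:" line in status reads e.g. "D (disk sleep)" when blocked.
    try:
        with open(os.path.join(base, "status")) as fh:
            for line in fh:
                if line.startswith("State:"):
                    info["state"] = line.split(":", 1)[1].strip()
                    break
    except OSError:
        pass  # process already gone, or no permission

    # cwd is a symlink to the current working directory.
    try:
        info["cwd"] = os.readlink(os.path.join(base, "cwd"))
    except OSError:
        pass

    # Each entry in fd/ is a symlink named after the descriptor number.
    fd_dir = os.path.join(base, "fd")
    try:
        names = os.listdir(fd_dir)
    except OSError:
        names = []
    for name in names:
        try:
            info["fds"][int(name)] = os.readlink(os.path.join(fd_dir, name))
        except (OSError, ValueError):
            continue
    return info
```

Calling describe_process(6172) as root would return the state, cwd and fd
targets of the stuck find; for a process hung in getdents64 one would expect
one of the fd links to point at the directory it was reading.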
>> [1]
>> $ ps aux |grep D
>> USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
>> [..not State D..]
>> root 6172 0.0 0.1 1464 496 ? D May30 0:04 find / -xdev
>> ( -false ) -prune -o ( -type f -perm +06000 -o ( ( -type b -o -type c ) -a
>> -not ( -false ) ) ) -printf %8i %5m %3n %-10u %-10g %9s %t %h/%f?n
>> [...]
>> Cron probably started several similar find-processes before without any
>> problems.
>
> Yes. You can't just reproduce them, or at least from my experience. But when
> the "creeping death" came, things just started getting stuck. Now it's
> procedure for me to do "ps ax" and check for state D stuff that stay in
> state D for too long.
>
What I meant was that `this was the first State D process, but not the first
time that particular directory was getdents'ed[0], because cron runs find /
every morning.' This problem is reproducible - all processes doing that
syscall get stuck in State D.
[0] getdents(2) - get directory entries
getdents reads several dirent structures from the directory
pointed at by fd into the memory area pointed to by dirp.
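
The "check for State D" step can also be scripted instead of grepping ps
output. A rough sketch, assuming the usual Linux /proc/<pid>/stat format
"pid (comm) state ..."; parse_stat and blocked_processes are names invented
for this example:

```python
import os

def parse_stat(line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ")",
    # so split on the LAST ")" instead of splitting on whitespace.
    left, right = line.rsplit(")", 1)
    pid_text, comm = left.split("(", 1)
    state = right.split()[0]
    return int(pid_text), comm, state

def blocked_processes(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D."""
    found = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return found
    for name in entries:
        if not name.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, name, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or unexpected format
        if state == "D":
            found.append((pid, comm))
    return found
```

A process that is briefly in D during normal disk I/O is harmless, so it
makes sense to call blocked_processes() twice some seconds apart and only
worry about PIDs that show up both times.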