Re: file corruption during emacs build on XFS logical volume

To: Linux XFS <linux-xfs@xxxxxxxxxxx>
Subject: Re: file corruption during emacs build on XFS logical volume
From: Sean Neakums <sneakums@xxxxxxxx>
Date: Fri, 04 Jan 2002 18:39:28 +0000
In-reply-to: <6usn9mj7id.fsf@zork.zork.net> (Sean Neakums's message of "Fri, 04 Jan 2002 18:28:10 +0000")
Mail-followup-to: Linux XFS <linux-xfs@xxxxxxxxxxx>
References: <1010141444.1992.3.camel@pyewacket> <1010164935.1945.8.camel@UberGeek> <6u666ikn5n.fsf@zork.zork.net> <6usn9mj7id.fsf@zork.zork.net>
Sender: owner-linux-xfs@xxxxxxxxxxx
User-agent: Gnus/5.090004 (Oort Gnus v0.04) Emacs/21.1 (i386-debian-linux-gnu)
begin  Sean Neakums quotation:

> begin  Sean Neakums quotation:
>
>> begin  Austin Gonyou quotation:
>>
>>> No, It does not guarantee. Also, if you're not on the same Inode the
>>> file time is different, etc, it's possible to have different md5sums. A
>>> basic size and file comparison is probably best for what you want to do.
>>
>> The md5sum of a file is based on its contents only.  The inode has
>> nothing to do with it.
>>
>>> Something you can do to test what I'm talking about is copy each of your
>>> dumps to another name, binfilexyz.1 or something, then compare its
>>> md5sum against the original. Those should be the only time they match.
>>
>> I'm not sure exactly what you mean by this.  The name of the file is
>> irrelevant in the computation of the md5sum.
>
> In Emacs' case, there's a niggle with this: if there's already an
> emacs-21.1.1 file there, it'll dump as emacs-21.1.2, and the dumped
> file will have that version embedded in it.  The script I was using to
> generate the md5sum was aware of this, and deleted the dumped files
> before each dump, so that the dump always happened to the name
> emacs-21.1.1.  However, there is a variable, emacs-build-time, which
> contains the exact time the dump occurred.  So md5sums are no use anyway.
>
> Funnily, the last file dump I did worked correctly at the time, but is
> now segfaulting, some hours later.
>
> I was doing the dumps while running three instances of this program,
> which I hacked up in a hurry, to generate lots of dirty pages and
> actual I/O:  http://zork.net/~sneakums/io-test.c
>
> I wonder if the pages that are modified by the dump are not being
> written to disk correctly?  This last dump was in a D state for a few
> minutes while the three io-test instances were running, but it did
> complete and seemingly start up correctly a few minutes after I killed
> them.
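
On the quoted md5sum point, here is a minimal sketch showing that the
digest depends only on the bytes in the file.  The helper names are
made up for illustration, and binfilexyz.1 is just the name suggested
in the quoted text.  It copies a file to a new name (so a new inode
and a new mtime) and hashes both.

```python
import hashlib
import os
import shutil
import tempfile


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the bytes stored in path."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large dumps do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def demo_copy_has_same_md5():
    """Copy a file to a new name and return both digests."""
    workdir = tempfile.mkdtemp()
    try:
        original = os.path.join(workdir, "binfilexyz")
        copy = os.path.join(workdir, "binfilexyz.1")
        with open(original, "wb") as handle:
            handle.write(b"\x7fELF" + bytes(range(256)) * 16)
        # New name, new inode, new timestamps; same contents.
        shutil.copyfile(original, copy)
        return md5_of_file(original), md5_of_file(copy)
    finally:
        shutil.rmtree(workdir)
```

The two digests returned by demo_copy_has_same_md5() should be equal.
Two separate Emacs dumps still differ, because emacs-build-time
changes the contents themselves.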

I think I can finally reproduce the bogus dumps.  I just did it here,
three times.  What I did was: run the io-test program above on two
different files for approx three minutes, then start the emacs dump.
When the dump completed, I started a third io-test and let the three
of them run for about five or so more minutes.  What I'm trying to do
with all this I/O is to force the dumped binary's pages to be written
to disk.
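
For anyone who wants to try something similar without fetching
io-test.c, here is a rough stand-in.  It is not the original program;
the function name, block size and size cap are all assumptions picked
for illustration.  It rewrites a scratch file with random blocks and
deliberately never calls fsync, so the data sits in the page cache as
dirty pages until the kernel writes it back.

```python
import os
import time


def generate_dirty_pages(path, seconds, block_size=1 << 20, max_bytes=64 << 20):
    """Rewrite `path` with random blocks for roughly `seconds` seconds.

    Hypothetical stand-in for io-test.c.  Returns the total number of
    bytes written.
    """
    deadline = time.monotonic() + seconds
    written = 0
    with open(path, "wb") as handle:
        while time.monotonic() < deadline:
            if handle.tell() + block_size > max_bytes:
                # Wrap around so the scratch file stays bounded in size.
                handle.seek(0)
            handle.write(os.urandom(block_size))
            written += block_size
    return written
```

Running two or three of these at once on different files, for a few
minutes each, roughly approximates the load described above.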

So after the five minutes or so, I killed the io-test instances and
attempted to start the dumped binary, which failed with a `cannot
execute' message from the shell.  I looked in the file, and it was pure
garbage: not even an "ELF" string at the start of the file.
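
The "no ELF string" check can be scripted.  This is a minimal sketch,
with a made-up function name, that only looks at the first four bytes
of the file.  A valid ELF binary starts with the byte 0x7f followed by
"ELF"; passing this check does not prove the rest of the file is
intact.

```python
ELF_MAGIC = b"\x7fELF"


def looks_like_elf(path):
    """Return True if the file at path begins with the ELF magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(len(ELF_MAGIC)) == ELF_MAGIC
```

Calling looks_like_elf() on the dumped binary right after the dump and
again after the I/O load would show whether the start of the file
changed in between.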

I am now going to do a build of the upstream source and see if I can
make the dump break on that too.  I'm hopeful that it will, as the
unexec code is the same in upstream as in the Debian emacs21 package.

-- 
 /////////////////  |                  | The spark of a pin
<sneakums@xxxxxxxx> |  (require 'gnu)  | dropping, falling feather-like.
 \\\\\\\\\\\\\\\\\  |                  | There is too much noise.

