On 2/27/13, Eric Sandeen <sandeen@xxxxxxxxxxx> wrote:
> On 2/27/13 10:50 PM, Eric Sandeen wrote:
>> On 2/27/13 10:38 PM, Eric Sandeen wrote:
>>
>> ...
>>
>>> re-cc'ing xfs list
>>>
>>> So I used pahole to look at all structs, objdump -d to disassemble,
>>> and md5sum'd the results to see what's different.
>>>
>>> pi@raspberrypi ~ $ md5sum cross/*.dis cross/*.pahole native/*.dis
>>> native/*.pahole
>>>
>>> <manual sort>
>>>
>>> c0abd80c3bf049db5e1909fd851261cc cross/xfs-O1-g.ko.pahole
>>> c0abd80c3bf049db5e1909fd851261cc cross/xfs-O2-g.ko.pahole
>>> c0abd80c3bf049db5e1909fd851261cc cross/xfs-Os-g.ko.pahole
>>> c0abd80c3bf049db5e1909fd851261cc native/xfs-O1-g.ko.pahole
>>> c0abd80c3bf049db5e1909fd851261cc native/xfs-O2-g.ko.pahole
>>> c0abd80c3bf049db5e1909fd851261cc native/xfs-Os-g.ko.pahole
>>>
>>> so all structures look identical, good - but:
>>>
>>> while disassembly of these two modules match:
>>>
>>> d76f6ebf4d8a1b9f786facefbcf16f69 cross/xfs-O1-g.ko.dis
>>> d76f6ebf4d8a1b9f786facefbcf16f69 native/xfs-O1-g.ko.dis
>>>
>>> do you see the problem w/ the cross-compiled xfs-O1-g.ko as well?
No, I didn't. The problem has only shown itself on the -O2 builds,
both native and cross-compiled. Lower optimization levels don't show
any of the symptoms.
Perhaps a better comparison would be -O2 builds among working and
non-working compilers? You'd asked for these before, but I just
finished them today. The modules, build logs, and fs/xfs/ build trees
are up at
<http://www.splack.org/~jason/projects/xfs-arm-corruption/3.6.11-g89caf39/>
A quick rundown:
-cross-gcc4.4: OK
-cross-gcc4.5: OK
-cross-gcc4.6: BAD
-cross-gcc4.7: BAD
-cross-gcc4.8: OK
Some of these don't seem to want to rmmod after they've been inserted.
Argh reboots.
>>> the others differ:
>>>
>>> 349f3490a49f2ce539c2b058914f64f0 native/xfs-Os-g.ko.dis
>>> 91c8e8230774808b538c21a83106a5d7 cross/xfs-Os-g.ko.dis
>>>
>>> 649338e1b8eeed6a294504fc76a39cb0 native/xfs-O2-g.ko.dis
>>> e52c2a48277326c313bba76aa0b33ab7 cross/xfs-O2-g.ko.dis
>>>
>>> The diff of the disassembly of the others is huge, hard to
>>> know where to start just yet. Need an objdump mode that only
>>> shows function-relative addresses or something to cut down
>>> on the noise.
>>
>> Could you try the same, to isolate the differences: objdump -d
>> all of the *.o files for, say, the -O2 build, md5sum & compare,
>> and see which ones differ?
Er, uh... oops! :-) I'd scrubbed the objects between each test, so
each module had to be regenerated. So, the intermediate objects won't
match the various xfs-O2-g.ko's you've already downloaded. Look in
the -cross-gcc4.7 and -native-gcc4.7 subdirectories for new copies.
# pwd
/xfsdebug/tracetest/3.6.11-g89caf39/xfs-modules-native-gcc4.7/xfs-O2-g-obj
# for obj in *.o; do
    if [ "$(objdump -d "$obj" | md5sum)" != \
         "$(cd ../../xfs-modules-cross-gcc4.7/xfs-O2-g-obj/ && objdump -d "$obj" | md5sum)" ]; then
        echo "obj $obj is different"
    fi
done
obj xfs.o is different
obj xfs_attr_leaf.o is different
obj xfs_bmap.o is different
obj xfs_dir2_block.o is different
obj xfs_itable.o is different
obj xfs_log.o is different
obj xfs_log_recover.o is different
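
One rough way to cut down the address noise mentioned above before diffing these objects: strip the per-line addresses and absolute branch targets from the `objdump -d` output, so that only symbol-relative information is left. This is only a sketch. It assumes the usual GNU objdump line layout (address, colon, tab, raw bytes, tab, instruction), and `normalize_dis` and `dis_differs` are made-up helper names, not objdump features.

```shell
#!/bin/sh
# Hypothetical helper: read `objdump -d` text on stdin and drop address noise.
# Assumes GNU objdump's usual layout: "  addr:<TAB>bytes<TAB>insn".
normalize_dis() {
    # 1st rule: drop the "file.o: file format ..." header (it embeds the path)
    # 2nd rule: drop the leading "addr:" column of each instruction line
    # 3rd rule: drop absolute addresses printed before "<symbol+offset>"
    sed -E \
        -e '/file format/d' \
        -e 's/^[[:space:]]*[0-9a-f]+:[[:space:]]+//' \
        -e 's/[0-9a-f]+ (<[^>]+>)/\1/g'
}

# Hypothetical helper: succeed if two objects still differ after normalizing.
dis_differs() {
    a=$(objdump -d "$1" | normalize_dis | md5sum)
    b=$(objdump -d "$2" | normalize_dis | md5sum)
    [ "$a" != "$b" ]
}
```

Something like `dis_differs xfs_log.o ../../xfs-modules-cross-gcc4.7/xfs-O2-g-obj/xfs_log.o` could then replace the raw md5sum comparison. The raw instruction bytes are kept on purpose, since encoding differences are what we are hunting for.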
> And one more test. Every time you hit the error, it causes
> a log replay on the next mount since the fs has shut down.
>
> Can you try
>
> # mount; umount; mount; test
>
> so that you start the test from a clean mount, and see if you still hit it?
>
> Maybe save that image off before you do that test just in case it changes
> the state.
I'm not sure about that. Even in read-write mode, the notice in my
kernel log has always been "Corruption detected. Unmount and run
xfs_repair". It's never been a forced filesystem shutdown, just a
stern warning and half-accessible files. The next mount always seems
to be clean.
[89574.079876] XFS (loop0): Corruption detected. Unmount and run xfs_repair
[89587.269316] XFS (loop0): Mounting Filesystem
[89587.444629] XFS (loop0): Ending clean mount
I usually mount read-only, and the image's md5sum doesn't seem to
change between runs. I made a copy, then mounted it read-write
a time or two. The md5sum changed between mounts. However, I am
still seeing the error when attempting to read the directory. The
mounted-rw-checked image is up at
<http://www.splack.org/~jason/projects/xfs-arm-corruption/journalreplaytest/>
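
For anyone repeating the check, the before/after md5sum comparison could be scripted roughly like this. It is a minimal sketch: `changed_by` is a made-up helper name, and the image name and mount point in the usage note are placeholders, not the paths I actually used.

```shell
#!/bin/sh
# Hypothetical helper: run a command and report whether it changed an image
# file, by comparing md5sums taken before and after the command runs.
# Returns success (0) if the checksum changed, failure (1) otherwise.
changed_by() {
    img=$1
    shift
    before=$(md5sum < "$img")
    "$@" >/dev/null 2>&1      # the command's own exit status is ignored here
    after=$(md5sum < "$img")
    [ "$before" != "$after" ]
}
```

As root, something like `changed_by copy.img sh -c 'mount -o loop copy.img /mnt/test && umount /mnt/test'` would then report whether a single mount cycle touched the copy.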
Jason