Jesse Stroik wrote:
> Eric,
>
> Eric Sandeen wrote:
>> Jesse Stroik wrote:
>>> I have a server with a ~20TB xfs file system on Linux
>>> (2.6.18-92.1.22.el5) and am running xfsprogs-2.9.4-4.el5. We had a few
>>> corrupted files which I believe were due to a SCSI issue after a recent
>>> power outage. Due to the corruption, I ran xfs_check and would like to
>>> run xfs_repair on the system.
>> It'd really be great to test more recent xfsprogs first, that one is
>> about 2 years old.
>>
>> You can probably grab any recent fedora src.rpm and rebuild it, and
>> later go back to the centos version if you wish.
>
>
> I fetched the current version from SVN using these directions:
> http://xfs.org/index.php/Getting_the_latest_source_code
>
> I get identical results.
>
> --------
> ...
> reset bad sb for ag 31
> reset bad agf for ag 31
> reset bad agi for ag 31
> Segmentation fault
Ok, from a metadump image Jesse provided (thanks!) it's dying in here:

    bno = be32_to_cpu(agfl->agfl_bno[i]);
    printf("agfl at %p i is %d agfl_bno[i] %u bno is %u\n",
            agfl, i, agfl->agfl_bno[i], bno);
    if (verify_agbno(mp, be32_to_cpu(agf->agf_seqno), bno))
            set_agbno_state(mp, be32_to_cpu(agf->agf_seqno),
                            bno, XR_E_FREE);

agfl_bno looks corrupt, and bno is coming out huge.

set_agbno_state() does:

    *(ba_bmap[(agno)] + (ag_blockno)/XR_BB_NUM) = ....

where ag_blockno is the bno above. With a bno that large, the write lands
well past the end of that AG's block map; this wanders us off into bad
memory and boom. I'll see what we can do to fix it up.
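
To make the failure mode concrete, here is a minimal standalone sketch of
range-checking a block number decoded from disk before it is used to index
a per-AG state map. This is not the xfs_repair source and not necessarily
the eventual fix: struct ag_map, ag_map_init(), ag_map_set_state(),
ag_map_get_state(), be32_decode(), BLOCKS_PER_WORD and STATE_FREE are all
hypothetical names, and the 4-bits-per-block packing is an assumption made
only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the real xfs_repair definitions. */
#define BLOCKS_PER_WORD 16u  /* assumed: 4 state bits per block, 64-bit words */
#define STATE_FREE      1u   /* hypothetical "free" state value */

struct ag_map {
    uint64_t *words;    /* packed per-block state for one AG */
    uint32_t  nblocks;  /* number of blocks the map actually covers */
};

/* Allocate a zeroed map covering nblocks blocks. Returns 0 on success. */
static int ag_map_init(struct ag_map *m, uint32_t nblocks)
{
    size_t nwords = ((size_t)nblocks + BLOCKS_PER_WORD - 1) / BLOCKS_PER_WORD;

    m->words = calloc(nwords ? nwords : 1, sizeof(uint64_t));
    m->nblocks = nblocks;
    return m->words ? 0 : -1;
}

/* Decode a big-endian 32-bit value as it would sit in an on-disk buffer. */
static uint32_t be32_decode(const unsigned char b[4])
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/*
 * Set the state of block bno. Returns -1 without touching memory if bno
 * lies outside the map, instead of indexing past the allocation.
 */
static int ag_map_set_state(struct ag_map *m, uint32_t bno, unsigned state)
{
    assert(m != NULL && m->words != NULL);

    if (bno >= m->nblocks)
        return -1;  /* corrupt or out-of-range block number */

    unsigned shift = (bno % BLOCKS_PER_WORD) * 4;
    uint64_t *w = &m->words[bno / BLOCKS_PER_WORD];

    *w = (*w & ~((uint64_t)0xF << shift)) |
         ((uint64_t)(state & 0xF) << shift);
    return 0;
}

/* Read back the state of block bno; out-of-range blocks report 0. */
static unsigned ag_map_get_state(const struct ag_map *m, uint32_t bno)
{
    if (bno >= m->nblocks)
        return 0;
    return (unsigned)((m->words[bno / BLOCKS_PER_WORD] >>
                       ((bno % BLOCKS_PER_WORD) * 4)) & 0xF);
}
```

With a map sized for 1000 blocks, a decoded entry of 42 is recorded
normally, while an entry of 0xffffffff is rejected with -1 rather than
being divided down into an index far outside the allocation.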
-Eric