|To:||Dave Chinner <david@xxxxxxxxxxxxx>|
|Subject:||Re: XFS filesystem on EC2 instance corrupts and shuts down|
|From:||Shrinath M <shrinath.m@xxxxxxxxxx>|
|Date:||Thu, 14 Mar 2013 06:58:19 +0530|
|Cc:||Eric Sandeen <sandeen@xxxxxxxxxxx>, Sabyasachi Ruj <sabyasachi.ruj@xxxxxxxxxx>, Vivek Goel <vivek.goel@xxxxxxxxxx>, Supratik Goswami <supratik.goswami@xxxxxxxxxx>, Ric Wheeler <rwheeler@xxxxxxxxxx>, xfs@xxxxxxxxxxx|
Thanks Ben, Dave and Eric.
>> but I am wondering if there might be more information before this which is not in your trimmed logs.
No, this was the first entry in /var/log/messages every time it happened. dmesg shows the same. After a reboot, it simply fixes itself without anyone doing anything.
The Linux we are running is definitely an Amazon-baked one; it looks like this:
$~: uname -a
Linux ip-100-0-100-1 3.2.34-55.46.amzn1.x86_64 #1 SMP Tue Nov 20 10:06:15 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
dmesg shows something like this after repairing/rebooting:
[    8.414176] SGI XFS with ACLs, security attributes, realtime, large block/inode numbers, no debug enabled
[    8.415342] SGI XFS Quota Management subsystem
[    8.417664] XFS (md0): Mounting Filesystem
[    8.771553] XFS (md0): Starting recovery (logdev: internal)
[    9.977325] XFS (md0): Ending recovery (logdev: internal)
Check the first line there: it says "no debug enabled". How good or bad is it to run this debug mode in a production environment? We are not getting any corruption in our local/test environments, but in production we are getting it about once every three days.
You mention the unlinked inode list, but if that were the cause, there should be an entry in /var/log/messages, right?
Anyway, how can we create this situation? By forcing multiple processes to write and delete files on a small disk? Since we still don't know what is causing this issue, trying to reproduce it in a local or production environment is just shooting in the dark... :(
Does turning up the error level affect the data in any way? Or is it *just* more detailed logging that is also sensitive to every small error?
I really appreciate the support that you devs are giving, which really is the job of AWS support... I so wish they had some helpful and knowledgeable people in support.
On Thu, Mar 14, 2013 at 5:12 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote: