|Subject:||Fwd: Sudden File System Corruption|
|From:||Mike Dacre <mike.dacre@xxxxxxxxx>|
|Date:||Thu, 5 Dec 2013 07:58:06 -0800|
On Thu, Dec 5, 2013 at 12:10 AM, Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx> wrote:
On 12/4/2013 8:55 PM, Mike Dacre wrote:
You are right, sorry. 9260-4i
1. Level -- confirm RAID6
2. Strip size? (eg 512KB)
3. Stripe size? (eg 7168KB, 14*512)
Not sure how to get this.
4. BBU module?
Yes. iBBU, state optimal, 97% charged.
5. Is write cache enabled?
Yes: Cached IO and Write Back with BBU are enabled.
I have also attached an adapter summary (megaraid_adp_info.txt) and a virtual and physical drive summary (megaraid_drive_info.txt).
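On question 3, the full stripe size is normally just the strip size multiplied by the number of data disks, and a RAID6 array gives up two disks' worth of capacity to parity. The sketch below works that arithmetic through and converts the result to the stripe unit and stripe width that XFS reports in filesystem blocks. The 16-disk array and the 512KB strip are assumptions chosen only to match the "eg" figures in the questions, not values confirmed for this controller; the function names are hypothetical.

```python
def full_stripe_kb(strip_kb: int, total_disks: int, parity_disks: int = 2) -> int:
    """Full stripe size in KB: strip size times the number of data disks."""
    data_disks = total_disks - parity_disks
    if data_disks <= 0:
        raise ValueError("array needs more disks than parity disks")
    return strip_kb * data_disks


def xfs_alignment(strip_kb: int, total_disks: int, parity_disks: int = 2,
                  block_bytes: int = 4096) -> tuple[int, int]:
    """Return (sunit, swidth) in filesystem blocks for the given array."""
    data_disks = total_disks - parity_disks
    if data_disks <= 0:
        raise ValueError("array needs more disks than parity disks")
    strip_bytes = strip_kb * 1024
    if strip_bytes % block_bytes:
        raise ValueError("strip size must be a multiple of the block size")
    sunit = strip_bytes // block_bytes   # one strip, in filesystem blocks
    swidth = sunit * data_disks          # one full stripe, in filesystem blocks
    return sunit, swidth


# Assumed example: 16 drives in RAID6 (14 data disks), 512KB strip.
print(full_stripe_kb(512, 16))   # 7168
print(xfs_alignment(512, 16))    # (128, 1792)
```

With the real strip size and drive count from the controller, the same two numbers are what would be compared against the sunit and swidth fields in the xfs_info output.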
What is the XFS geometry?
meta-data=""                     isize=256    agcount=26, agsize=268435455 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=6835404288, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
This is also attached as xfs_info.txt.
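As a sanity check on the geometry above, the block counts can be turned into sizes with a few lines of arithmetic. This is only a sketch using the numbers printed in the xfs_info output (bsize, blocks, agsize); the function name is hypothetical.

```python
def geometry_summary(block_size: int, data_blocks: int, ag_blocks: int) -> dict:
    """Derive total size and allocation group count from xfs_info fields."""
    fs_bytes = data_blocks * block_size
    return {
        "fs_bytes": fs_bytes,
        "fs_tib": fs_bytes / 2**40,
        # Ceiling division: the last allocation group may be smaller than the rest.
        "ag_count": -(-data_blocks // ag_blocks),
        "ag_bytes": ag_blocks * block_size,
    }


# Values copied from the "data" and "meta-data" lines of the output above.
summary = geometry_summary(block_size=4096,
                           data_blocks=6835404288,
                           ag_blocks=268435455)
print(summary["fs_bytes"])          # 27997815963648
print(round(summary["fs_tib"], 2))  # 25.46
print(summary["ag_count"])          # 26
```

The derived allocation group count agrees with the agcount=26 that xfs_info reports, and each full allocation group is one block short of 1 TiB.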
Good point. These happened while trying to ls. I am not sure why I can't find them in the log; they printed out to the console as 'Input/Output' errors, simply stating that the ls command failed.
That is possible; workloads can get really high sometimes. I am not sure how to control that without significantly impacting performance - I want a single user to be able to use 98% IO capacity sometimes... but other times I want the load to be split amongst many users. Also, each user can execute jobs simultaneously on 23 different computers, each accessing the same drive via NFS. This is a great system most of the time, but sometimes the workloads on the drive get really high.
Wow, this is huge, I can't believe I missed that. I have switched it to noop now as we use write caching. I have been trying to figure out for a while why I would keep getting timeouts when the NFS load was high. If you have any other suggestions for how I can improve performance, I would greatly appreciate it.
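To confirm that a scheduler change like the switch to noop actually took effect, the active elevator can be read back from sysfs, where the kernel lists the available schedulers and puts the selected one in square brackets. The helper below is a minimal sketch: the function names are hypothetical, the device name in the usage comment is an assumption (the thread does not name the block device), and the sample strings in the comments are illustrative rather than taken from this server.

```python
import re
from pathlib import Path


def active_scheduler(sysfs_text: str) -> str:
    """Return the scheduler marked with [brackets] in a sysfs scheduler line."""
    match = re.search(r"\[([^\]]+)\]", sysfs_text)
    if match:
        return match.group(1)
    # Some devices expose a single unbracketed entry, e.g. "none".
    tokens = sysfs_text.split()
    if len(tokens) == 1:
        return tokens[0]
    raise ValueError(f"no active scheduler marked in: {sysfs_text!r}")


def read_scheduler(device: str) -> str:
    """Read the active scheduler for a block device such as 'sda' (assumed name)."""
    path = Path(f"/sys/block/{device}/queue/scheduler")
    return active_scheduler(path.read_text())


# Illustrative inputs only:
#   active_scheduler("[noop] deadline cfq")  returns "noop"
#   active_scheduler("noop deadline [cfq]")  returns "cfq"
# On the server itself: read_scheduler("sda"), with the real device name.
```

A value written to that sysfs file does not survive a reboot, so a check like this is also a quick way to notice that the setting has reverted.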
This one simple command line may help pretty dramatically, immediately,
Great, thanks. Our workloads vary considerably as we are a biology research lab. Sometimes we do lots of seeks; other times we are almost maxing out read or write speed with massively parallel processes all accessing the disk at the same time.