On Tue, Apr 09, 2013 at 05:48:52PM -0500, Ben Myers wrote:
> On Tue, Apr 09, 2013 at 05:18:27PM -0500, Eric Sandeen wrote:
> > On 4/9/13 5:16 PM, Michael L. Semon wrote:
> > > A meager non-expert user question with full ignorance of glusterfs:
> > > Why are you having I/O errors once every two weeks?
> > It's runtime errors or corruption, followed by fs shutdown, which then
> > results in IO errors, because all IOs are rejected on the shutdown FS.
> > But that's not always immediately obvious from the stream of resulting
> > "I/O Error" messages ;)
> The IO errors are maybe a bit excessive and scary.

That's entirely the point. If we stay silent we get complaints about
not telling people that there's something wrong. If the error
messages are not excessive and scary, then people don't report them
and so we never hear about problems that are occurring.

> I can understand why some
> people might misinterpret those messages and assume it's a hardware problem.

Quite frankly, the biggest problem we have *always* had is that
people either don't bother to read their log files when something
has gone wrong, or selectively quote the logs when reporting the
bug. This is the primary reason for the "how to report a bug" FAQ
entry asking for the *full logs* to be posted in a bug report.

Removing error messages because they are "noisy" is not the answer.
Verbose error messages (especially corruption reports) are there
mainly for the benefit of the developers, not the user. The user
needs to know when a corruption has occurred, but we need to
understand the what, how and why of the issue.

It's far better to scare users by dumping all the relevant info into
the log when an error occurs than to be sitting around scratching
our heads going "WTF?" like we are right now because there isn't
enough information in the logs to have even a basic clue of what is