PCP Buildbot and Improving QA

To: pcp@xxxxxxxxxxx
Subject: PCP Buildbot and Improving QA
From: Lukas Berk <lberk@xxxxxxxxxx>
Date: Thu, 07 May 2015 15:49:57 -0400
Delivered-to: pcp@xxxxxxxxxxx
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)
Hey Folks,

I wanted to share something I've set up to help ease and distribute
the quality assurance responsibility among the PCP community.
Although PCP ships on a variety of distros and operating systems, my
understanding is that only a few core devs actually test changes on
a variety of platforms with any regularity.
http://buildbot.pcp.io is my attempt to change this and, at the same
time, to make the current state of the testsuite more transparent to
the community.  The most relevant page for developers will be
http://buildbot.pcp.io/waterfall , where you can see the latest test
results.

Buildbot offers a distributed, modular design, meaning individuals can
contribute the flavour of buildslave(s) (the machines doing the actual
testing) that matters most to them.  As an example, I do most of my
work on Fedora, and am most comfortable administering and monitoring
it for QA purposes, so I've contributed Fedora 20 and 21 buildslaves.
Conversely, I'm not as focused on Ubuntu or Debian, but individuals
who are (and who use PCP on those platforms) can contribute
buildslaves of that variety.  This increases the chances of catching
breaking changes that I (or somebody else) have caused and failed to
notice.  It also takes the load off any single person to maintain all
the platforms we ship on.

Currently I have the buildbot set up to watch merges to the master
branch, and it tests the set of commits pushed upstream within a
one-minute span.  This allows us to track how a specific set of commits
affects the testsuite with greater accuracy.  An example would be one of
the latest builds ( http://buildbot.pcp.io/builders/fedora/builds/78 ),
which has changes from both Ken and Martins.  You can see the steps
taken to build this revision, and full testsuite results (
http://buildbot.pcp.io/builders/fedora/builds/78/steps/Run%20Testsuite/logs/stdio
).
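
The batching idea can be illustrated with a minimal sketch.  This is
not Buildbot's implementation or the master.cfg in use; it assumes a
simple rule where a commit joins the current batch if it arrives less
than 60 seconds after the previous one, and the function name is
hypothetical.

```python
from typing import List


def batch_changes(timestamps: List[float],
                  quiet_seconds: float = 60.0) -> List[List[float]]:
    """Group commit timestamps (in seconds) into batches.

    Assumed rule (illustrative only): a commit joins the current batch
    when it arrives less than quiet_seconds after the previous commit;
    otherwise it starts a new batch.
    """
    batches: List[List[float]] = []
    for ts in sorted(timestamps):
        if batches and ts - batches[-1][-1] < quiet_seconds:
            batches[-1].append(ts)
        else:
            batches.append([ts])
    return batches
```

Each resulting batch would correspond to one build, so a failure can be
narrowed down to the handful of commits in that batch.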

I've written several helper scripts for setting up a buildslave.  For
those interested, they and the master.cfg being used on buildbot.pcp.io
are available at git://sourceware.org/git/pcpfans.git lberk/buildbot
in the qa/buildbot dir.  If these changes could get merged upstream, I
would appreciate it.  I've written further documentation on how to set
up a buildslave at http://pcp.io/buildbot.html .  Please let me know if
anything requires clarification or additional steps (I'm sure I've
missed something).

If the community as a whole is interested in pursuing and using a
buildbot, I'm happy to continue running the buildmaster.  In that case,
I would humbly ask those interested to help set up testing
environments (buildslaves) for the following distros:

Debian
Ubuntu
Solaris
Mac OS X
CentOS/RHEL
openSUSE

The greater the variety of architectures, the better.

I would also need a list of instructions (a recipe, if you will)
covering each step needed to:

1. Build PCP (./Makepkgs)
2. Uninstall any previous PCP on the machine/VM with the package manager,
but not its dependencies
3. Install the freshly baked PCP
4. Start/enable the required services
5. Run the testsuite

An example of how I've done this for Fedora can be found in the
qa/buildbot/master.cfg file.
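
As a rough illustration of what such a recipe might look like when
written down as data, here is a minimal sketch.  It is not the contents
of master.cfg: only ./Makepkgs comes from this message, and every other
command is a hypothetical placeholder to be replaced with the real
distro-specific command.

```python
import subprocess
from typing import List, NamedTuple


class Step(NamedTuple):
    name: str
    command: List[str]


# Only ./Makepkgs is taken from the message; the echo commands are
# placeholders standing in for distro-specific commands.
RECIPE = [
    Step("build pcp", ["./Makepkgs"]),
    Step("uninstall previous pcp", ["echo", "TODO: package manager uninstall"]),
    Step("install fresh pcp", ["echo", "TODO: package manager install"]),
    Step("start services", ["echo", "TODO: start/enable services"]),
    Step("run testsuite", ["echo", "TODO: run the testsuite"]),
]


def run_recipe(recipe: List[Step], dry_run: bool = True) -> List[str]:
    """Run each step in order and return the names of completed steps.

    With dry_run=True the commands are only printed.  Otherwise a
    failing command raises subprocess.CalledProcessError and stops
    the recipe at that step.
    """
    done: List[str] = []
    for step in recipe:
        if dry_run:
            print("would run:", " ".join(step.command))
        else:
            subprocess.run(step.command, check=True)
        done.append(step.name)
    return done
```

Calling run_recipe(RECIPE) prints the five commands without executing
them, which is a cheap way to review a recipe before handing it to a
buildslave.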

Several questions moving forward:

1.  What should constitute a 'failed' testsuite run?

Currently on my Fedora buildslaves, I'm at 11 failures (which I know is
too high).  However, is requiring 0 failures too noisy for those
administering the buildslaves?  Should we set a threshold slightly
higher than that?  Should a run simply be considered a failure when
the number of failures increases compared to the last run?  (I'm not
sure how easy/possible that is to set up.)  Likewise, I need to lower
the number of [notrun] tests I currently have on those machines.
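
The "compare against the last run" option could be sketched as below.
This is only an illustration under stated assumptions: the failed test
names from each run are assumed to be already available as lists, and
the function names and the threshold parameter are hypothetical, not
part of Buildbot or the PCP testsuite.

```python
from typing import Iterable, Set


def new_failures(previous: Iterable[str], current: Iterable[str]) -> Set[str]:
    """Return tests that fail now but did not fail in the previous run."""
    return set(current) - set(previous)


def run_is_regression(previous: Iterable[str],
                      current: Iterable[str],
                      threshold: int = 0) -> bool:
    """Flag a run when more than `threshold` tests are newly failing."""
    return len(new_failures(previous, current)) > threshold
```

Comparing sets of test names rather than raw counts means a run where
one test starts failing while another gets fixed is still flagged,
even though the total number of failures is unchanged.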

2.  How should we report (new) failures?

Do we want to at all?  Should it just be on the website for devs to
check?  Buildbot is capable of running an IRC bot to inform those on
the #pcp channel.  However, I know not everybody uses IRC.

3.  Do we want warnings if the ./Makepkgs process has any 'Warnings' in
the output (such as unused vars, etc)?
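
One simple way to detect such warnings is to scan the captured build
output for lines containing "warning:".  The sketch below assumes
compiler-style messages of that form; the pattern and function name
are illustrative and have not been checked against real ./Makepkgs
output.

```python
import re
from typing import List

# Assumed pattern: compiler-style messages containing "warning:".
WARNING_RE = re.compile(r"\bwarning:", re.IGNORECASE)


def find_warnings(build_log: str) -> List[str]:
    """Return the lines of a build log that look like warnings."""
    return [line for line in build_log.splitlines()
            if WARNING_RE.search(line)]
```

A build step could then be marked as "warnings" rather than "failed"
whenever find_warnings() returns a non-empty list.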

4.  I'm all out of questions.  What questions are there for me?  Concerns?

Hope this ends up being useful.

Cheers,

Lukas
