pro64-support
[Top] [All Lists]

Re: CFG by daVinci (fwd)

To: Shukang ZHOU <zhshk@xxxxxxxxx>, sgi <pro64-support@xxxxxxxxxxx>, Peng Zhao <pengzhao@xxxxxxxxxxxxxx>
Subject: Re: CFG by daVinci (fwd)
From: David Stephenson <dlstephe@xxxxxxx>
Date: Sun, 10 Jun 2001 15:33:44 -0700
Organization: SGI -- Compilers
References: <Pine.LNX.4.21.0106081534510.30011-201000@peers.cs.ualberta.ca> <008d01c0f0c0$30240d30$280379c8@zsk>
Sender: owner-pro64-support@xxxxxxxxxxx
Feedback frequencies as stored in memory using three different
formats, for different phases of the compiler.  WOPT and CG each
have their own control flow graph structures, which are annotated
with the frequency data.  Within the other compiler phases, no
CFG is maintained; the WHIRL representation does not come with a
build in CFG.

Periodically, frequencies that annotate the WHIRL code are updated
by propagation.  Where possible, unknown frequencies are replaced
by known frequencies, and GUESS frequencies are replaced by EXACT
frequencies.  This propagation requires a CFG, but WHIRL has no
CFG, so a temporary CFG (the class FB_CFG) is constructed from the
WHIRL code just while propagating frequencies.  FB_CFG has some
unique features, so it doesn't quite correspond to any of the CFGs
used within WOPT or CG.

In general, each frequency value in the annotated WHIRL has one
block associated with in.  In Peng Zhao's example, the "if" WHIRL
node has two frequency values assigned to it: a count of the number
of times that the branch was taken during the execution of the
instrumented binary, and a count of the number of times that the
branch was not taken.  A CALL node may be associated with either
one or two frequency counts (depending on whether it is known that
the CALL always returned).

        Block 0    Function entry
        Block 1    PRAGMA
        Block 2    IF (taken)
        Block 3    IF (not taken)
        Block 4    CALL printf
        Block 5    CALL printf
        Block 6
        Block 7    RETURN

The merge block 6 doesn't correspond to any particular WHIRL node,
but is required to complete the graph.

In the graph displayed by daVinci, the block numbers only correspond
to the block numbers within FB_CFG.  The labels indicate the total
incoming and outgoing frequencies for each node.
Green blocks indicate that the incoming and outgoing frequencies
differ.  Red blocks indicate uninitialized or

The color of the blocks help detect errors by highlighting blocks
where the frequencies are uninitialized or don't agree with
neighboring blocks.


Peng Zhao wrote:

> I have a question about the CFG drew by daVinci.
>
> Enclosed are two files, one is a very simple c program and the
> other one is the corresponding CFG just after annotation. It is
> a little strange for me.


Shukang ZHOU wrote:

> To my best understanding, each block in the graph represents for a FB_NODE.
> And the content in the block are ID, node_type, incoming frequency, and
> outgoing frequency. "-------" is the node_type, standing for that FB_NODE
> node_type is FB_EDGE_UNINT (0).
> 
> Maybe your graph is not a CLASSIC cfg, in which a node should be a basic
> block. So there are more blocks than your expectation. And the number in
> block starting from 0 can be also understood this way.

Exactly right.


Peng Zhao wrote:

> First, block 2 and 4 should be one (basic block) and 3&5 too. Why
> draw them two separated blocks.

They correspond to two different frequency values, for two different
WHIRL nodes.  Blocks 2 and 3 correspond to the IF node, while blocks
4 and 5 each correspond to one of the printf CALL nodes.


> What is the meaning of block 6 ( especially: the meanning of
> "-----")? Does it mean the "return" statement in the program?
>
> What does the block 7 stand for?

Block 7 corresponds to the return statement.  Block 6 is a merge
block that doesn't correspond to any WHIRL node, so "-----" means
"none".


> BTW1: what is the relationship between the ID of each block in
> the CFG drew by daVinci and the corresponding BB_NODE->Id()?

Not much.  The class BB_NODE is part of the WOPT CFG structure,
but the graph drawn by daVinci is from the FB_CFG structure, used
outside of WOPT AND CG.


> BTW2: What is COMMA, RCOMMA, MU and CHI?

COMMA and RCOMMA are WHIRL operators described in the WHIRL
reference document whirl.pdf available at

        http://oss.sgi.com/projects/Pro64/download.html

MU and CHI nodes are annotations to the Single Static Assignment
(SSA) form within WOPT used to represent aliasing info.  For
more information, check out the research paper

        Fred C. Chow, Sun Chan, Shin-Ming Liu, Raymond Lo, and
        Mark Streich.  Effective Representation of Aliases and
        Indirect Memory Operations in SSA Form. In Proceedings of
        the Sixth International Conference on Compiler Construction,
        Lecture Notes in Computer Science, Vol.1060, pages 253-267,
        Springer, April 1996.

available at:

        http://citeseer.nj.nec.com/395004.html

                                                        - David

-- 
David Stephenson        http://reality.sgi.com/dlstephe_engr/

<Prev in Thread] Current Thread [Next in Thread>