Discussion:
Some thoughts and questions about the data flow infrastructure
Vladimir Makarov
2007-02-12 20:33:17 UTC
Permalink
On Sunday I accidentally had a chat about the df infrastructure on
IRC. I've got some thoughts which I'd like to share.

I have liked the df infrastructure code from day one for its clarity.
Unfortunately users don't see it and probably don't care about it.
From my point of view the df infrastructure has a design flaw: it
extracts a lot of information about the RTL and keeps it on the side.
That does not make the code fast. It would be ok if we got better code
quality. Danny told me that they get 1.5% better code using df. That
is a really big improvement (about half a year of work for the whole
compiler team according to Proebsting's law). IMHO, it could justify
the promised 5% compiler slowdown.
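
(For context: Proebsting's law posits that compiler-optimization work
as a whole doubles program performance roughly every 18 years, i.e.
about 2^(1/18) - 1, or 4%, per year. At that rate a 1.5% code-quality
gain corresponds to roughly 1.5/4, i.e. about 0.4 years, of whole-team
effort, which is where the half-year figure comes from.)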

I've checked the branch at the last merge point, revision 121802 (done
Feb 10), on SPECINT2000 on a 2.66GHz Core2 Duo (-mtune=generic -O2 were
used) and found that there is no improvement but a degradation of about
0.5% (1888 for the branch vs 1896 for mainline at the branch point;
this is without twolf, which is broken for now). The code size is also
about 0.45% bigger on the branch. Getting 0.5% worse code and an 11.5%
compile-time slowdown (308sec vs 275sec for compiling SPECINT2000) does
not seem reasonable, especially if we have no alternative faster path
(even if the old live analysis is a mess).

Someone could object that this infrastructure makes implementing new
optimizations easier. But I've heard the opinion that RTL
optimizations should be moved to SSA, and that only a few optimizations
should remain on RTL, such as code selection, insn scheduling, and RA.

Even rewriting the current optimizations on top of the new data flow
infrastructure makes the situation worse, because it will not be easy
to get rid of the data flow infrastructure later if part of the flaw is
in the df interface itself. So it might create problems in the future.

I especially did not like David Edelsohn's phrase "and no new
private dataflow schemes will be allowed in gcc passes". It was not
his first such statement. Such phrases kill competition, which is
bad for gcc. What if a new specialized scheme is faster? What if
somebody decides to write another, better df infrastructure from
scratch to solve the coming df infrastructure problems?

So could somebody from the steering committee be kind enough to answer me:

Is it (prohibiting the use of other dataflow schemes) a steering
committee decision?

If so, should the steering committee be making such technical decisions?
Some people on the steering committee are not experts in the gcc
compiler, and some of them are not experts in this part of the compiler.

I am not in opposition to the merge if it satisfies the merge criteria.
People have done a lot of work, and it is too late; I should have
opposed the criteria when they were discussed. Sorry, I missed that
discussion, if there was one. I am just raising some questions and
saying that more work will be needed on the df infrastructure even
after the merge.

Vlad
Steven Bosscher
2007-02-12 20:50:06 UTC
Permalink
Post by Vladimir Makarov
I have liked the df infrastructure code from day one for its clarity.
Unfortunately users don't see it and probably don't care about it.
From my point of view the df infrastructure has a design flaw: it
extracts a lot of information about the RTL and keeps it on the side.
That does not make the code fast.
It also does not make the code slow. And the data it extracts and keeps
on the side could be used to simplify many algorithms in gcc (most
notably cprop, mode switching, and regmove). There is tremendous
potential for speedups in RTL passes if they start using the df
register caches instead of traversing the PATTERN of every insn.
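
To make that concrete, here is a minimal sketch of what such a
per-insn cache looks like (hypothetical C, not the real df API;
insn_refs, insn_ref_cache, etc. are made-up names for illustration):

/* Per-insn register references, collected once when the insn is
   scanned, so that passes need not re-walk the insn's PATTERN.  */
struct reg_ref
{
  unsigned int regno;           /* number of the referenced register */
};

struct insn_refs
{
  struct reg_ref *defs;         /* registers set by the insn */
  int n_defs;
  struct reg_ref *uses;         /* registers read by the insn */
  int n_uses;
};

/* One entry per insn, indexed by insn uid.  */
extern struct insn_refs *insn_ref_cache;

/* Does the insn with uid INSN_UID read register REGNO?  A linear scan
   of a small flat array instead of a recursive rtx walk.  */
static int
insn_uses_reg_p (int insn_uid, unsigned int regno)
{
  struct insn_refs *r = &insn_ref_cache[insn_uid];
  int i;

  for (i = 0; i < r->n_uses; i++)
    if (r->uses[i].regno == regno)
      return 1;
  return 0;
}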

Gr.
Steven
Vladimir N. Makarov
2007-02-13 04:16:00 UTC
Permalink
Post by Steven Bosscher
Post by Vladimir Makarov
I have liked the df infrastructure code from day one for its clarity.
Unfortunately users don't see it and probably don't care about it.
From my point of view the df infrastructure has a design flaw: it
extracts a lot of information about the RTL and keeps it on the side.
That does not make the code fast.
It also does not make the code slow. And the data it extracts and keeps
on the side could be used to simplify many algorithms in gcc (most
notably cprop, mode switching, and regmove). There is tremendous
potential for speedups in RTL passes if they start using the df
register caches instead of traversing the PATTERN of every insn.
Some passes scan the RTL only once. For example, the insn scheduler
now does it once to build the data dependencies. In that case df scans
the RTL once plus writes and reads the df structures, and so always
loses in speed.

I know some work is being done on incremental df analysis. It could
decrease the time for rescanning the RTL between passes. Let us hope
for that.
Ian Lance Taylor
2007-02-13 06:21:50 UTC
Permalink
Post by Vladimir N. Makarov
I know some work is being done on incremental df analysis. It could
decrease the time for rescanning the RTL between passes. Let us hope
for that.
My understanding is that on dataflow-branch the DF information is now
fully incremental.


I don't really grasp where you are headed with this criticism.
Criticism can be a good thing. But I think it would help me, at
least, if you were more clear about what you see as the alternative.
Moving all the RTL passes into tree-ssa might be a good idea. But it
would be orders of magnitude more work than the dataflow work has
required, so it is not comparable.

We could constrain the dataflow work to be a 100% improvement on all
platforms before it can be committed. But that would be extremely
difficult, because they would continually be catching up to problems
introduced in mainline. For example, I slowed them down by a few days
as they fit my recent lower-subreg.c patch into the DF framework. So
I think that imposing such a requirement would be unwise. I believe
it would be tantamount to saying that we can never make a major
infrastructure change. For tree-ssa (admittedly a much bigger change)
we accepted slowdowns when it was committed because we believed that
the end result would be better.

I think that you should channel your concern into looking at the code
on dataflow-branch and seeing how it can be improved, or determining
that it is hopeless. I don't think this is a discussion which can be
usefully carried on in the abstract.

Ian
Mark Mitchell
2007-02-12 21:03:29 UTC
Permalink
Post by Vladimir Makarov
On Sunday I accidentally had a chat about the df infrastructure on
IRC. I've got some thoughts which I'd like to share.
I have liked the df infrastructure code from day one for its clarity.
Unfortunately users don't see it and probably don't care about it.
You're right that users don't care what the guts of the compiler look
like. That's exactly right: they care only about the speed of the
compiler, the speed of the generated code, correctness of the generated
code, overall quality, etc.

However, my understanding (as someone who's not an expert on the DF code
base) is that, as you say, the new stuff is much tidier. I understood
the objective to be not so much that DF itself would directly improve
the generated code, but rather that it would provide much-needed
infrastructure that would allow future improvements. This is a lot like
TREE-SSA which, by itself, wasn't so much about optimization as it was
about providing a platform for optimization.

Obviously, making sure that the DF code isn't actively hurting us when
it goes into the compiler is part of the agreed-upon criteria. But, I
don't think it needs to markedly help at this point, as long as people
are comfortable with the interfaces and functionality it provides.
Post by Vladimir Makarov
So could somebody from the steering committee be kind enough to answer me:
Is it (prohibiting the use of other dataflow schemes) a steering
committee decision?
No, the Steering Committee has not considered this issue. As you
suggest, I don't think it's something the SC would want to consider;
it's a technical issue. I'd certainly think that it would be good for
all passes to use the DF infrastructure.

However, if there were really some special case where we could do
markedly better without it, and no feasible way to give the special-case
information to the core DF code, then I'm sure everyone would agree that
it made sense to use something different. But, that would be only in
extraordinary situations, rather than having lots of reinvention of the
same infrastructure.
--
Mark Mitchell
CodeSourcery
***@codesourcery.com
(650) 331-3385 x713
Vladimir N. Makarov
2007-02-13 04:25:21 UTC
Permalink
Post by Mark Mitchell
Post by Vladimir Makarov
On Sunday I accidentally had a chat about the df infrastructure on
IRC. I've got some thoughts which I'd like to share.
I have liked the df infrastructure code from day one for its clarity.
Unfortunately users don't see it and probably don't care about it.
You're right that users don't care what the guts of the compiler look
like. That's exactly right: they care only about the speed of the
compiler, the speed of the generated code, correctness of the generated
code, overall quality, etc.
However, my understanding (as someone who's not an expert on the DF code
base) is that, as you say, the new stuff is much tidier. I understood
the objective to be not so much that DF itself would directly improve
the generated code, but rather that it would provide much-needed
infrastructure that would allow future improvements. This is a lot like
TREE-SSA which, by itself, wasn't so much about optimization as it was
about providing a platform for optimization.
IMHO, it is not the right analogy. The right analogy would be a new RTL
(low-level IR). I know the insn scheduler and RA well, and I would not
exaggerate the DF importance that way, at least for these tasks.
Post by Mark Mitchell
Obviously, making sure that the DF code isn't actively hurting us when
it goes into the compiler is part of the agreed-upon criteria. But, I
don't think it needs to markedly help at this point, as long as people
are comfortable with the interfaces and functionality it provides.
Post by Vladimir Makarov
So could somebody from the steering committee be kind enough to answer me:
Is it (prohibiting the use of other dataflow schemes) a steering
committee decision?
No, the Steering Committee has not considered this issue. As you
suggest, I don't think it's something the SC would want to consider;
it's a technical issue. I'd certainly think that it would be good for
all passes to use the DF infrastructure.
However, if there were really some special case where we could do
markedly better without it, and no feasible way to give the special-case
information to the core DF code, then I'm sure everyone would agree that
it made sense to use something different. But, that would be only in
extraordinary situations, rather than having lots of reinvention of the
same infrastructure.
Thanks for the clarification, Mark.
Mark Mitchell
2007-02-13 04:45:29 UTC
Permalink
Post by Vladimir N. Makarov
Post by Mark Mitchell
However, my understanding (as someone who's not an expert on the DF code
base) is that, as you say, the new stuff is much tidier. I understood
the objective to be not so much that DF itself would directly improve
the generated code, but rather that it would provide much-needed
infrastructure that would allow future improvements. This is a lot like
TREE-SSA which, by itself, wasn't so much about optimization as it was
about providing a platform for optimization.
IMHO, it is not the right analogy. The right analogy would be a new RTL
(low-level IR). I know the insn scheduler and RA well, and I would not
exaggerate the DF importance that way, at least for these tasks.
I was not trying to suggest that DF is necessarily as sweeping a change
as TREE-SSA. Certainly, it's not a complete change to the
representation.

The sense in which I meant to suggest similarity is that DF is
infrastructure. It's not that DF is a new optimization; it's that it's
a facilitator of optimizations. The point is to get more accurate
information to the various RTL passes, in a more consistent way, so that
they can do their jobs better. And, to be able to adjust those passes
to use newer, better algorithms, which depend on easy access to dataflow
information. And, perhaps, to permit the creation of new optimization
passes that will only be useful because more accurate information is
available.

My point is that I don't think that the right criteria for DF is whether
it makes the generated code faster. So long as it doesn't make the
generated code slower, and so long as the APIs seem well designed, and
so long as it doesn't make the compiler itself too much slower, I think
it's a win.
--
Mark Mitchell
CodeSourcery
***@codesourcery.com
(650) 331-3385 x713
Vladimir Makarov
2007-02-13 16:25:33 UTC
Permalink
Wow, I got so many emails. I'll try to answer them in one email in
order not to repeat myself.
Post by Mark Mitchell
I was not trying to suggest that DF is necessarily as sweeping a change
as TREE-SSA. Certainly, it's not a complete change to the
representation.
It is not as sweeping a change as Tree-SSA. Tree-SSA is designed and
sharpened for global optimizations. Ken knows that well because he
is one of the inventors of SSA.

Let us look at the major RTL optimizations: combiner, scheduler, RA.
Do we need a global analysis for building def-use and use-def chains?
We don't need it for the combiner (only bb scope), we don't need it for
the scheduler (only a DAG region), and we don't need it at all for
reload. So building and using def-use/use-def chains is definitely
overkill and will make the compiler slower. A lot of work was done in
the scheduler to keep the dependencies manageable, because scheduling
is a quadratic algorithm; addressing that in the infrastructure just
means moving all this code there.

Some algorithms would benefit from a global analysis for def-use and
use-def chains (like gcse after reload, or the webizer which is
switched off by default), but it can be done without the fat structures
of the data flow infrastructure.

We need more accurate life analysis (liveness with availability). That
could be fixed in the current life analysis. They mention other
inaccuracies, whose importance I doubt, but those can be fixed there as
well. I agree the life analysis is a mess, but you can rewrite it
without introducing fat structures to calculate the relations.
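
For reference, the computation being discussed is the classical
backward problem: live_in(b) = use(b) | (live_out(b) & ~def(b)), where
live_out(b) is the union of live_in over b's successors. A toy solver
over at most 32 registers, just to show the shape (the real thing of
course uses per-block bitmaps):

/* Iterate the liveness equations to a fixed point.  Each block keeps
   USE (regs read before written in the block) and DEF (regs written)
   as 32-bit masks, plus up to two successor indices (-1 if absent).  */
struct bb_live
{
  unsigned int use, def;
  int succ[2];
  unsigned int live_in, live_out;
};

static void
compute_liveness (struct bb_live *bb, int n_blocks)
{
  int changed = 1;

  while (changed)
    {
      int b, i;

      changed = 0;
      /* Reverse order converges faster for a backward problem,
         but any order gives the same fixed point.  */
      for (b = n_blocks - 1; b >= 0; b--)
        {
          unsigned int out = 0, in;

          for (i = 0; i < 2; i++)
            if (bb[b].succ[i] >= 0)
              out |= bb[bb[b].succ[i]].live_in;
          in = bb[b].use | (out & ~bb[b].def);
          if (in != bb[b].live_in || out != bb[b].live_out)
            {
              bb[b].live_in = in;
              bb[b].live_out = out;
              changed = 1;
            }
        }
    }
}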

So I think they did not investigate the drawbacks of the df with its
fat structures, which imho are the reason for the slowness. They just
took the existing code of Michael Hayes without investigating
alternative slimmer representations. And a possible alternative
slimmer representation might change the interface as well.

So what remains is a duplicated, maybe faster, representation of insn
operands, whose creation time and memory cost are not justified.

That is not how tree-SSA was designed and developed over its 5 years.

Once again, I am not against a new df infrastructure. I am against
rushing it into the mainline. I understand that there are a lot of
things still to be investigated.

They are promising blue sky, but I don't see it now. Maybe I am
blind.

I understand we are sometimes under pressure from our employers.
I feel it too. But maybe one year is not enough for such work.
Post by Ian Lance Taylor
I don't really grasp where you are headed with this criticism.
Criticism can be a good thing. But I think it would help me, at
least, if you were more clear about what you see as the alternative.
I mentioned alternatives:

o investigating what analysis is really necessary for the major RTL passes
o fixing the life analysis code
o rewriting the life analysis code
o investigating a slimmer representation (maybe attaching it to the
rtl reg/subreg, although that is complicated because of pseudo-register
sharing before reload); see the sketch after this list
o trying to rewrite e.g. gcse after reload and seeing what really
happens
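
To make the "slimmer representation" point concrete, here is one
possible shape (purely hypothetical C, invented for illustration, not
a worked-out proposal):

/* All uses of each pseudo kept as one flat, per-register array built
   in a single scan, instead of a heavyweight ref object per operand
   linked into several chains.  */
struct reg_use
{
  int insn_uid;                 /* insn containing the use */
  int opno;                     /* which operand of that insn */
};

struct reg_use_list
{
  struct reg_use *uses;         /* uses in instruction order */
  int n_uses;
};

/* Indexed by pseudo register number; a def-use walk becomes a linear
   scan of one small array rather than a pointer chase.  */
extern struct reg_use_list *uses_of_reg;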

Now I am saying where this is headed. Many users skipped the 4.0
release. 4.1 was a good release. Imho 4.2 is probably another
candidate for skipping. I don't want to make 4.3 one more such
candidate because of the new DF infrastructure. I think producing two
releases in a row to be avoided is a luxury we cannot afford.

Changing the df without providing the older path means that probably
some ports will be broken. Including just infrastructure which makes
the compiler 5% slower is not reasonable either. I think such a change
should be included into the mainline only right after the transition
from stage 3 to stage 1. People will have more time to fix the broken
ports and maybe write something which shows the potential of the new
infrastructure.

I think it is too late to include the new df into the mainline. I
think achieving the criteria will take even more time.

I ask the steering committee to reconsider its decision on the
inclusion of the DF infrastructure in 4.3. It should be done as the
first step of stage 1 for the 4.4 release, of course only if it
achieves the merge criteria.
Richard Kenner
2007-02-13 16:40:37 UTC
Permalink
Post by Vladimir Makarov
Let us look at the major RTL optimizations: combiner, scheduler, RA.
Do we need a global analysis for building def-use and use-def chains?
We don't need it for the combiner (only bb scope),
Combine would work *far better* if it had some uniform data structure. For
one thing, that means it could work across blocks, which would be very
valuable. For another, there's a tremendous amount of highly-complex code
to do the sorts of tracking that are best done with a uniform infrastructure.
Post by Vladimir Makarov
we don't need it for the scheduler (only a DAG region)
Yes, but again, it would work better if it didn't.
Post by Vladimir Makarov
we don't need it at all for reload.
Again, a tremendous amount of kludgery in reload would be removed by
having accurate and better-tracked information.
David Edelsohn
2007-02-13 16:40:55 UTC
Permalink
Do you realize how confrontational your emails sound? Have you
considered asking about the technical reasoning and justification instead
of making unfounded assertions? Do you want everyone to refute your
incorrect facts point by point?

David
Vladimir Makarov
2007-02-13 17:14:06 UTC
Permalink
Post by David Edelsohn
Do you realize how confrontational your emails sound? Have you
considered asking about the technical reasoning and justification instead
of making unfounded assertions? Do you want everyone to refute your
incorrect facts point by point?
David, I am sorry if you feel that way. I am a bit confrontational.
But how can I not be, when I am alone trying to express concerns in
which I really believe?
Jeffrey Law
2007-02-13 17:27:13 UTC
Permalink
Post by Vladimir Makarov
Post by David Edelsohn
Do you realize how confrontational your emails sound? Have you
considered asking about the technical reasoning and justification instead
of making unfounded assertions? Do you want everyone to refute your
incorrect facts point by point?
David, I am sorry if you feel that way. I am a bit confrontational.
But how can I not be, when I am alone trying to express concerns in
which I really believe?
I think everyone would be best served if they realized none of this is
personal -- it's about technical decisions in an effort to improve GCC.

Jeff
David Edelsohn
2007-02-13 19:16:58 UTC
Permalink
Jeff> I think everyone would be best served if they realized none of this is
Jeff> personal -- it's about technical decisions in an effort to improve GCC.

GCC development is far from perfect. The recent model generally
seems to be effective, although there is plenty of room for improvement.

I am not trying to discourage feedback and comments, but
complaining is easy -- throwing stones is easy. The question is: How are
we going to improve it and/or fix it?

There are many things that we may want to do, but what can we do?

And most importantly: Are the alternatives really better or just
look better from the outside?


This discussion has strayed from the specific dataflow topic, but
for that specific project (and other projects), I would encourage the GCC
community to get involved and improve the design and implementation of the
feature.

Waiting for perfect is not a good strategy. Competition is good,
but GCC developers generally do not reject offers of assistance. A lot of
separate, duplicative projects may not get done, while compromising on a
single design with everyone contributing to the implementation has a
better chance of completion. In the long run, we get a single,
functional, complete, good but imperfect, and easier to maintain feature.

David
Steven Bosscher
2007-02-13 17:11:24 UTC
Permalink
Post by Vladimir Makarov
Wow, I got so many emails. I'll try to answer them in one email in
Let us look at the major RTL optimizations: combiner, scheduler, RA.
...PRE, CPROP, SEE, RTL loop optimizers, if-conversion, ... It is easy
to make your arguments look valid if you take it as a proposition that
only register allocation and scheduling ought to be done on RTL.

The reality is that GIMPLE is too high level (by design) to catch many
useful transformations performed on RTL. Think CSE of lowered
addresses, expanded builtins, code sequences generated for bitfield
operations and expensive instructions (e.g. mul, div). So we are
going to have more RTL optimizers than just regalloc and sched.

Many RTL optimizations still matter very much (disable some of them
and test SPEC again, if you're unconvinced). Having a uniform
dataflow framework for those optimizations is IMHO a good thing.
Post by Vladimir Makarov
Do
we need a global analysis for building def-use and use-def chains? We
don't need it for the combiner (only bb scope)
It seems to me that this limitation is only there because when combine
was written, the idea of "global dataflow information" was in the
"future work" section for most practical compilers. So, perhaps
combine, as it is now, does not need DU/UD chains. But maybe we can
improve passes like this if we re-implement them in, or migrate them
to a better dataflow framework.

Gr.
Steven
Vladimir Makarov
2007-02-13 19:22:36 UTC
Permalink
Post by Steven Bosscher
Post by Vladimir Makarov
Wow, I got so many emails. I'll try to answer them in one email in
Let us look at the major RTL optimizations: combiner, scheduler, RA.
...PRE, CPROP, SEE, RTL loop optimizers, if-conversion, ... It is easy
to make your arguments look valid if you take it as a proposition that
only register allocation and scheduling ought to be done on RTL.
The reality is that GIMPLE is too high level (by design) to catch many
useful transformations performed on RTL. Think CSE of lowered
addresses, expanded builtins, code sequences generated for bitfield
operations and expensive instructions (e.g. mul, div). So we are
going to have more RTL optimizers than just regalloc and sched.
Many RTL optimizations still matter very much (disable some of them
and test SPEC again, if you're unconvinced). Having a uniform
dataflow framework for those optimizations is IMHO a good thing.
Steven, I agree with you. I am not against a df infrastructure; a
well-defined and efficient one is always a good thing. As I wrote, I
even use the proposed DF in my RA project.

I am just trying to argue that the proposed df infrastructure is not
ready and might create serious problems for this release and for future
development because it is slow. Danny is saying that the beauty of the
infrastructure is that you improve it in just one place. I agree with
this partially. I am only afraid that the solution for a faster
infrastructure (e.g. another, slimmer data representation) might change
the interface considerably. I am not sure that I can convince anyone of
this. But I am more worried about the 4.3 release, and I really believe
that the inclusion of the data flow infrastructure should be the first
step of stage 1, to give people more time to solve at least some
problems.

In saying that, I have hurt the feelings of people who put a lot of
effort into the infrastructure, like Danny, Ken, and Seongbae, and I am
sorry for that.
Post by Steven Bosscher
Post by Vladimir Makarov
Do
we need a global analysis for building def-use and use-def chains? We
don't need it for the combiner (only bb scope)
It seems to me that this limitation is only there because when combine
was written, the idea of "global dataflow information" was in the
"future work" section for most practical compilers. So, perhaps
combine, as it is now, does not need DU/UD chains. But maybe we can
improve passes like this if we re-implement them in, or migrate them
to a better dataflow framework.
The combiner is an older approach to code selection. It was designed
by the same authors (Fraser and Proebsting) before they designed BURG.
I remember even intermediate approaches where a minimal cover of the
tree by subtrees representing the machine insns was sought with
context-free grammar parsers. Modern code selection like BURG (a
dynamic programming approach which tries to find a real *minimal cost*
cover) works on a tree of IR insns in one BB (or, in the more complex
case, on a DAG; in that case a non-optimal solution is used). I even
have my own tool for this, NONA: http://cocom.sf.net . Although it
might be good research to make it work on insns from different BBs.
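
For readers unfamiliar with the technique, here is a toy version of
the bottom-up dynamic programming that BURG/iburg-style generators
perform (two nonterminals and three operators, invented purely for
illustration; real generators derive the rule tables from a machine
description):

#include <limits.h>

enum op { OP_CONST, OP_REG, OP_ADD };
enum nt { NT_REG, NT_IMM, NT_MAX };

struct node
{
  enum op op;
  struct node *kid[2];          /* children, NULL if absent */
  int cost[NT_MAX];             /* min cost to reduce to each nt */
};

/* Label the tree bottom-up: every node gets the minimum cost of
   reducing it to each nonterminal, computed from its children.  */
static void
label (struct node *n)
{
  int i, nt;

  for (i = 0; i < 2; i++)
    if (n->kid[i])
      label (n->kid[i]);

  for (nt = 0; nt < NT_MAX; nt++)
    n->cost[nt] = INT_MAX / 2;  /* "infinite": no cover found yet */

  switch (n->op)
    {
    case OP_CONST:
      n->cost[NT_IMM] = 0;      /* rule imm: CONST, cost 0 */
      break;
    case OP_REG:
      n->cost[NT_REG] = 0;      /* rule reg: REG, cost 0 */
      break;
    case OP_ADD:
      {
        int l = n->kid[0]->cost[NT_REG];
        int r_reg = n->kid[1]->cost[NT_REG];
        int r_imm = n->kid[1]->cost[NT_IMM];

        /* rule reg: ADD (reg, reg), cost 1 (add instruction) */
        if (l + r_reg + 1 < n->cost[NT_REG])
          n->cost[NT_REG] = l + r_reg + 1;
        /* rule reg: ADD (reg, imm), cost 1 (add-immediate form) */
        if (l + r_imm + 1 < n->cost[NT_REG])
          n->cost[NT_REG] = l + r_imm + 1;
        break;
      }
    }

  /* Chain rule reg: imm, cost 1 (load-immediate instruction).  */
  if (n->cost[NT_IMM] + 1 < n->cost[NT_REG])
    n->cost[NT_REG] = n->cost[NT_IMM] + 1;
}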

The problem is that to use the modern approach you need another
description of the insns (with a one pattern - one machine insn
relation) in a tree representation with a given cost for each tree.
And it is a huge amount of work to rewrite the current machine
descriptions even just for this.
David Edelsohn
2007-02-13 19:54:18 UTC
Permalink
Vlad> I am just trying to argue that the proposed df infrastructure is not
Vlad> ready and might create serious problems for this release and for future
Vlad> development because it is slow. Danny is saying that the beauty of the
Vlad> infrastructure is that you improve it in just one place. I agree with
Vlad> this partially. I am only afraid that the solution for a faster
Vlad> infrastructure (e.g. another, slimmer data representation) might change
Vlad> the interface considerably. I am not sure that I can convince anyone of
Vlad> this. But I am more worried about the 4.3 release, and I really believe
Vlad> that the inclusion of the data flow infrastructure should be the first
Vlad> step of stage 1, to give people more time to solve at least some
Vlad> problems.

DF has been successfully tested on many more targets than
originally requested by the GCC SC. The original requirement for targets
was the same as for the Tree-SSA merge. Tree-SSA continued to be cleaned
up, fixed, and improved after it was merged. Tree-SSA performance
improved by the time of the release and was not required to be perfect on
day one.

DF will be good when merged and will continue to improve on
mainline in Stage 2. GCC previously has not had a requirement that a
patch be committed at the beginning of Stage 1.

We understand your concerns, but unsubstantiated assertions like
"might create serious problems..." are not very helpful or convincing
arguments. You are selectively quoting other developers and pulling their
comments out of context to support your objections.

Why, specifically, is the df infrastructure not ready? Have you
investigated the current status? Have you looked at the design documents,
implementation, and comments? Have you followed the mailing list
discussions and patches?

Why is it unacceptable for it to mature further on mainline like
Tree-SSA?

Why is it better to delay merging an announced, planned and
approved project that the developers believe is ready, which will impose
the complexity and work of maintaining a branch with invasive changes for
a full release cycle? It took a long time to fix all of the current users
of dataflow and recent mainline patches continue to introduce new bugs.

Why are the discussions about the current performance, known
performance problems, and specific plans for performance improvement
throughout the rest of the release cycle insufficient to address your
concerns?

David
Vladimir Makarov
2007-02-13 20:40:14 UTC
Permalink
Post by David Edelsohn
Vlad> I am just trying to argue that the proposed df infrastructure is not
Vlad> ready and might create serious problems for this release and for future
Vlad> development because it is slow. Danny is saying that the beauty of the
Vlad> infrastructure is that you improve it in just one place. I agree with
Vlad> this partially. I am only afraid that the solution for a faster
Vlad> infrastructure (e.g. another, slimmer data representation) might change
Vlad> the interface considerably. I am not sure that I can convince anyone of
Vlad> this. But I am more worried about the 4.3 release, and I really believe
Vlad> that the inclusion of the data flow infrastructure should be the first
Vlad> step of stage 1, to give people more time to solve at least some
Vlad> problems.
DF has been successfully tested on many more targets than
originally requested by the GCC SC. The original requirement for targets
was the same as for the Tree-SSA merge. Tree-SSA continued to be cleaned
up, fixed, and improved after it was merged. Tree-SSA performance
improved by the time of the release and was not required to be perfect on
day one.
DF will be good when merged and will continue to improve on
mainline in Stage 2. GCC previously has not had a requirement that a
patch be committed at the beginning of Stage 1.
We understand your concerns, but unsubstantiated assertions like
"might create serious problems..." are not very helpful or convincing
arguments. You are selectively quoting other developers and pulling their
comments out of context to support your objections.
There were too many of them. I hope you are not saying there is not a
single reason for my concerns.
Post by David Edelsohn
Why, specifically, is the df infrastructure not ready? Have you
investigated the current status? Have you looked at the design documents,
implementation, and comments? Have you followed the mailinglist
discussions and patches?
I did investigate the current status of the infrastructure on the
future mainstream processor Core2 (> 11% slower compiler, worse code,
and bigger code size). That is the reason why I started this. I have
known this infrastructure sufficiently well for a long time, and I have
always had concerns about the fat representation.
Post by David Edelsohn
Why is it unacceptable for it to mature further on mainline like
Tree-SSA?
Two releases in a row to be avoided. Not one real experiment of
rewriting an RTL optimization to figure out how def-use chains will
work.
Post by David Edelsohn
Why is it better to delay merging an announced, planned and
approved project that the developers believe is ready, which will impose
the complexity and work of maintaining a branch with invasive changes for
a full release cycle? It took a long time to fix all of the current users
of dataflow and recent mainline patches continue to introduce new bugs.
Why are the discussions about the current performance, known
performance problems, and specific plans for performance improvement
throughout the rest of the release cycle insufficient to address your
concerns?
Ok, at least I tried. I hope the 5% compiler time degradation
criterion is still in effect.
Steven Bosscher
2007-02-13 20:46:09 UTC
Permalink
Post by Vladimir Makarov
Post by David Edelsohn
Why is it unacceptable for it to mature further on mainline like
Tree-SSA?
Two releases in a row to be avoided. Not one real experiment of
rewriting an RTL optimization to figure out how def-use chains will
work.
Vlad, this FUD-spreading is beginning to annoy me. Please get your
view of the facts in order.

There *are* passes rewritten in the new framework to figure out how
this will work. In fact, some of those passes existed even before the
rest of the backend was converted to the new dataflow scheme. Existing
on trunk even now: fwprop, see, web, loop-iv. New on the branch: at
least auto-inc-dec.

Gr.
Steven
Vladimir Makarov
2007-02-13 21:04:56 UTC
Permalink
Post by Steven Bosscher
Post by Vladimir Makarov
Post by David Edelsohn
Why is it unacceptable for it to mature further on mainline like
Tree-SSA?
Two releases in a row to be avoided. Not one real experiment of
rewriting an RTL optimization to figure out how def-use chains will
work.
Vlad, this FUD-spreading is beginning to annoy me. Please get your
view of the facts in order.
There *are* passes rewritten in the new framework to figure out how
this will work. In fact, some of those passes existed even before the
rest of the backend was converted to the new dataflow scheme. Existing
on trunk even now: fwprop, see, web, loop-iv. New on the branch: at
least auto-inc-dec.
Sorry for that. Maybe I missed something. I should have been more
prepared before starting this discussion. I'll look at what is going
on with more attention. I only know that web was originally written on
DF, so the comparison (I mean speed, resources) is not possible.
David Edelsohn
2007-02-13 20:55:24 UTC
Permalink
Vlad> I did investigate the current status of the infrastructure on the
Vlad> future mainstream processor Core2 (> 11% slower compiler, worse code,
Vlad> and bigger code size). That is the reason why I started this.

You do not believe that this is a concern of others? You do not
believe that this will be addressed after the merge?

This could be an example of GCC optimization becoming more
aggressive with increased accuracy, causing increased register pressure
problems, which is particularly detrimental to GCC for IA-32.

Why don't we analyse and fix any problems instead of trying to
keep GCC's infrastructure weak and stupid to cover up its inadequacies?
Complaining about and blocking the merge of df does not solve the problem,
it only delays it.

David
Vladimir Makarov
2007-02-13 21:07:43 UTC
Permalink
Post by David Edelsohn
Vlad> I did investigate the current status of the infrastructure on the
Vlad> future mainstream processor Core2 (> 11% slower compiler, worse code,
Vlad> and bigger code size). That is the reason why I started this.
You do not believe that this is a concern of others? You do not
believe that this will be addressed after the merge?
This could be an example of GCC optimization becoming more
aggressive with increased accuracy, causing increased register pressure
problems, which is particularly detrimental to GCC for IA-32.
Why don't we analyse and fix any problems instead of trying to
keep GCC's infrastructure weak and stupid to cover up its inadequacies?
Complaining about and blocking the merge of df does not solve the problem,
it only delays it.
Once again, I agree to stop this discussion. This is my last email on
the thread. I just hope the thread was not a waste of people's time.
If it was, I apologize for that.
Paolo Bonzini
2007-02-13 20:02:21 UTC
Permalink
Post by Vladimir Makarov
I even have my own tool for this, NONA: http://cocom.sf.net . Although
it might be good research to make it work on insns from different BBs.
Of course, instruction selection is usually done intra-BB. However,
some analyses that combine performs, such as nonzero_bits and
num_sign_bit_copies, could perhaps be extended to use the new framework.
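
For readers who haven't looked at those functions: the flavor of
analysis is "known bits" propagation. A hypothetical C fragment, just
to show the idea (the real nonzero_bits works on rtx and machine modes
and handles far more cases):

/* Return a mask of bits that may be nonzero in the result, given
   may-be-nonzero masks A and B for the operands.  Any bit clear in
   the returned mask is known to be zero.  */
typedef unsigned long long mask_t;

enum toy_op { TOY_AND, TOY_IOR, TOY_ZERO_EXTEND8, TOY_ASHIFT };

static mask_t
nonzero_mask (enum toy_op op, mask_t a, mask_t b, int shift)
{
  switch (op)
    {
    case TOY_AND:
      /* A result bit can be set only if it may be set in both args.  */
      return a & b;
    case TOY_IOR:
      /* (Also correct for XOR.)  Set only if set in either arg.  */
      return a | b;
    case TOY_ZERO_EXTEND8:
      /* Zero-extending the low byte: upper bits are known zero.  */
      return a & 0xff;
    case TOY_ASHIFT:
      /* Left shift by a known constant amount.  */
      return a << shift;
    default:
      /* Anything we don't understand: every bit may be nonzero.  */
      return ~(mask_t) 0;
    }
}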
Post by Vladimir Makarov
The problem is that to use the modern approach you need another
description of the insns (with a one pattern - one machine insn
relation) in a tree representation with a given cost for each tree.
And it is a huge amount of work to rewrite the current machine
descriptions even just for this.
This is not really necessary. I have a stripped-down version of NONA
working on RTL. It is possible to build the tree using a mechanism
similar to subst in combine, simplify the resulting expression, and use
NONA (which actually implements the same algorithm as iburg) to split it.

Paolo
Vladimir Makarov
2007-02-13 20:47:06 UTC
Permalink
Post by Paolo Bonzini
Post by Vladimir Makarov
The problem is that to use the modern approach you need another
description of the insns (with a one pattern - one machine insn
relation) in a tree representation with a given cost for each tree.
And it is a huge amount of work to rewrite the current machine
descriptions even just for this.
This is not really necessary. I have a stripped-down version of NONA
working on RTL. It is possible to build the tree using a mechanism
similar to subst in combine, simplify the resulting expression, and
use NONA (which actually implements the same algorithm as iburg) to
split it.
I tried to do this 7 years ago without success. I am glad that you
have been successful. Thank you for doing this, Paolo. I never thought
that this code, created 15 years ago, could be used anywhere else
besides one internal production compiler.
Paolo Bonzini
2007-02-14 06:07:00 UTC
Permalink
Post by Vladimir Makarov
Post by Paolo Bonzini
Post by Vladimir Makarov
The problem is that to use the modern approach you need another
description of the insns (with a one pattern - one machine insn
relation) in a tree representation with a given cost for each tree.
And it is a huge amount of work to rewrite the current machine
descriptions even just for this.
This is not really necessary. I have a stripped-down version of NONA
working on RTL. It is possible to build the tree using a mechanism
similar to subst in combine, simplify the resulting expression, and
use NONA (which actually implements the same algorithm as iburg) to
split it.
I tried to do this 7 years ago without success. I am glad that you
have been successful. Thank you for doing this, Paolo. I never thought
that this code, created 15 years ago, could be used anywhere else
besides one internal production compiler.
I never said I finished. :-) In particular, my code does not do
combine-style simplifications.

Most current machine descriptions only have a one pattern - one machine
insn relation post-reload, which can be a problem. However, the current
RTX_COSTS are much more precise than 7 years ago, to the point that
combine actually uses them, and this somewhat offsets the problem.

Paolo
Steven Bosscher
2007-02-13 20:05:20 UTC
Permalink
Post by Vladimir Makarov
I am just trying to argue that the proposed df infrastructure is not
ready and might create serious problems for this release and for future
development because it is slow. Danny is saying that the beauty of the
infrastructure is that you improve it in just one place. I agree with
this partially. I am only afraid that the solution for a faster
infrastructure (e.g. another, slimmer data representation) might change
the interface considerably. I am not sure that I can convince anyone of
this. But I am more worried about the 4.3 release, and I really believe
that the inclusion of the data flow infrastructure should be the first
step of stage 1, to give people more time to solve at least some
problems.
I recall this wonderful quote of just a few days ago, which perfectly
expresses my feelings about the proposed merge of the dataflow branch
for GCC 4.3:

"I would hope that the
community would accept the major structural improvement, even if it is
not a 100% complete transition, and that we can then work on any
remaining conversions in the fullness of time."
-- Mark Mitchell, 11 Feb 2007 [1]

:-D

Gr.
Steven





[1] http://gcc.gnu.org/ml/gcc-patches/2007-02/msg01012.html
Seongbae Park
2007-02-13 21:09:03 UTC
Permalink
On 2/13/07, Vladimir Makarov <***@redhat.com> wrote:
...
Post by Vladimir Makarov
I am only afraid that the solution for a faster infrastructure
(e.g. another, slimmer data representation) might change the interface
considerably. I am not sure that I can convince anyone of this. But I
am more worried about the 4.3 release, and I really believe that the
inclusion of the data flow infrastructure should be the first step of
stage 1, to give people more time to solve at least some problems.
Vlad,

I'm really interested in hearing what aspect of the current interface
is not right. Can you give us just a rough sketch of what the slimmer
(hence better) data representation would look like, and why the current
interface won't be ideal for it? It's not too late - if there's
anything we can do to change the interface so that it can accommodate a
potentially better implementation in the future, I won't object to it -
it's just that I haven't talked to you to figure out what you have in
mind.
Post by Vladimir Makarov
In saying that, I have hurt the feelings of people who put a lot of
effort into the infrastructure, like Danny, Ken, and Seongbae, and I am
sorry for that.
No feelings hurt. Thanks for all your feedback.
More eyes, especially experienced ones like yours, can only help.
Also, thanks for trying out DF on Core2
- we need to look more closely at why those regressions are there and
what we can do to fix them; before such an evaluation, I can't tell
whether this is a serious fundamental problem or some easily fixable
things that we haven't gotten to yet.
--
#pragma ident "Seongbae Park, compiler, http://seongbae.blogspot.com"
Richard Kenner
2007-02-14 12:35:06 UTC
Permalink
Post by Vladimir Makarov
The combiner is an older approach to code selection.
Combine can be considered both as code selection and as optimization.
Likewise CSE. In many cases, especially now that we have the tree
optimizers, CSE does more code selection (best choice of operand) than
CSE proper. So you could say that CSE and combine are, in some sense,
doing the same thing but with different data structures.

Well before GCC 4.x there was an attempt that a few of us worked on to try to
move the algebraic simplifications from combine to simplify-rtx.c, just like
we put tree folding in fold-const.c. At that point, combine just becomes
"bookkeeping". The problem with that approach was what to do with such
things as nonzero_bits and num sign bit copies. But if those (and more!)
become part of a global DF infrastructure, then they are available in *all*
RTL passes.

Next, one can finish the task of moving the simplifications to simplify-rtx.c
(since the global information will always be around) and much of the rest of
combine is replaced by the incremental update part of DF.

That means we could merge CSE and combine and do a *lot* more than we ever
were able to do before while having the code for both passes be much simpler.
Paolo Bonzini
2007-02-14 12:54:36 UTC
Permalink
Post by Richard Kenner
Well before GCC 4.x there was an attempt that a few of us worked on to try to
move the algebraic simplifications from combine to simplify-rtx.c, just like
we put tree folding in fold-const.c. At that point, combine just becomes
"bookkeeping". The problem with that approach was what to do with such
things as nonzero_bits and num sign bit copies.
Actually, simplify-rtx.c now uses nonzero_bits and num_sign_bit_copies:
these ask combine for the value in the case of pseudos, via the "RTL
hooks" mechanism. The same RTL hooks mechanism allows combine to try
using its (clobber (const_int 0)) idiom in gen_lowpart, for example. It
was done in 4.1 or 4.2, I don't remember which, and it allowed quite a
few simplifications to be moved from combine_simplify_rtx to
simplify-rtx.c.

It is possible to move a lot more, indeed. The problems are mainly
these two:

1) what to do with (clobber (const_int 0)). This should not be so much
of a problem thanks to validate_change, but I'd be wary of having such
CLOBBER rtx-en in REG_EQUAL notes!

2) a lot of this is hard to do incrementally. For example, it is
hard to move any one of simplify_and_const_int, make_compound_operation,
expand_compound_operation, force_to_mode, simplify_shift_const without
moving the others. This means that the patches would be huge. Or it
could be possible to move simplify_comparison, but it is a 1200-line
function, so the almost-mechanical changes needed to move it would also
be very hard to review.

The problem with fold_rtx is the other way round; it's hard to move the
lookup logic into simplify-rtx.c even using RTL hooks.
Post by Richard Kenner
That means we could merge CSE and combine and do a *lot* more than we ever
were able to do before while having the code for both passes be much simpler.
Unfortunately, we should not forget CSE's secondary functionality, which
is to act as garbageman for the GCSE pass.

Paolo
Richard Kenner
2007-02-14 14:36:26 UTC
Permalink
Post by Paolo Bonzini
these ask combine for the value in the case of pseudos, via the "RTL
hooks" mechanism.
Right. That was certainly a step (and was discussed a while ago), but doing
it more globally would make it even easier.
Post by Paolo Bonzini
1) what to do with (clobber (const_int 0)). This should not be so much
of a problem thanks to validate_change, but I'd be wary of having such
CLOBBER rtx-en in REG_EQUAL notes!
Just return NULL. The philosophy of simplify_rtx is different from
combine's. In the former, you're being asked "can you simplify this?",
while in the latter there's a lot of intertwining between
simplification and substitution. So you can just return "no, I can't
simplify it" and combine will then do what it wants with that result.
Post by Paolo Bonzini
2) a lot of this is hard to do incrementally. For example, it is
hard to move any one of simplify_and_const_int, make_compound_operation,
expand_compound_operation, force_to_mode, simplify_shift_const without
moving the others.
Indeed. And you can't just move them intact because of the change of
philosophy. I originally wrote those routines and so am the most
familiar with them. Once the infrastructure is ready, I probably
should take a look at what's involved.
Post by Paolo Bonzini
Post by Richard Kenner
That means we could merge CSE and combine and do a *lot* more than we ever
were able to do before while having the code for both passes be much simpler.
Unfortunately, we should not forget CSE's secondary functionality, which
is to act as garbageman for the GCSE pass.
Right, but much of that functionality is more in the arena of operand selection
than true "cse".
Paolo Bonzini
2007-02-14 14:59:55 UTC
Permalink
[trimming down the Cc list]
Post by Richard Kenner
Post by Paolo Bonzini
1) what to do with (clobber (const_int 0)). This should not be so much
of a problem thanks to validate_change, but I'd be wary of having such
CLOBBER rtx-en in REG_EQUAL notes!
Just return NULL. The philosophy of simplify_rtx is different from
combine's. In the former, you're being asked "can you simplify this?",
while in the latter there's a lot of intertwining between
simplification and substitution. So you can just return "no, I can't
simplify it" and combine will then do what it wants with that result.
Yes, one possibility is to use an RTL hook for this too. By default
you would return NULL (and this would propagate up); in combine you
could override it to return the CLOBBER.

To some extent, simplify-rtx.c could *know* about CLOBBER. It would
just not return it except in combine.

Paolo
Richard Kenner
2007-02-14 15:17:23 UTC
Permalink
Post by Paolo Bonzini
Yes, one possibility is to use an RTL hook for this too. By default
you would return NULL (and this would propagate up); in combine you
could override it to return the CLOBBER.
I really don't see why. Look at when combine calls the simplify routines now.
If they return zero, it uses the original value. If the simplify routines
do more, that will still be true. I don't see the point in preserving
the CLOBBER kludge in an environment that no longer needs it.
Post by Paolo Bonzini
To some extent, simplify-rtx.c could *know* about CLOBBER.
It *could*, but I think it would be much cleaner if it *didn't*.
Paolo Bonzini
2007-02-14 15:27:57 UTC
Permalink
Post by Richard Kenner
Post by Paolo Bonzini
Yes, one possibility is to use a RTX hook for this too. By default you
would return NULL (and this would propagate up); in combine you could
override it to return the CLOBBER.
I really don't see why. Look at when combine calls the simplify routines now.
If they return zero, it uses the original value.
Some of the combine simplifications (you obviously know this) work by
"hoping" that the CLOBBER is simplified away. I don't think you can
preserve all their power if you propagate NULL. In most cases you can
replace CLOBBER with NULL, but I don't think that's possible everywhere.

Paolo
Richard Kenner
2007-02-14 16:35:45 UTC
Permalink
Post by Paolo Bonzini
Some of the combine simplifications (you obviously know this) work by
"hoping" that the CLOBBER is simplified away. I don't think you can
preserve all their power if you propagate NULL. In most cases you can
replace CLOBBER with NULL, but I don't think that's possible everywhere.
Yeah, but those are quite rare, and I think each could probably be
handled some other way on a case-by-case basis.

David Edelsohn
2007-02-12 21:26:24 UTC
Permalink
Vlad> I especially did not like David Edelsohn's phrase "and no new
Vlad> private dataflow schemes will be allowed in gcc passes". It was not
Vlad> his first such statement. Such phrases kill competition, which is
Vlad> bad for gcc. What if a new specialized scheme is faster? What if
Vlad> somebody decides to write another, better df infrastructure from
Vlad> scratch to solve the coming df infrastructure problems?

First, "another better df infrastructure" is not a private
dataflow scheme. If someone else wants to rewrite it from scratch and do
a better job, be my guest. Most commercial compilers rewrite their
infrastructure every 5 to 10 years; GCC accretes kludges. The other
commercial compilers learn from their algorithms and previous design to
implement a new, maintainable infrastructure that meets the needs of all
algorithms.

Second, this has nothing to do with competition. As I and others
explained on the IRC chat, the new df is general infrastructure. If you
can speed it up more, that's great. If you need another dataflow problem
solved, add it to the infrastructure. GCC is not served well by five (5)
different dataflow solvers, each with its own quirks, bugs, duplicative
memory and duplicative maintenance. It would be a waste to improve GCC's
infrastructure and then have the hard work undermined by recreating the
duplication without good justification.

Third, I am disappointed that you chose to make this argument
personal.

David
Vladimir Makarov
2007-02-12 21:39:57 UTC
Permalink
Post by David Edelsohn
Third, I am disappointed that you chose to make this argument
personal.
David, I really apologize for making it personal. We are all one
community and we are all trying to make gcc a better compiler.
Richard Kenner
2007-02-12 23:44:56 UTC
Permalink
Post by David Edelsohn
Vlad> I especially did not like David Edelsohn's phrase "and no new
Vlad> private dataflow schemes will be allowed in gcc passes". It was not
Vlad> his first such statement. Such phrases kill competition, which is
Vlad> bad for gcc. What if a new specialized scheme is faster? What if
Vlad> somebody decides to write another, better df infrastructure from
Vlad> scratch to solve the coming df infrastructure problems?
First, "another better df infrastructure" is not a private
dataflow scheme. Second, this has nothing to do with competition. As I
and others explained on the IRC chat, the new df is general
infrastructure. If you can speed it up more, that's great. If you need
another dataflow problem solved, add it to the infrastructure. GCC is
not served well by five (5) different dataflow solvers, each with its
own quirks, bugs, duplicative memory and duplicative maintenance.
I agree. "Competition" and making things "better" is not always the
right approach for a project the size of GCC; in fact, it's more often
*not* the right approach. In a project like GCC, stability is more
important. Once something is written, it's obsolete. But changing
something in the infrastructure (or duplicating it) affects stability,
and the gain is rarely worth it.
Vladimir N. Makarov
2007-02-13 03:50:17 UTC
Permalink
Post by Richard Kenner
Post by David Edelsohn
Vlad> I especially did not like David Edelsohn's phrase "and no new
Vlad> private dataflow schemes will be allowed in gcc passes". It was not
Vlad> his first such statement. Such phrases kill competition, which is
Vlad> bad for gcc. What if a new specialized scheme is faster? What if
Vlad> somebody decides to write another, better df infrastructure from
Vlad> scratch to solve the coming df infrastructure problems?
First, "another better df infrastructure" is not a private
dataflow scheme. Second, this has nothing to do with competition. As I
and others explained on the IRC chat, the new df is general
infrastructure. If you can speed it up more, that's great. If you need
another dataflow problem solved, add it to the infrastructure. GCC is
not served well by five (5) different dataflow solvers, each with its
own quirks, bugs, duplicative memory and duplicative maintenance.
I agree. "Competition" and making things "better" is not always the
right approach for a project the size of GCC; in fact, it's more often
*not* the right approach. In a project like GCC, stability is more
important. Once something is written, it's obsolete. But changing
something in the infrastructure (or duplicating it) affects stability,
and the gain is rarely worth it.
In general, I agree with this. I am just objecting to the strong words
"not allowed". I think it should be decided in each concrete case.
Richard Kenner
2007-02-13 13:17:11 UTC
Permalink
Post by Vladimir N. Makarov
In general, I agree with this. I am just objecting to the strong words
"not allowed". I think it should be decided in each concrete case.
I agree that "not allowed" was a poor choice of words in a project
such as GCC and, although we, as a project, do tend to make decisions
one at a time, it's hard for me to see how there could be justification
for a "private" dataflow structure, even if "better" - but one never
knows ...
Kenneth Zadeck
2007-02-13 00:06:25 UTC
Permalink
Post by Vladimir Makarov
On Sunday I accidentally had a chat about the df infrastructure on
IRC. I've got some thoughts which I'd like to share.
I have liked the df infrastructure code from day one for its clarity.
Unfortunately users don't see it and probably don't care about it.
From my point of view the df infrastructure has a design flaw: it
extracts a lot of information about the RTL and keeps it on the side.
That does not make the code fast. It would be ok if we got better code
quality. Danny told me that they get 1.5% better code using df. That
is a really big improvement (about half a year of work for the whole
compiler team according to Proebsting's law). IMHO, it could justify
the promised 5% compiler slowdown.
Vlad,

I think that different people can have different perspectives.

You have been working on improving the register allocation for several
years, but very little has come of it because the reload
infrastructure does not lend itself to being integrated with modern
register allocators. You have spent several years of work without
touching the underlying problem that reload is generally going to
defeat almost any effort to get good benefits out of a new register
allocator. I do not want to denigrate your work in any way, but at
the end of the day, any new register allocator will be compromised by
the existing reload implementation.

I am interested in bringing the rest of the back end into the modern
world. While some of the passes can and should be moved into the ssa
middle end of the compiler, there are several optimizations that can
only be done after the details of the target have been fully exposed.

My experience with trying to do this was that the number one problem
was that the existing dataflow is in many cases wrong or too
conservative, and that it was not flexible enough to accommodate the
most modern optimization techniques. So rather than hack around the
problem, I decided to attack the bad infrastructure problem first, and
open the way for myself and the others who work on the back end to
benefit from that infrastructure and get the rest of the passes into
shape.

There are certainly performance issues here. There are limits on how
much I, and the others who have worked on this, have been able to
change before we do our merge. So far, only those passes that were
directly hacked into flow, such as dce and auto-inc-dec detection, have
been rewritten from the ground up to fully utilize the new framework.

However, it had gotten to the point where the two frameworks really
could not coexist. Both implementations expect to work in an
environment where the information is maintained from pass to pass, and
doing that with two systems was not workable. So the plan accepted by
the steering committee accommodates the wholesale replacement of the
dataflow analysis, but even after the merge there will still be many
passes that will be changed.

I would have liked to have the df information more tightly integrated
into the rtl rather than it being on the side. It is cumbersome to
keep this information up to date. However, the number of places in
the backends that depend on the existing rtl data structure APIs makes
such a replacement very difficult.

I do believe that by the time we merge the branch, we will be down to
a 5% compile time regression. While I would like this number to be 0%
or negative, I personally believe that having precise and correct
information is worth it, and that over time we will be able to remove
that 5% penalty.

As far as the other regressions, these will be dealt with very soon.

Kenny
Vladimir N. Makarov
2007-02-13 02:40:40 UTC
Permalink
Post by Kenneth Zadeck
Vlad,
I think that different people can have different perspectives.
You have been working on improving the register allocation for several
years, but very little has come of it because the reload
infrastructure does not lend itself to being integrated with modern
register allocators. You have spent several years of work without
touching the underlying problem that reload is generally going to
defeat almost any effort to get good benefits out of a new register
allocator. I do not want to denigrate your work in any way, but at
the end of the day, any new register allocator will be compromised by
the existing reload implementation.
Ken, to be exact I've been working for a bit more than 2 years on the
register allocator itself. Probably you don't know that I attacked exactly
the underlying problem - the reload. If you look at the YARA branch, it is a
register allocator without reload. And I am really worried that
little has come of it. But getting rid of reload is such a complex
problem. I cannot work on it for a few more years. I need some results
too.

Using the experience I've got from the YARA branch, I've created another
register allocator (the IRA branch) to make it ready for gcc-4.4. IRA
still uses reload. But maybe I have higher standards. I don't want
to include the code for the sake of inclusion. The code generated by IRA
is no worse than that from the current register allocator. The code size
is smaller for most platforms. For some platforms I get better
generated code (up to 4% on SPECINT2000 in 32-bit mode: to be exact, 1930
vs 1850 for Core2 according to this weekend's benchmarking).

Actually I could make IRA ready for gcc-4.3. It works for x86, x86_64,
itanium, ppc, sparc, s390, arm. It is optional, so other platforms can
use the current register allocator. But I don't want to rush.

But still you are right that reload is compromising the generated code.
If you are interested in my opinion, the df infrastructure is a tiny part
of the RA implementation problem (and, as I understand it, of the insn
scheduler and code selection too). Actually IRA uses the df
infrastructure, but it can easily be switched to the old life analysis.
Post by Kenneth Zadeck
I am interested in bringing the rest of the back end into the modern
world. While some of the passes can and should be moved into the ssa
middle end of the compiler, there are several optimizations that can
only be done after the details of the target have been fully exposed.
Bringing the rest of the back end into the modern world is a very
challenging task. If you really want it, imho you should attack RTL and
the machine descriptions. But this task is an order of magnitude more
difficult than introducing tree-SSA.
Post by Kenneth Zadeck
My experience with trying to do this was that the number one problem
was that the existing dataflow is in many cases wrong or too
conservative and that it was not flexible enough to accommodate many
of the most modern optimization techniques. So rather than hack around the
problem, I decided to attack the bad infrastructure problem first, and
open the way for myself and the others who work on the back end to
benefit from that infrastructure to get the rest of the passes into shape.
I am not against a new DF infrastructure, a more accurate one. I am
against a slower infrastructure.
Post by Kenneth Zadeck
There are certainly performance issues here. There are limits on
how much I, and the others who have worked on this, have been able to
change before we do our merge. So far, only those passes that were
directly hacked into flow, such as dce, and auto-inc-dec detection
have been rewritten from the ground up to fully utilize the new
framework. However, it had gotten to the point where the two
frameworks really should not coexist. Both implementations expect
to work in an environment where the information is maintained from
pass to pass and working with two systems was not workable. So the
plan accepted by the steering committee accommodates the wholesale
replacement of the dataflow analysis but even after the merge, there
will still be many passes that will be changed.
Does it mean that the compiler will be even slower?
Post by Kenneth Zadeck
I would have liked
to have the df information more tightly integrated into the rtl
rather than it being on the side. It is cumbersome to keep this
information up to date. However, the number of places in the
backends that depend on the existing RTL data structure APIs makes
such a replacement very difficult. I do believe that by the time
that we merge the branch, we will be down to a 5% compile time
regression. While I would like this number to be 0% or negative, I
personally believe that having precise and correct information is
worth it and that over time we will be able to remove that 5%
penalty. As far as the other regressions, these will be dealt with
very soon.
Great!
Steven Bosscher
2007-02-13 05:24:08 UTC
Permalink
Post by Vladimir N. Makarov
Post by Kenneth Zadeck
There are certainly performance issues here. There are limits on
how much I, and the others who have worked on this, have been able to
change before we do our merge. So far, only those passes that were
directly hacked into flow, such as dce, and auto-inc-dec detection
have been rewritten from the ground up to fully utilize the new
framework. However, it had gotten to the point where the two
frameworks really should not coexist. Both implementations expect
to work in an environment where the information is maintained from
pass to pass and working with two systems was not workable. So the
plan accepted by the steering committee accommodates the wholesale
replacement of the dataflow analysis but even after the merge, there
will still be many passes that will be changed.
Does it mean that the compiler will be even slower?
No, it will mean the compiler will be faster. Sooner if you help. You
seem to believe that the DF infrastructure is fundamentally slower
than flow is. I believe that there are other reasons for the current
differences in compile time.

AFAICT the current compile time slowdowns on the dataflow branch are due to:

* bitmaps bitmaps bitmaps. We badly need a faster bitmap implementation.

* duplicate work on insn scanning (see the sketch after this list):
  1. DF scans all insns and makes accurate information available.
  2. Many (most) passes see it and think, "Hey, I can do that myself!",
     and they rescan all insns for no good reason.
  The new passes that use the new infrastructure are among the fastest in
  the RTL path right now. The slow passes are the passes doing their own
  thing (CSE, GCSE, regmove, etc.).

* duplicate work between passes (minor):
- on the trunk, regmove can make auto increment insns
- on the df branch, the auto-inc-dec pass makes those
transformations redundant

* earlier availability of liveness information:
- On the trunk we compute liveness for the first time just before combine
- On the dataflow branch, we have liveness already after the first CSE pass
Updating it between CSE and combine over ~20 passes is probably costly
compared to doing nothing on the trunk. (I believe having cfglayout mode
early in the compiler will help reduce this cost thanks to no iterations in
cleanup_cfg)
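
As a toy illustration of the insn-scanning point above (invented names,
not the branch's actual API): once one scanner has built the use counts,
a later pass can answer its question from the cache instead of walking
every insn again.

#define MAX_REGS 128

struct insn { int uses[4]; int n_uses; };      /* toy stand-in for RTL */

struct scan_cache { int use_count[MAX_REGS]; };

/* The df-style approach: one scan over all insns, done up front.  */
static void
scan_all (const struct insn *insns, int n, struct scan_cache *cache)
{
  for (int i = 0; i < n; i++)
    for (int j = 0; j < insns[i].n_uses; j++)
      cache->use_count[insns[i].uses[j]]++;
}

/* A later pass answers "is this register used?" from the cache in O(1)
   instead of rescanning every insn the way cse/gcse/regmove each do.  */
static int
reg_used_p (const struct scan_cache *cache, int regno)
{
  return cache->use_count[regno] > 0;
}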

Maybe I overestimate the cost of some of these items, and maybe I'm
missing a few items. But the message is the same: There is still
considerable potential for speeding up GCC using the new dataflow
infrastructure.

Gr.
Steven
Kenneth Zadeck
2007-02-13 14:04:49 UTC
Permalink
Post by Vladimir N. Makarov
Post by Kenneth Zadeck
Vlad,
I think that different people can have different perspectives.
You have been working on improving the register allocation for several
years, but very little has come of it because the reload
infrastructure does not lend itself to being integrated with modern
register allocators. You have spent several years of work without
touching the underlying problem that reload is generally going to
defeat almost any effort to get good benefits out of a new register
allocator. I do not want to denigrate your work in any way, but at
the end of the day, any new register allocator will be compromised by
the existing reload implementation.
Ken, to be exact I've been working for a bit more than 2 years on the
register allocator itself. Probably you don't know that I attacked exactly
the underlying problem - the reload. If you look at the YARA branch, it is a
register allocator without reload. And I am really worried that
little has come of it. But getting rid of reload is such a complex
problem. I cannot work on it for a few more years. I need some results
too.
Using the experience I've got from the YARA branch, I've created another
register allocator (the IRA branch) to make it ready for gcc-4.4. IRA
still uses reload. But maybe I have higher standards. I don't want
to include the code for the sake of inclusion. The code generated by IRA
is no worse than that from the current register allocator. The code size
is smaller for most platforms. For some platforms I get better
generated code (up to 4% on SPECINT2000 in 32-bit mode: to be exact, 1930
vs 1850 for Core2 according to this weekend's benchmarking).
Actually I could make IRA ready for gcc-4.3. It works for x86, x86_64,
itanium, ppc, sparc, s390, arm. It is optional, so other platforms can
use the current register allocator. But I don't want to rush.
But still you are right that reload is compromising the generated code.
If you are interested in my opinion, the df infrastructure is a tiny part
of the RA implementation problem (and, as I understand it, of the insn
scheduler and code selection too). Actually IRA uses the df
infrastructure, but it can easily be switched to the old life analysis.
Post by Kenneth Zadeck
I am interested in bringing the rest of the back end into the modern
world. While some of the passes can and should be moved into the ssa
middle end of the compiler, there are several optimizations that can
only be done after the details of the target have been fully exposed.
Bringing the rest of the back end into the modern world is a very
challenging task. If you really want it, imho you should attack RTL and
the machine descriptions. But this task is an order of magnitude more
difficult than introducing tree-SSA.
It is a hard project and you are right that replacing rtl would be
better. However, I do not know how to do that, either from a logistical
point of view or from the point of view of having a better replacement
that covers all of the platforms as well.

However, there are a lot of sins in the back end, and a large number of
them are either being directly addressed by this replacement or are now
accessible. The addition of df will allow others to introduce better
technology.
Post by Vladimir N. Makarov
Post by Kenneth Zadeck
My experience with trying to do this was that the number one problem
was that the existing dataflow is in many cases wrong or too
conservative and that it was not flexible enough to accommodate many
of the most modern optimization techniques. So rather than hack around the
problem, I decided to attack the bad infrastructure problem first, and
open the way for myself and the others who work on the back end to
benefit from that infrastructure to get the rest of the passes into shape.
I am not against a new DF infrastructure, a more accurate one. I am
against a slower infrastructure.
Post by Kenneth Zadeck
There are certainly performance issues here. There are limits on
how much I, and the others who have worked on this, have been able to
change before we do our merge. So far, only those passes that were
directly hacked into flow, such as dce, and auto-inc-dec detection
have been rewritten from the ground up to fully utilize the new
framework. However, it had gotten to the point where the two
frameworks really should not coexist. Both implementations expect
to work in an environment where the information is maintained from
pass to pass and working with two systems was not workable. So the
plan accepted by the steering committee accommodates the wholesale
replacement of the dataflow analysis but even after the merge, there
will still be many passes that will be changed.
Does it mean that the compiler will be even slower?
Post by Kenneth Zadeck
I would have liked
to have the df information more tightly integrated into the rtl
rather than it being on the side. It is cumbersome to keep this
information up to date. However, the number of places in the
backends that depend on the existing RTL data structure APIs makes
such a replacement very difficult. I do believe that by the time
that we merge the branch, we will be down to a 5% compile time
regression. While I would like this number to be 0% or negative, I
personally believe that having precise and correct information is
worth it and that over time we will be able to remove that 5%
penalty. As far as the other regressions, these will be dealt with
very soon.
Algorithmically, the core of the dataflow branch (the scanning, the
setting up of the problems, and the solution finder) is pretty good.
People with the skill set and the interest are still going to be able to
shave a little time here and there, but that is not likely to be where
the big performance gains are going to come from.

The likely source of many of the performance issues is that we are
still having to maintain some of the older, outdated data structures.
Regs_ever_live is the poster child of this. In theory regs_ever_live is
easy: it is just the set of hard registers that are used. In practice
this is a disaster to keep track of, because it was only updated
occasionally and its values are "randomly" changed by the backends in
totally undocumented ways. Maintaining regs_ever_live requires a lot of
special mechanism that slows down the incremental scanning. To the
extent that someone just wants to know whether some register is used, df
has accessible counters for each reg that give the exact number of uses
and defs. However, because of the manual fiddling of regs_ever_live, it
is impossible to just replace one structure with another. It is quite
likely that 1% of the slowdown can be charged to having to maintain this
duplicate structure.

These older data structures will, with time, be replaced with direct
access to df's "information on the side", but for now it is just too
much work to try to fix them all for the merge.
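
A toy contrast (invented names, nothing from GCC itself): the
regs_ever_live style is a flag array that the backends may flip by hand,
so its meaning drifts, while the df style keeps exact counters that only
the scanner updates and that a query simply reads.

#define MAX_REGS 128

/* Old style: a flag anyone may set or clear at any time, in
   undocumented ways.  */
static char regs_ever_live_flag[MAX_REGS];

/* df style: exact counters that only the (re)scanner updates.  */
static int reg_def_count[MAX_REGS];
static int reg_use_count[MAX_REGS];

static void
scanner_record_def (int regno) { reg_def_count[regno]++; }

static void
scanner_record_use (int regno) { reg_use_count[regno]++; }

/* "Is this register ever used?" becomes a pure query; there is no
   separate bit to keep in sync by hand.  */
static int
reg_ever_used_p (int regno)
{
  return reg_def_count[regno] + reg_use_count[regno] > 0;
}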
Post by Vladimir N. Makarov
Great!
Richard Kenner
2007-02-13 14:29:38 UTC
Permalink
Post by Kenneth Zadeck
Regs_ever_live is the poster child of this. In theory regs_ever_live is
easy: it is just the set of hard registers that are used. In practice
this is a disaster to keep track of, because it was only updated
occasionally and its values are "randomly" changed by the backends in
totally undocumented ways. Maintaining regs_ever_live requires a lot of
special mechanism that slows down the incremental scanning.
The history here, by the way, is that it was originally very simple and just
supposed to provide a "quick and easy" way of having a conservative view of
which registers *weren't* ever used. So it was set when a register might
possibly be used. That was indeed easy.

But then people wanted to be able to know *for sure* which registers were
used, so mechanisms were added to clear it out when we knew a register
*wasn't* used, which added the complexity you mention.

This is a problem with a lot of the ad-hoc structures used: they were
originally meant for one specific purpose and often used very locally and
were reasonably well-designed for that purpose, but then were used more
globally and/or for other purposes and they weren't quite so well designed
for that purpose anymore, but nobody went to the trouble to change them.

I strongly support a new, common infrastructure that will allow all of these
older structures to be replaced. But the history is important in my opinion
because it means that we need to think as generally as possible and to ensure
we come up with as broad a structure as possible, in order both to replace
the current structures and to support many other uses in the future. From
what I understand, the current mechanism does that, but I think it's
important to keep this criterion in mind when evaluating any possible
"competitors".
Kenneth Zadeck
2007-02-13 15:29:01 UTC
Permalink
Post by Richard Kenner
Post by Kenneth Zadeck
Regs_ever_live is the poster child of this. In theory regs_ever_live is
easy: it is just the set of hard registers that are used. In practice
this is a disaster to keep track of, because it was only updated
occasionally and its values are "randomly" changed by the backends in
totally undocumented ways. Maintaining regs_ever_live requires a lot of
special mechanism that slows down the incremental scanning.
The history here, by the way, is that it was originally very simple and just
supposed to provide a "quick and easy" way of having a conservative view of
which registers *weren't* ever used. So it was set when a register might
possibly be used. That was indeed easy.
But then people wanted to be able to know *for sure* which registers were
used, so mechanisms were added to clear it out when we knew a register
*wasn't* used, which added the complexity you mention.
This is a problem with a lot of the ad-hoc structures used: they were
originally meant for one specific purpose and often used very locally and
were reasonably well-designed for that purpose, but then were used more
globally and/or for other purposes and they weren't quite so well designed
for that purpose anymore, but nobody went to the trouble to change them.
I strongly support a new, common infrastructure that will allow all of these
older structures to be replaced. But the history is important in my opinion
because it means that we need to think as generally as possible and to ensure
we come up with as broad a structure as possible, in order both to replace
the current structures and to support many other uses in the future. From
what I understand, the current mechanism does that, but I think it's
important to keep this criterion in mind when evaluating any possible
"competitors".
I very much understand the "I just need to make this one little tweak"
road to hell. By definition, this was no one's plan. It will most
likely take us all of stage 2 to understand and replace all of the
little tweaks with respect to regs_ever_live.
Daniel Berlin
2007-02-13 04:56:05 UTC
Permalink
Post by Vladimir Makarov
On Sunday I had accidentally chat about the df infrastructure on
IIRC. I've got some thoughts which I'd like to share.
I like df infrastructure code from the day one for its clearness.
Unfortunately users don't see it and probably don't care about it.
With my point of view the df infrastructure has a design flaw. It
extracts a lot of information about RTL and keep it on the side. It
does not make the code fast. It would be ok if we got a better code
quality. Danny told me that they have 1.5% better code using df.
I also said we hadn't run numbers in about a year and a half, because
it wasn't our main goal anymore.

...
Post by Vladimir Makarov
especially if we have no alternative faster path (even if the old live
analysis is a mess).
I also pointed out that df's merge criterion is no more than a 5%
compile time degradation.
So what are you worried about here?

Life analysis isn't just a mess, it's inaccurate, and intertwined
with dead store elimination and dead code elimination.
Post by Vladimir Makarov
Even rewriting the current optimizations on the new data flow
infrastructure makes the situation worse, because it will not be easy
to get rid of the data flow infrastructure, since the flaw is probably
partly in the df interface.
What?
The flaw we have now is that every pass creates its own
data structures and dataflow, and it is completely impossible to make
dataflow faster without rewriting every single pass.

With DF, you could make every single pass faster simply by improving ... DF!

If the data structures it has don't work well enough for a given pass, of
course, you can add your own as df problems and results.
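
For readers who have not seen one, here is a minimal generic sketch (not
the branch's code; all names are invented) of the kind of solver df
centralizes: backward liveness over basic blocks, iterated to a fixed
point. The private schemes in the older passes reimplement variations of
this loop.

#include <string.h>

#define MAX_SUCC 2
#define WORDS    4              /* bitset covering 128 registers */

struct bb {
  unsigned long use[WORDS], def[WORDS];    /* local, per-block facts */
  unsigned long live_in[WORDS], live_out[WORDS];
  int succ[MAX_SUCC], n_succ;
};

/* Iterate the classic backward liveness equations to a fixed point:
   live_out(b) = union of live_in over the successors of b
   live_in(b)  = use(b) | (live_out(b) & ~def(b))  */
static void
solve_liveness (struct bb *bbs, int n)
{
  int changed = 1;
  while (changed)
    {
      changed = 0;
      for (int b = n - 1; b >= 0; b--)  /* reverse order suits a backward problem */
        {
          unsigned long out[WORDS] = { 0 }, in[WORDS];

          /* Confluence: merge the successors' live_in sets.  */
          for (int s = 0; s < bbs[b].n_succ; s++)
            for (int w = 0; w < WORDS; w++)
              out[w] |= bbs[bbs[b].succ[s]].live_in[w];

          /* Transfer function for this block.  */
          for (int w = 0; w < WORDS; w++)
            in[w] = bbs[b].use[w] | (out[w] & ~bbs[b].def[w]);

          if (memcmp (in, bbs[b].live_in, sizeof in) != 0)
            changed = 1;
          memcpy (bbs[b].live_in, in, sizeof in);
          memcpy (bbs[b].live_out, out, sizeof out);
        }
    }
}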
Post by Vladimir Makarov
So it might create problems in the future.
In particular I did not like David Edelsohn's phrase "and no new
private dataflow schemes will be allowed in gcc passes". It was not
his first such statement. Such phrases kill competition, which
is bad for gcc. What if a new specialized scheme is faster? What
if somebody decides to write another, better df infrastructure from
scratch to solve the coming df infrastructure problems?
If you want to rewrite DF, please do.
But honestly, GCC has enough separate solvers that simply are not
faster than the df branch's solver anymore. We know; we replaced a lot
of them.

And that's the thing. We had to go and replace every single one of
these, when if they had just used df's solver in the first place (and
taken the 1-2% slowdown they probably faced), they would all just have
been sped up.
Worse, some of these solvers were buggy or inaccurate, so now that we
give them better information, faster, we have to go fix bugs that never
would have existed had they reused the infrastructure we provided.
This is, in fact, a lot of what has taken up df branch time: fixing
bugs that fixing the dataflow exposed.
Post by Vladimir Makarov
I am not in opposition to the merge if it satisfies the merge criteria.
People have done a lot of work. It is too late. I should have opposed
the criteria when they were discussed. Sorry, I missed the discussion,
if there was such a discussion. I am just raising some questions and
saying that more work will be needed on the df infrastructure even after
the merge.
There is always more work to be done.

BTW, I'll happily remove DF when all that is left of RTL is the
scheduler, RA, and instruction selector.
Hell, I'll throw a party.

But i wouldn't hold your breath for this to happen. :)
--Dan
Steven Bosscher
2007-02-13 21:34:42 UTC
Permalink
Post by Vladimir Makarov
Getting 0.5% code degradation and 11.5% slowness
(308 sec vs 275 sec for compiling SPECINT2000) does not seem reasonable
Just to be sure: Did you build with --disable-checking for both
compilers? I often find myself comparing compilers with checking
enabled, so, you know, just checking... ;-)
Thanks,

Gr.
Steven