1 Jun 2005 01:08
[RFC] Early inlining pass
Jan Hubicka <jh <at> suse.cz>
2005-05-31 23:08:50 GMT
Hi,
this patch adds an early inlining pass that inlines the "obvious" stuff (i.e.
functions whose body is smaller than the call construct, and always-inline
functions). This is followed by cleanup_cfg and the true inlining pass.
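
To make the "obvious" criterion concrete, here is a minimal sketch of such a
decision, not the code from the patch. The struct callee_info, the call_cost()
estimate and early_inline_p() are hypothetical names invented for
illustration; real size estimates in GCC are computed differently.

```c
#include <stdbool.h>

/* Hypothetical summary of a callee; not GCC's real data structure. */
struct callee_info {
    const char *name;
    int body_size;        /* estimated size of the function body */
    bool always_inline;   /* function is marked always-inline */
};

/* Assumed cost of the call construct itself: one unit for the call
   plus one unit per argument that has to be set up. */
static int call_cost(int num_args)
{
    return 1 + num_args;
}

/* Return true if the call is "obvious" to inline early: the callee is
   always-inline, or its body is smaller than the call it replaces. */
static bool early_inline_p(const struct callee_info *callee, int num_args)
{
    if (callee->always_inline)
        return true;
    return callee->body_size < call_cost(num_args);
}
```

Under these assumptions a two-unit accessor called with two arguments is
inlined early, while a 50-unit function is left for the true inlining pass
unless it is always-inline.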
The early pass helps in the following scenarios:
1) With tree profiling we no longer have a counter for each inlined call.
This makes a surprising difference in tramp3d, where we have a few hundred
dummy calls for each useful instruction executed, so execution time with
-fprofile-generate -ftree-based-profiling goes down from 10 minutes to 6
seconds; RTL profiling is at 4.5 seconds.
2) Inlining a body we have already inlined into is slightly cheaper than
inlining the body and its callers all the time. This seems to account for
up to an 11% compile-time difference on Gerald's testcase (but inlining
decisions are affected too, so it is difficult to judge); performance and
code size, however, remain (almost) unchanged. It also seems to show up in
the SPEC build and GCC bootstrap, as bootstrap time seems improved by about
30-100 seconds (even if this is close to the noise factor).
3) We have better chances that pre-inline optimization would be useful for
C++. I experimented with this on the tree-profiling branch, and while it does
a reasonable (i.e. measurable) job for SPEC2000, it is hardly useful for C++,
as the tiny functions are almost unoptimizable in isolation.
And now the problems - I am not sure how to interface properly to the IPA pass
manager. At the moment I have a pass early_local_passes that is an IPA pass,
and the local passes (profiling and cleanup_cfg currently) are run as its
subpasses. I am letting the pass manager switch from IPA to non-IPA by
iterating over all functions whenever there is a subpass of any IPA pass. I
am not sure if this is the way we want to go (perhaps we might add some IPA
property to make this
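
The pass manager arrangement described in the last paragraph (an IPA pass
whose local subpasses are run by iterating over all functions) could look
roughly like the sketch below. All types and names here (struct pass,
struct function, execute_pass_list, count_visit) are hypothetical and are
not GCC's pass manager interface; top-level non-IPA passes are left out
for brevity.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a function in the compilation unit. */
struct function {
    const char *name;
    int times_visited;   /* how many local passes have run on it */
};

/* Hypothetical pass descriptor; an IPA pass may own local subpasses. */
struct pass {
    const char *name;
    bool is_ipa;
    void (*run_ipa)(struct function *fns, size_t n);  /* whole-unit work */
    void (*run_local)(struct function *fn);           /* per-function work */
    struct pass *sub;    /* first subpass, or NULL */
    struct pass *next;   /* next pass at the same level, or NULL */
};

/* Example local pass body: just records that it ran on the function. */
static void count_visit(struct function *fn)
{
    fn->times_visited++;
}

/* Run a list of IPA passes.  Whenever an IPA pass has subpasses, switch
   to non-IPA mode by iterating over all functions and running every
   subpass on each of them in turn. */
static void execute_pass_list(struct pass *p, struct function *fns, size_t n)
{
    for (; p != NULL; p = p->next) {
        if (!p->is_ipa)
            continue;    /* top-level local passes are not modelled here */
        if (p->run_ipa != NULL)
            p->run_ipa(fns, n);
        for (size_t i = 0; i < n; i++)
            for (struct pass *s = p->sub; s != NULL; s = s->next)
                if (s->run_local != NULL)
                    s->run_local(&fns[i]);
    }
}
```

With early_local_passes modelled as an IPA pass whose subpasses are
profiling and cleanup_cfg, every function is visited once per subpass.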