1 Oct 2006 01:27
gcc optimizations
John Richard Moser <nigelenki <at> comcast.net>
2006-09-30 23:27:29 GMT
It looks like the redundancy analyzer in gcc is broken. My first hint came after 6 hours of work, when I finally narrowed some interesting performance measurements from nbench[1]* down to Partial Redundancy Elimination. My second was popping into #gcc on OFTC and being told the RA is "stupid" and doesn't work so hot on x86.

It turns out that -O2 turns -ftree-pre on by default, and -Os turns it off, so certain operations were quite awesomely faster with -Os (and others slower). After fiddling with tons of optimizations I eventually traced the SAME hit -O2 takes to this ONE optimization. I disabled it during an -O2 compile of nbench, and lo and behold, the numbers looked a lot better. Details are on the Wiki at [2].

Somebody should plug some real-world application into this, like rendering a JPEG image, Web page, or Ogg Vorbis file, and measure how long it takes in real time using both '-O2' and '-O2 -fno-tree-pre'. With -fno-tree-pre, some code seems slightly slower, while most code seems significantly faster.

[1] http://www.tux.org/~mayer/linux/bmark.html
[2] http://wiki.laptop.org/go/User:Bluefoxicy/gcc_optimizations

* nbench is highly CPU intensive; it does no IO benchmarks. The CPU will get hot. I compiled it on a tmpfs mounted on /tmp/x to avoid touching flash.

--
All content of all messages exchanged herein is left in the Public Domain, unless otherwise explicitly stated.
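
For anyone who wants to try the real-world comparison suggested above, here is a rough sketch of a timing harness. Only the two flag sets ('-O2' and '-O2 -fno-tree-pre') come from the message. The function names, the "bench-" binary names, the default of 5 runs, and the assumption that the program builds from a single C source file are all made up for illustration.

```python
import subprocess
import time

# Flag sets taken from the message above; everything else is illustrative.
FLAG_SETS = {
    "O2": ["-O2"],
    "O2-no-pre": ["-O2", "-fno-tree-pre"],
}


def gcc_command(source, output, flags, cc="gcc"):
    """Build the compiler command line for one flag set."""
    return [cc, *flags, "-o", output, source]


def best_wall_time(cmd, runs=5):
    """Run cmd several times; return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def compare(source, workload_args=(), runs=5):
    """Compile source once per flag set, then time each binary on the
    same workload. Assumes a single-file C program (hypothetical)."""
    results = {}
    for name, flags in FLAG_SETS.items():
        binary = "./bench-" + name
        subprocess.run(gcc_command(source, binary, flags), check=True)
        results[name] = best_wall_time([binary, *workload_args], runs)
    return results
```

Something like compare("decode.c", ["sample.jpg"]) (both file names hypothetical) would return a dict of best-of-N wall-clock seconds per flag set. Taking the minimum of several runs is just one way to damp noise from other processes; a real project with its own build system would need the flags passed through CFLAGS instead.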