1 Jan 2003 15:44
gcc optimizes loops badly.
Joakim Tjernlund <Joakim.Tjernlund <at> lumentis.se>
2003-01-01 14:44:51 GMT
I have spent some time optimizing the crc32 function, since JFFS2 uses it heavily. I found that
gcc 2.95.3 optimizes loops badly; even gcc 2.96 RH produces better code for x86 in some cases.
So I optimized the C code a bit and got much better results.
Now I wonder how recent gcc (>= 3.2) performs. Could somebody run gcc -S -O2 -mregnames on the
functions below and mail me the results?
Jocke
These are different versions of the same crc32 function:
#include <linux/types.h>
extern const __u32 crc32_table[256];
/* Return a 32-bit CRC of the contents of the buffer. */
__u32 crc32org(__u32 val, const void *ss, unsigned int len)
{
    const unsigned char *s = ss;
    while (len--) {
        val = crc32_table[(val ^ *s++) & 0xff] ^ (val >> 8);
    }
    return val;
}
__u32 crc32do_while(__u32 val, const void *ss, unsigned int len)
{
    const unsigned char *s = ss;
(Continue reading)
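For anyone who wants to try the first variant outside the kernel tree, here is a minimal user-space sketch. It makes several assumptions that are not in the post: uint32_t stands in for __u32, the table is generated locally from the reflected CRC-32 polynomial 0xEDB88320 instead of using the kernel's crc32_table, and crc32_dowhile_sketch is a hypothetical do-while form written only for comparison. It is not the author's crc32do_while, whose body is cut off above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: local stand-in for the kernel's crc32_table, built from
 * the reflected CRC-32 polynomial 0xEDB88320. */
static uint32_t crc32_table[256];

static void crc32_init_table(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i;
        for (int bit = 0; bit < 8; bit++)
            c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : c >> 1;
        crc32_table[i] = c;
    }
}

/* Same loop as crc32org in the post, with uint32_t in place of __u32. */
uint32_t crc32org(uint32_t val, const void *ss, unsigned int len)
{
    const unsigned char *s = ss;
    while (len--)
        val = crc32_table[(val ^ *s++) & 0xff] ^ (val >> 8);
    return val;
}

/* Hypothetical do-while form, NOT the author's crc32do_while.
 * The len check is needed because a do-while body always runs once. */
uint32_t crc32_dowhile_sketch(uint32_t val, const void *ss, unsigned int len)
{
    const unsigned char *s = ss;
    if (len) {
        do {
            val = crc32_table[(val ^ *s++) & 0xff] ^ (val >> 8);
        } while (--len);
    }
    return val;
}

/* Checks both variants against the standard CRC-32 check value for
 * "123456789" (initial value and final XOR of 0xFFFFFFFF). */
static void crc32_selfcheck(void)
{
    static const char msg[] = "123456789";

    crc32_init_table();
    assert((crc32org(0xFFFFFFFFu, msg, 9) ^ 0xFFFFFFFFu) == 0xCBF43926u);
    assert(crc32_dowhile_sketch(0xFFFFFFFFu, msg, 9) ==
           crc32org(0xFFFFFFFFu, msg, 9));
}
```

Compiling this file with gcc -S -O2 (adding -mregnames on PowerPC) should give assembly for both loops that can be compared side by side; call crc32_selfcheck() from a small main() to confirm the two forms agree before reading the output.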
The original code simply accessed the 'phy_status' in the data structure
as a volatile object. The modification from Wolfgang makes any data
structure access volatile, and then updates the 'phy_status' only once
at the end.
If it makes something work better for Wolfgang, that's fine
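To make the difference between the two approaches concrete, here is a rough sketch in C. Everything in it apart from the phy_status name is an assumption made for illustration: the struct layout, the link_events field, the flag values and both function names are hypothetical, and the actual driver code is not shown in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout; only the phy_status name comes from the thread. */
struct port_info {
    uint32_t phy_status;
    uint32_t link_events;
};

#define PHY_LINK_UP     0x1u   /* hypothetical flag values */
#define PHY_FULL_DUPLEX 0x2u

/* Style 1: only phy_status is accessed through a volatile lvalue, so
 * every |= below is a separate load and store of that one field. */
void update_field_volatile(struct port_info *p, int link, int duplex)
{
    volatile uint32_t *status = &p->phy_status;

    *status = 0;
    if (link)
        *status |= PHY_LINK_UP;
    if (duplex)
        *status |= PHY_FULL_DUPLEX;
    p->link_events++;          /* ordinary, non-volatile access */
}

/* Style 2: the whole structure is reached through a volatile pointer,
 * so every member access is volatile; the status is built in a local
 * variable and stored to phy_status once, at the end. */
void update_struct_volatile(volatile struct port_info *p, int link, int duplex)
{
    uint32_t s = 0;

    if (link)
        s |= PHY_LINK_UP;
    if (duplex)
        s |= PHY_FULL_DUPLEX;
    p->link_events++;          /* volatile access */
    p->phy_status = s;         /* single volatile store */
}
```

Both functions leave the structure in the same final state; what differs is how many accesses to phy_status the compiler has to emit and whether the other members are also treated as volatile.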