Dec. 1, 2006, 10:35 a.m.

C Does Not Mean Fast

I keep getting into arguments where people say, "Well, if we need it fast, we'll write it in C." C programs can be fast, but getting there often takes a lot more effort, and it's not always clear what that effort is.

This reminds me of a story a mentor once told me. He was a mainframe programmer at some large company that holds all of the details of your day-to-day life. Back then, everyone programmed in assembler, but he was a wild man. He thought people should be working in C. The other programmers scoffed at him and pointed out that he was a fool for thinking he could do better than assembler. I mean, it's machine level, there's no overhead! But he did it anyway, and with a fraction of their code, his app ran a few times faster.

There are numerous articles out there that report similar results. Mark C. Chu-Carroll's The "C is Efficient" Language Fallacy provides some interesting insight. You can also browse around at one of the many shootout sites, such as Debian's P4 Sandbox or Gentoo's interactive results. In many of these shootouts, you'll see C compilers sitting near the top on average. However, the gap between C and the languages that follow is not terribly large.

The important thing to note is that, at the highest level of aggregation, these are average results on micro-benchmarks. If you look at the results of individual tests, C isn't always in the lead. Sometimes it falls behind quite a bit, in fact. The question is whether it falls behind for your app.

However, that question is more complicated than it first appears. Many options control compiler optimizations (speaking, at least, for gcc), and an optimization that used to benefit your application can become detrimental with the addition of as little as one line of code.

-funroll-loops is a good example of this. It's not enabled by default, even with -O3 (or -O6 as I apparently like to call it). Unrolling a loop is useful if the unrolled loop is still small enough to fit in the processor's instruction cache. It is a terrible idea otherwise. You may have a case where unrolling loops sped your code up considerably, and then you increase the code size just enough to blow the cache, and it all comes crumbling down.

Just for a comparison, using what I happen to have handy right now and what fits the arguments I've been getting into, let's consider my C runtime vs. my Java runtime. For these tests, I have a C program and a Java program that are about as identical as I could make them, and I'm going to run a couple of different scenarios through here.

Test Scenarios

C Compiled with -O6 -funroll-loops

dustintmb:/tmp 568% ./perftest 10000 100000000
125.486u 0.599s 2:07.23 99.0%   0+0k 0+0io 0pf+0w

C Compiled with -O6

Note that in this test I did not unroll loops. This had a severe performance impact.

dustintmb:/tmp 569% ./perftest 10000 100000000
1003.843u 5.058s 17:01.43 98.7% 0+0k 0+1io 0pf+0w

Java (assertions enabled)

dustintmb:/tmp 570% java -ea -server PerfTest 10000 100000000
128.864u 0.506s 2:10.12 99.4%   0+0k 0+22io 0pf+0w

Conclusion

While I had to spend a bunch of effort telling the C compiler which optimizations fit this code best, Java just kind of figured it out on the fly. That is, it can theoretically unroll some loops and not others at runtime, and I don't have to know or care.

You can't conclude much from these tests, other than that in this scenario C isn't enough faster than Java for speed alone to decide the language choice; you still have to consider what the task is.

And that really is the point. Run your own tests and figure out whether it makes a difference to you. You may be surprised.
