Avoiding Unnecessary Work

We’ve covered the low-level cache and hardware details that affect software performance. Let’s now move to a higher level of abstraction. There are three general ways to make a program faster:

1. Do less work
2. Do work faster
3. Do the work in parallel (Amdahl’s law)
Great stuff. I didn't know about the likely / __builtin_expect() GCC hint. I see how apt the publication name, Delayed Branch, is.
If the CPU encounters branchyFunction() more than a few times (more than once? two, three, or four times?), can its branch predictor essentially perform the same optimization the next time it sees that cmp %rax, 0 instruction at that location?