I feel like that’s only true if I were asked to “write the assembly for this C++ program.” If I’m actually implementing something big in assembly, I’m not going to do 90% of the craziness someone might be tempted to do in C++. Just because something is super easy in C++ doesn’t mean it’s easy for the CPU. Writing assembly, I’m going to do what’s easy (and efficient) for the CPU because now I’m working in the same domain.
The bottom line is that cranking up the optimization level can get you a 2-5x win, while using memory efficiently can give you a 10-100x win.
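To make the memory point a bit more concrete, here’s a minimal sketch of the classic case: the same sum computed over the same buffer, once walking memory in address order and once striding across it. This is only my illustration, not a benchmark; `Matrix`, `make_matrix`, `sum_row_major`, and `sum_column_major` are made-up names, and how big the gap is (if any) depends on the matrix size, the cache hierarchy, and what the compiler does with the loops.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper type: a rows x cols matrix stored row-major
// in one flat buffer, so element (r, c) lives at data[r * cols + c].
struct Matrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<std::uint32_t> data;
};

// Fills the matrix with a small repeating pattern (0..6) so the
// expected sum is easy to check by hand.
Matrix make_matrix(std::size_t rows, std::size_t cols) {
    assert(rows > 0 && cols > 0);
    Matrix m{rows, cols, std::vector<std::uint32_t>(rows * cols)};
    for (std::size_t i = 0; i < m.data.size(); ++i) {
        m.data[i] = static_cast<std::uint32_t>(i % 7);
    }
    return m;
}

// Cache-friendly order: consecutive iterations touch adjacent
// elements, so each fetched cache line is fully used.
std::uint64_t sum_row_major(const Matrix& m) {
    std::uint64_t total = 0;
    for (std::size_t r = 0; r < m.rows; ++r) {
        for (std::size_t c = 0; c < m.cols; ++c) {
            total += m.data[r * m.cols + c];
        }
    }
    return total;
}

// Same arithmetic, but the inner loop jumps `cols` elements at a
// time, so on a large matrix most accesses land on a different
// cache line than the previous one.
std::uint64_t sum_column_major(const Matrix& m) {
    std::uint64_t total = 0;
    for (std::size_t c = 0; c < m.cols; ++c) {
        for (std::size_t r = 0; r < m.rows; ++r) {
            total += m.data[r * m.cols + c];
        }
    }
    return total;
}
```

Both functions return the same value, and the optimizer sees essentially the same work in each. If you time them on a matrix much larger than your last-level cache, any difference you measure comes from the access pattern rather than the optimization level.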
I shouldn’t have used C++ as the example. Even C would work. I agree with everything you’re saying except the original premise. I think if you pitted hand-written ASM against C, C++, Rust, etc., the performance wins would split close to 50/50.
I’m not the best assembly guy, and I’m not advocating we all write it. But I’ve always felt that the compiler optimization assumption was wrong, or at least weak. In my own assembly, everything would be aligned nicely for my sanity, not for performance =]