ZFR: No argument from me there.
But also remember that "speed" is not the be-all and end-all of a program. Unless you're working on dedicated software where every cycle counts, or on a competition entry or similar, it's often better (for general code maintenance, for example) to have slower but cleaner code in real software. If a fast function uses lots of gotos and ends up as awful spaghetti code that's difficult to modify later, then I'd rather have the slower, neater one.
The first thing I notice is that people equate speed with readability problems. Usually that comes from resorting to "tricks", like my pass-by-reference trick, which would've thrown most people off. I understand the need to favor readability over speed, but if a function is used a lot and isn't going to be maintained much, you should focus on speed and size. The Pareto principle applies: 20% of the code runs 80% of the time, so if you want to control your CPU requirements, changes made to that 20% will affect 80% of your requirements.
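The general shape of that trick is something like this (the names here are just for illustration): instead of returning a big object by value, the caller hands in the output buffer by reference, so a hot loop can reuse its capacity call after call.

    #include <string>
    #include <vector>

    // "Trick" version: the caller supplies the output buffer by
    // reference, so repeated calls reuse its allocated capacity.
    // Faster in a hot loop, but the out-parameter is easy to misread.
    void splitWords(const std::string& s, std::vector<std::string>& out) {
        out.clear();
        std::string word;
        for (char c : s) {
            if (c == ' ') {
                if (!word.empty()) { out.push_back(word); word.clear(); }
            } else {
                word += c;
            }
        }
        if (!word.empty()) out.push_back(word);
    }

    // Obvious version: returns a fresh vector every call. Easier to
    // read at the call site, but pays for a new allocation each time.
    std::vector<std::string> splitWordsByValue(const std::string& s) {
        std::vector<std::string> out;
        splitWords(s, out);
        return out;
    }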
Space optimization is another animal, but there the biggest problems I see are not caused by readability; they come from a lack of understanding of the tools, or from design flaws in the tools themselves. I'd have to look at it case by case to elaborate.
If I see (n % 2) == 0 in code, I immediately think it's testing for integer parity; if I see n & 1, I immediately think it's testing the last bit. As part of a larger piece of code, the first makes it immediately clear what the author meant and why it's there, while with the latter I might have to pause and think for a second before realizing why it's used.
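To make the two spellings concrete, here's a trivial side-by-side:

    #include <cstdio>

    bool isEvenMod(unsigned n) { return (n % 2) == 0; }  // reads as "parity test"
    bool isEvenAnd(unsigned n) { return (n & 1) == 0; }  // reads as "last-bit test"

    int main() {
        // Prints the same result in both columns; for unsigned (or
        // known non-negative) n the two are equivalent, and modern
        // compilers typically emit the same instruction for either.
        for (unsigned n = 0; n < 4; ++n)
            std::printf("%u -> %d %d\n", n, isEvenMod(n), isEvenAnd(n));
    }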
That pause comes from lack of familiarity. Since I naturally test parity with &, when I see % 2 my mind goes through the division remainder first, and only then remembers why we're taking a remainder from division by 2 rather than dividing by 2: to check parity. My rule is simply "I use % when narrowing a set of choices that isn't governed by powers of 2." Knowing the power-of-2 thing, if I'm designing something whose borders only have to be around a number rather than exactly at it, I try to land them on a power of 2 just so I can throw them into an &, especially if it's something I need a lot of (plotting random stars on a background, for example). It won't save much if a complex RNG is involved, but if you're hand-coding a custom RNG, since the randomness doesn't matter as long as the resulting pattern isn't too obvious, you can benefit a lot (more so from your custom RNG than from the sneaky border trick).
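Something like this, say, for the star field; xorshift is picked here just as an example of a cheap hand-rolled generator, and 256 as an example border width:

    #include <cstdint>
    #include <cstdio>

    static std::uint32_t rngState = 0xC0FFEEu;  // any nonzero seed works

    // Cheap hand-rolled RNG (xorshift32): the output just has to avoid
    // an obvious pattern, which is all a star field needs.
    static std::uint32_t xorshift32() {
        rngState ^= rngState << 13;
        rngState ^= rngState >> 17;
        rngState ^= rngState << 5;
        return rngState;
    }

    int main() {
        // The field is 256 wide on purpose: 256 is a power of 2, so
        // "& 255" wraps a coordinate exactly like "% 256" would.
        for (int i = 0; i < 10; ++i) {
            std::uint32_t r = xorshift32();
            unsigned x = r & 255;          // same result as r % 256
            unsigned y = (r >> 8) & 255;
            std::printf("star at (%u, %u)\n", x, y);
        }
    }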
Of course in this example it's as simple as adding a single comment:
// testing for parity and this is the fastest method
But in general that's not always the case. I've seen my share of convoluted code that might be faster but is a nightmare to maintain. Faster, maybe, but at what cost?
In my experience, convoluted code isn't the result of optimization but of people desperately trying to make things work within a deadline: forgetting that they declared three variables they no longer use, deleting a conditional for debugging, then re-adding it without removing the brackets left over from the first removal, and all sorts of other silly things. When optimization is unreadable, it's usually because it comes with obscure practices like the pass-by-reference trick, or sticking the word "inline" in front of a function when half the people reading don't even know what that means.
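For reference, since it came up: "inline" is a hint (plus, in C++, a linkage rule that lets the definition live in a header) suggesting the compiler substitute the function body at the call site instead of emitting a call.

    // The compiler may replace calls to square() with the expression
    // itself, avoiding call overhead; it's a hint, not a guarantee.
    inline int square(int x) { return x * x; }

    int plusOneSquared(int n) { return square(n) + 1; }  // may compile to n*n + 1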
kohlrak: Maybe, but most of those target VMs as well, which is an important distinction. You'll also find that sizes aren't specified in languages that don't have explicit int types (like BASIC). My point was that a 32-bit "int" isn't a reliable standard, whereas other standards are reliable (like the one in my intended solution).
dtgreene: Well, C and C++ do have a standard header (<stdint.h> in C, <cstdint> in C++) that declares types like uint32_t, which is useful if you need an integer of a specific size (in this case, an unsigned 32-bit integer).
http://en.cppreference.com/w/cpp/types/integer
Rust, which does not run in a VM (it runs on bare metal, and has even been used for bare-metal embedded development), uses types like these as its only integer types (u32 being the equivalent of C's uint32_t).
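For example (C++ shown; C just swaps in <stdint.h>):

    #include <cstdint>

    std::uint32_t mask  = 0xFFFFFFFFu;  // exactly 32 bits, unsigned, on any conforming platform
    std::int16_t  delta = -5;           // exactly 16 bits, signed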
With C and C++, though, you do run the risk that the header isn't available. The full standard library isn't required for a freestanding C/C++ implementation, which matters if you're working with microcontrollers like the ATmega328P, Cortex-M parts, etc. The standard library is nice, but you should always know how to live without it, just in case you're handed a project where you have to.
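Living without it looks something like this: hand-picking the types for one known target. The widths below assume one particular AVR-style toolchain where int is 16 bits and long is 32; they'd have to be re-checked per compiler, which is exactly the portability problem being discussed.

    /* Assumed widths for one specific 8-bit AVR toolchain only: */
    typedef unsigned char  u8;   /* 8 bits on this target  */
    typedef unsigned int   u16;  /* 16 bits on this target */
    typedef unsigned long  u32;  /* 32 bits on this target */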
Rust I'm honestly not familiar with. Given that level of typing, you now have me interested enough to google it. I like low-level things, and C/C++ are starting to gravitate way too far from the metal for my needs.