I got baited by this r/rust post about Zig being faster than Rust. That claim is not the focus of this tidbit; rather, it's this comment in the same post:
After a short excursion, I learned something new!
Inclusive ranges are slower than their equivalent exclusive ranges.
The inclusive range version requires additional checks because the upper bound may be the largest value of the integer type. Without the check, the iteration variable would overflow and cause an infinite loop; see also this comment by CAD1997. These checks also result in fewer optimization opportunities.
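To make the extra bookkeeping concrete, here is a simplified sketch (my own illustration, not the actual std implementation, which uses `RangeInclusive` with a private exhausted flag) of why `1..=n` cannot simply increment until the counter passes the bound:

```rust
// Simplified model of an inclusive range iterator. The upper bound
// itself must be yielded, and incrementing past it could overflow,
// so an extra flag is needed that an exclusive range doesn't have.
struct InclusiveRange {
    current: u64,
    end: u64,
    exhausted: bool, // extra state checked on every iteration
}

impl Iterator for InclusiveRange {
    type Item = u64;

    fn next(&mut self) -> Option<u64> {
        if self.exhausted || self.current > self.end {
            return None;
        }
        let value = self.current;
        if value == self.end {
            // `current + 1` would overflow when `end == u64::MAX`,
            // so mark the range exhausted instead of incrementing.
            self.exhausted = true;
        } else {
            self.current += 1;
        }
        Some(value)
    }
}

fn main() {
    let range = InclusiveRange { current: 1, end: 5, exhausted: false };
    assert_eq!(range.sum::<u64>(), 15); // yields 1, 2, 3, 4, 5

    // The edge case the flag guards against:
    let edge = InclusiveRange { current: u64::MAX, end: u64::MAX, exhausted: false };
    assert_eq!(edge.count(), 1); // terminates instead of looping forever
}
```

The per-iteration check of this extra state is what shows up in the assembly below as additional comparisons.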
Rust
pub fn exclusive(upper_limit: u64) -> u64 {
    let mut sum = 0;
    for _ in 1..(upper_limit + 1) {
        sum += std::hint::black_box(1);
    }
    sum
}

pub fn inclusive(upper_limit: u64) -> u64 {
    let mut sum = 0;
    for _ in 1..=upper_limit {
        sum += std::hint::black_box(1);
    }
    sum
}
Assembly
Note how the inclusive loop compares against the upper bound twice per iteration and advances the counter with a carry-based increment (cmp/adc) to avoid overflowing it, while the exclusive loop simply decrements a counter and tests for zero.
example::exclusive:
        lea     rax, [rdi - 1]
        cmp     rax, -3
        ja      .LBB0_1
        xor     eax, eax
        lea     rcx, [rsp - 8]
.LBB0_4:
        mov     qword ptr [rsp - 8], 1
        add     rax, qword ptr [rsp - 8]
        dec     rdi
        jne     .LBB0_4
        ret
.LBB0_1:
        xor     eax, eax
        ret
example::inclusive:
        test    rdi, rdi
        je      .LBB1_1
        mov     ecx, 1
        xor     eax, eax
        lea     rdx, [rsp - 8]
.LBB1_4:
        mov     rsi, rcx
        cmp     rcx, rdi
        adc     rcx, 0
        mov     qword ptr [rsp - 8], 1
        add     rax, qword ptr [rsp - 8]
        cmp     rsi, rdi
        jae     .LBB1_2
        cmp     rcx, rdi
        jbe     .LBB1_4
.LBB1_2:
        ret
.LBB1_1:
        xor     eax, eax
        ret
Benchmark
The benchmark was done using Criterion:
fn benchmark(c: &mut criterion::Criterion) {
    let mut group = c.benchmark_group("Iteration");
    for i in &[256, 512, 1024, 2048, 4096, 8192] {
        group.bench_with_input(
            criterion::BenchmarkId::new("Exclusive", i),
            i,
            |b, i| b.iter(|| exclusive(*i)),
        );
        group.bench_with_input(
            criterion::BenchmarkId::new("Inclusive", i),
            i,
            |b, i| b.iter(|| inclusive(*i)),
        );
    }
    group.finish();
}

criterion::criterion_group!(benches, benchmark);
criterion::criterion_main!(benches);
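For completeness, a minimal Criterion setup looks something like this (the version number and bench file name are illustrative, not from the post):

```toml
# Cargo.toml (illustrative; adjust the Criterion version as needed)
[dev-dependencies]
criterion = "0.5"

[[bench]]
name = "iteration"   # expects benches/iteration.rs containing the code above
harness = false      # disable the default harness so Criterion can take over
```

The benchmarks then run with `cargo bench`.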
And the output:
As you can see, the exclusive range version performs about twice as fast as the inclusive range version.
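One possible mitigation (my own sketch, not from the linked post): drive the inclusive range with internal iteration such as `map` + `sum`, which goes through the range's `fold`/`try_fold` implementation. `RangeInclusive` has specialized internal iteration that handles the bound once, which is often reported to optimize better than an external `for` loop; verify with your own benchmarks before relying on it.

```rust
use std::hint::black_box;

// Same workload as `inclusive`, but expressed via internal iteration
// instead of an external `for` loop over the inclusive range.
pub fn inclusive_fold(upper_limit: u64) -> u64 {
    (1..=upper_limit).map(|_| black_box(1)).sum()
}

fn main() {
    assert_eq!(inclusive_fold(5), 5);
    assert_eq!(inclusive_fold(0), 0);
}
```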