Trust your compiler: Modern C++

7/5/2026 at 1:52:42 PM

Every time I see "use ranges and algorithms!" examples, I am baffled that apparently, I am supposed to find

    inline double algorithm_call(std::span<double const> xs) noexcept {
        return std::accumulate(
            xs.begin(),
            xs.end(),
            0.0,
            [](double acc, double volts) {
                auto mv  = calibrated_mv(volts);
                auto err = residual(mv);
                return weighted_square(err) + acc;
        });
    }

more readable, concise, and easier on my eyes than

    inline double raw_loop(std::span<double const> xs) noexcept {
        double sum = 0.0;

        for (double volts : xs) {
            auto mv  = calibrated_mv(volts);
            auto err = residual(mv);
            sum += weighted_square(err);
        }

        return sum;
    }

Sure, there are some algorithms in <algorithm> that I'd rather not reimplement myself, but this one is not it.

by Joker_vD

7/5/2026 at 4:19:20 PM

You said "ranges and algorithms", but you didn't copy the third function, which actually uses the <ranges> library.

    inline double ranges_pipeline(std::span<double const> xs) noexcept {
        auto costs = xs
            | std::views::transform(calibrated_mv)
            | std::views::transform(residual)
            | std::views::transform(weighted_square);

        return std::ranges::fold_left(costs, 0.0, std::plus<double>{});
    }

It's still a bit verbose, because C++ doesn't allow uniform function call syntax. It would be even more concise in other languages like D.

by Erlangen

7/5/2026 at 5:28:00 PM

That version was so much more opaque that I didn't bother copying that. Again, I'm not entirely sure why people are so enamored with splitting iteration itself from the contents of one iteration step, especially since the loops are language built-ins.

by Joker_vD

7/5/2026 at 2:07:18 PM

The first form is easier to send to 32 beefy cores or 1024 small CPUs or a Beowulf cluster or a GPU or people sitting in a room.

by rzzzt

7/5/2026 at 2:29:52 PM

Both of them have to be completely rewritten to make use of multiprocessing, so what exactly is the advantage?

by xyzzyz

7/5/2026 at 2:47:52 PM

The original example isn't really using ranges except to emulate C++98 iterator work though.

The actual equivalent might be something closer to:

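    inline double algorithm_call(std::span<double const> xs) noexcept {
        // std::accumulate only takes iterator pairs; the range-based
        // left fold is C++23's std::ranges::fold_left.
        return std::ranges::fold_left(
            xs, 0.0,
            [](double acc, double volts) {
                auto mv  = calibrated_mv(volts);
                auto err = residual(mv);
                return weighted_square(err) + acc;
            });
    }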

(that is, without the boilerplate .begin and .end).

Even that is enough to make ranges useful in my mind, but in a codebase which has started to integrate some functional programming techniques, there are also applications for things like views and transforms.

This can make it easier to reason about iteration pipelines in ways you might already be familiar with from POSIX.

That all said, it's C++ so sometimes the error messages get a lot more 'interesting' than they would have with STL-style iterators, especially when mixed with constexpr expressions as you might do with std::format or fmt libs.

by mpyne

7/5/2026 at 2:38:54 PM

The first one too? Isn't that the map-reduce fork-join golden example of multiprocessing?

by rzzzt

7/5/2026 at 2:47:10 PM

`std::accumulate` is defined to have sequential semantics, so the analysis required to make it parallel is probably not that different than starting from the loop version. I guess you could have an alternate `accumulate_associative` that uses the same interface but assumes the reduction is associative and has unspecified evaluation order?

by cwzwarich

7/5/2026 at 2:51:45 PM

C++ has std::reduce for that, which is std::accumulate except it's defined to operate without any specific ordering.
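A minimal sketch of the difference, assuming only the C++17 standard library. The names `sum_in_order` and `sum_any_order` are made up for illustration, and the functions take a std::vector rather than a std::span just so the snippet builds as C++17:

```cpp
#include <numeric>
#include <vector>

// Strict left-to-right fold: ((0.0 + x0) + x1) + ...
inline double sum_in_order(const std::vector<double>& xs) {
    return std::accumulate(xs.begin(), xs.end(), 0.0);
}

// Same interface, but the grouping and order of the additions are
// unspecified. Floating-point + is commutative but not exactly
// associative, so the last bits may differ from the in-order sum.
inline double sum_any_order(const std::vector<double>& xs) {
    return std::reduce(xs.begin(), xs.end(), 0.0);
}
```

There are also overloads that take an execution policy such as std::execution::par (from <execution>) as the first argument. Whether those actually run on several threads depends on the standard library and how it was built.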

by mpyne

7/5/2026 at 5:35:33 PM

And now you should probably also stop and consider whether adding elements one-by-one as opposed to recursively adding together sums of smaller subarrays has better or worse numerical behaviour in regards to e.g. rounding and stability.
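To make the two orderings concrete, here is a rough sketch of both; `naive_sum` and `pairwise_sum` are hypothetical names, and the base-case size of 8 is an arbitrary choice. The usual textbook result is that the worst-case rounding error bound grows roughly with n for one-by-one summation and roughly with log n for the pairwise version:

```cpp
#include <cstddef>
#include <vector>

// One-by-one summation: every element is added to a single running total.
inline double naive_sum(const double* p, std::size_t n) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) sum += p[i];
    return sum;
}

// Pairwise summation: split the array in half, sum each half recursively,
// then add the two partial sums. Small chunks fall back to the plain loop.
inline double pairwise_sum(const double* p, std::size_t n) {
    if (n <= 8) return naive_sum(p, n);  // arbitrary base-case size
    const std::size_t half = n / 2;
    return pairwise_sum(p, half) + pairwise_sum(p + half, n - half);
}
```

Both return the same value whenever every intermediate sum is exactly representable; they only diverge once rounding kicks in.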

by Joker_vD

7/5/2026 at 3:11:39 PM

Thanks everyone, my C++ knowledge has been greatly expanded today.

by rzzzt

7/5/2026 at 2:50:57 PM

std::accumulate is sequential and guarantees in-order traversal. std::reduce is the parallelizable version of it.

by CITIZENDOT

7/5/2026 at 2:49:58 PM

1) afaik accumulate cannot be parallelized

2) the map part is included in the accumulate lambda, so the map part cannot be parallelized either -> you'd have to split it out into a transform step (iirc)
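Something like this is what I mean by splitting the map out, as a sketch only. The three helper bodies are placeholders (the thread never shows the real ones), `cost` and `map_then_reduce` are made-up names, and it takes a std::vector so it builds as C++17:

```cpp
#include <functional>
#include <numeric>
#include <vector>

// Placeholder stand-ins for the helpers used in the thread's examples.
inline double calibrated_mv(double volts) { return volts * 1000.0; }
inline double residual(double mv) { return mv - 1000.0; }
inline double weighted_square(double err) { return 0.5 * err * err; }

// The "map" step on its own: one input element in, one cost out.
inline double cost(double volts) {
    return weighted_square(residual(calibrated_mv(volts)));
}

// The "reduce" step is now a plain + with unspecified grouping, which is
// what leaves an implementation free to split the work.
inline double map_then_reduce(const std::vector<double>& xs) {
    return std::transform_reduce(xs.begin(), xs.end(), 0.0,
                                 std::plus<>{}, cost);
}
```

As with std::reduce, there is an overload taking an execution policy as the first argument; this sketch uses the policy-free one.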

by tcfhgj

7/5/2026 at 5:32:31 PM

It's been 15 years since I've last touched OpenMP, but the second form is trivially parallelizable as well. Besides, this parallelization can only ever properly work with arrays/vectors or, at the very worst, std::deque as it's usually implemented (a vector of fixed-length arrays), not with e.g. linked lists or red-black trees, so why even bother with generic spans and algorithms?
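From memory, so treat it as a sketch: the loop version with an OpenMP reduction clause would look roughly like this. `cost` is a placeholder for the calibrated_mv/residual/weighted_square chain, and `raw_loop_omp` is a made-up name:

```cpp
#include <cstddef>
#include <vector>

// Placeholder for the per-element work in the thread's example.
inline double cost(double volts) {
    const double err = volts * 1000.0 - 1000.0;
    return 0.5 * err * err;
}

inline double raw_loop_omp(const std::vector<double>& xs) {
    double sum = 0.0;
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(xs.size());

    // With OpenMP enabled (e.g. -fopenmp on GCC/Clang) each thread keeps a
    // private partial sum and the partials are combined at the end.
    // Without it the pragma is ignored and the loop runs sequentially.
    #pragma omp parallel for reduction(+ : sum)
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        sum += cost(xs[static_cast<std::size_t>(i)]);
    }
    return sum;
}
```

Note that the indexed loop is exactly why this wants contiguous, random-access storage.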

by Joker_vD

7/5/2026 at 6:29:54 PM

For compilation?

by never_inline

7/5/2026 at 3:57:42 PM

Great, now use some functions. From the library or your own, and see this complexity become manageable.

That's what abstraction is about.

by fooker

7/5/2026 at 4:02:16 PM

Don't trust your compiler. Your code is only fast if you're lucky.

https://tiki.li/blog/lucky_code.html

by chrka

7/5/2026 at 5:48:54 PM

I agree you can't trust your compiler, but you can control its behavior more reliably with __builtin_expect_with_probability
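A small sketch of how that builtin is typically wrapped, assuming GCC or Clang. The macro name `EXPECT_TRUE_P`, the function `count_non_ascii`, and the 0.99 probability are all made up for illustration; the hint only nudges branch weights and code layout, it doesn't guarantee any particular codegen:

```cpp
#include <cstddef>

// Use the builtin where the compiler advertises it, otherwise fall back
// to the bare condition so the code still builds everywhere.
#if defined(__has_builtin)
#  if __has_builtin(__builtin_expect_with_probability)
#    define EXPECT_TRUE_P(cond, p) \
         __builtin_expect_with_probability(!!(cond), 1, (p))
#  endif
#endif
#ifndef EXPECT_TRUE_P
#  define EXPECT_TRUE_P(cond, p) (!!(cond))  // fallback: no hint
#endif

// Hypothetical hot loop over a buffer that is assumed to be almost all ASCII.
inline std::size_t count_non_ascii(const unsigned char* p, std::size_t n) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        // Tell the compiler the ASCII case is taken about 99% of the time.
        if (EXPECT_TRUE_P(p[i] < 0x80, 0.99)) continue;
        ++count;
    }
    return count;
}
```

The probability argument has to be a compile-time constant between 0 and 1.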

https://github.com/protocolbuffers/protobuf/commit/9f29f02a3...

by charleslmunger

7/5/2026 at 12:57:20 PM

Trust the compiler - sure - but we can't change the whole program by using -ffast-math, unfortunately, so that particular one is out.
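One workaround that stays local to a single function (my own substitution, not something the article proposes) is to spell out the reassociation by hand with several independent accumulators. A sketch, with the made-up name `sum_four_lanes`:

```cpp
#include <cstddef>
#include <vector>

// Four independent running sums give the compiler four separate
// dependency chains, which it may map onto SIMD lanes without needing
// permission to reorder floating-point additions itself.
inline double sum_four_lanes(const std::vector<double>& xs) {
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    const std::size_t n = xs.size();
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += xs[i];
        s1 += xs[i + 1];
        s2 += xs[i + 2];
        s3 += xs[i + 3];
    }
    double sum = (s0 + s1) + (s2 + s3);
    for (; i < n; ++i) sum += xs[i];  // leftover tail elements
    return sum;
}
```

The result can differ in the last bits from a plain left-to-right sum, but the reordering is written in the source, where a reviewer can see it.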

by kzrdude

7/5/2026 at 3:18:10 PM

I like the Rust approach of adding operations like `algebraic_add` instead of supporting a compiler flag. This avoids undefined behaviour and keeps the complications from optimizations localized to code using these.

https://doc.rust-lang.org/std/primitive.f32.html#algebraic-o...

> Algebraic operators of the form a.algebraic_*(b) allow the compiler to optimize floating point operations using all the usual algebraic properties of real numbers – despite the fact that those properties do not hold on floating point numbers. This can give a great performance boost since it may unlock vectorization.

> The exact set of optimizations is unspecified but typically allows combining operations, rearranging series of operations based on mathematical properties, converting between division and reciprocal multiplication, and disregarding the sign of zero. This means that the results of elementary operations may have undefined precision, and “non-mathematical” values such as NaN, +/-Inf, or -0.0 may behave in unexpected ways, but these operations will never cause undefined behavior.

> Because of the unpredictable nature of compiler optimizations, the same inputs may produce different results even within a single program run. Unsafe code must not rely on any property of the return value for soundness. However, implementations will generally do their best to pick a reasonable tradeoff between performance and accuracy of the result.

by CodesInChaos

7/5/2026 at 4:10:51 PM

I appreciate the semantics and locality of that, too. When you glance at it, you understand that specific tradeoffs are happening right here, and here only, without some CLI arg changing them for the entire program. It’s kinda like unsafe, but for math.

by kstrauser

7/5/2026 at 1:28:35 PM

I really dislike the complexity of modern C++ language specs, but does it obscure much detail about FP ops?

TL;DR:

A vast majority of the programmers I've worked with don't understand the nuances of FP in general, nor the various extents of IEEE-754 support in different programming languages.

So for important numerical programming, I think clarity regarding the FP operations being performed can be crucial. I'm just unclear if modern C++ is a significant factor for that.

by CoastalCoder

7/5/2026 at 1:17:37 PM

> Virtual vs static polymorphism

> std::visit over std::variant<A, B, C> is lowered to a switch over the active alternative.

> In this case, layout is probably doing more work than the dispatch mechanism itself.

Very likely because last time I checked visit lowers to a virtual call.
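For anyone who wants to check on their own toolchain, a minimal test case would be something like the following. The payloads and the `Weight` visitor are made up; A, B, C just mirror the article's names. How the visit is lowered varies between standard libraries and optimization levels, so the generated assembly is the thing to look at:

```cpp
#include <variant>

// Hypothetical alternatives; only the shape of the variant matters here.
struct A { int value; };
struct B { int value; };
struct C { int value; };
using Node = std::variant<A, B, C>;

// One overload per alternative; std::visit picks the one matching the
// alternative that is active at run time.
struct Weight {
    int operator()(const A& a) const { return a.value; }
    int operator()(const B& b) const { return 2 * b.value; }
    int operator()(const C& c) const { return 3 * c.value; }
};

inline int weight(const Node& n) {
    return std::visit(Weight{}, n);
}
```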

by mike_hock

7/5/2026 at 3:04:35 PM

Unremarked: debug build perf, perf-stability against minor edits, build-time bloat when heavily using std templates.

by mwkaufma

7/5/2026 at 1:44:07 PM

> exceptions are slow

There are proposals to introduce better exceptions into C++. Like this: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p07....

But until that is in the standard, people should use std::expected instead.

by Panzerschrek

7/5/2026 at 1:02:55 PM

I’ve seen some terrible horrid nonsense from them and even the best compilers don’t use a third of the opcodes our modern CPUs boast of. Nobody understands the big compilers any more either, they’re all too huge. And soon AI will be “improving” them too.

You want to see a beautiful compiler? Look at Plan 9’s compiler suite. A man could understand and even build on that.

by Glandalf

7/5/2026 at 2:46:10 PM

> even the best compilers don’t use a third of the opcodes our modern CPUs boast of

That’s not necessarily an indication of the weakness of compilers. It also could be an indication that hardware designers could leave out instructions.

X86, in particular, will have lots of them for backwards compatibility reasons (extreme example: the old 80-bit x87 FP stack)

There also are instructions that are expected to never get used by ‘normal’ compilers but cannot be removed because they only make sense in lower-level code such as those for switching between protection levels, implementing compare-and-swap, etc.

by Someone

7/5/2026 at 3:24:24 PM

x87 support may not be the most obscure part of the instruction set. There is also hardware support for BCD math in 16-bit and 32-bit mode. Who uses that anymore?

by gmueckl

7/5/2026 at 4:22:55 PM

Unfortunately, some exchanges (TWSE) use packed BCD encoding.

by ks6g10

7/5/2026 at 2:36:41 PM

How does the resulting code compare to what a modern compiler gives me? I don't maintain compilers for a living, I maintain other code, which is ultimately longer and more complex than a C++ compiler. And so if my compiler, by becoming a little bit more complex, can make my resulting code a lot simpler because I don't have to do inline optimizations of various sorts, that makes my life much easier and is a good trade-off, since there are a lot more programs in the world than there are compilers.

by bluGill

7/5/2026 at 12:48:13 PM

Are you a fool?

Another name for compilers: invisible backdoor injectors. The more complex the syntax, the more likely it is to happen... I'll let you guess how the "sane" syntax of C++ and similar (LOL) fits in here...

by sylware

7/5/2026 at 6:34:01 PM

DJ Bernstein seems to agree with you: https://blog.cr.yp.to/20240803-clang.html

by never_inline

7/5/2026 at 1:28:41 PM

Quite funny comment on the vibe coding age.

by pjmlp

7/5/2026 at 2:39:44 PM

Quit poking at the OpenBSD maintainers. Jokes aside (I mean, maybe they are one, I don't know), it is at least a coherent opinion that inherently complex but critical software infrastructure would ideally be kept as simple and understandable as possible, with all the correctness and verification apparatus staying out of the way so you can see what is there to be backdoored. I use Rust primarily and like using it, but there are well over a hundred crates just in the front end, and LLVM isn't simple. I do miss the days when I could know what each line did.

by galangalalgol

7/5/2026 at 2:32:45 PM

And yours is in no way related to mine...

by sylware

7/5/2026 at 2:34:01 PM

Complaining about C++ compilers, given the increasing amount of vibe-coded garbage and related hallucinations, certainly is.

by pjmlp

7/5/2026 at 2:37:35 PM

Oh!

You meant it is even worse nowadays with vibe coding! My bad.

by sylware

7/5/2026 at 2:59:21 PM

What has complex code got to do with it?

Trusting trust was based on old C. You don't get much more minimal than that.

by benj111