3/8/2026 at 5:47:20 AM
The performance observation is real but the two approaches are not equivalent, and the article doesn't mention what you're actually trading away, which is the part that matters.
The C++11 thread-safety guarantee on static initialization is explicitly scoped to block-local statics. That's not an implementation detail, that's the guarantee.
The __cxa_guard_acquire/release machinery in the assembly is the standard fulfilling that contract. Move to a private static data member and you're outside that guarantee entirely. You've quietly handed that responsibility back to yourself.
Then there's the static initialization order fiasco, which is the whole reason the Meyers singleton with a local static became canonical. A block-local static initializes on first use: lazily, deterministically, and thread-safely. A static data member initializes at startup, in an order that is unspecified across translation units. If anything in a different TU touches Instance() during its own static initialization, you're in UB territory. The article doesn't mention this.
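A minimal sketch of the two variants being compared, to make the difference concrete. The `Logger` and `EagerLogger` names and their members are hypothetical, chosen only for illustration; the article's actual class is not reproduced here.

```cpp
#include <string>

// Hypothetical type, for illustration only.
class Logger {
public:
    // Block-local static (Meyers singleton): constructed on the first call.
    // Since C++11 the initialization runs exactly once, even if several
    // threads reach this line concurrently.
    static Logger& Instance() {
        static Logger instance;
        return instance;
    }

    void Log(const std::string& msg) { last_ = msg; ++count_; }
    int Count() const { return count_; }
    const std::string& Last() const { return last_; }

    Logger(const Logger&) = delete;
    Logger& operator=(const Logger&) = delete;

private:
    Logger() = default;
    std::string last_;
    int count_ = 0;
};

// Static-data-member variant, for contrast.
class EagerLogger {
public:
    // No guard check here: the object is assumed to be constructed already.
    // That assumption fails if another translation unit's static
    // initializer calls Instance() before instance_ has been constructed.
    static EagerLogger& Instance() { return instance_; }

    void Log() { ++count_; }
    int Count() const { return count_; }

private:
    // Deliberately not constexpr, so instance_ is dynamically initialized.
    EagerLogger() { count_ = 0; }
    static EagerLogger instance_;
    int count_;
};

// Dynamically initialized, typically before main() starts; its order
// relative to statics defined in other translation units is unspecified.
EagerLogger EagerLogger::instance_;
```

The first variant pays for a guard check on each call to `Instance()`; the second avoids it by giving up lazy, order-independent construction.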
Real-world singleton designs also need deferred or configuration-driven initialization, optional instantiation, state recycling, and controlled teardown. A block-local static keeps those doors open. A static data member initializes unconditionally at startup: you've lost lazy init, you've lost the option not to initialize it, and configuration-based instantiation becomes awkward by design.
Honestly, if you're bottlenecking on singleton access, that's design smell worth addressing, not the guard variable.
by halayli
3/8/2026 at 7:30:48 AM
> Honestly, if you're bottlenecking on singleton access, that's design smell worth addressing, not the guard variable.
There's a large group of engineers who are totally unaware of Amdahl's law, and they are consequently obsessed with the performance implications of what are usually the least important parts of the codebase.
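For reference, Amdahl's law can be written as below, where p is the fraction of total runtime spent in the part being optimized and s is the speedup of that part. The 1% figure is an illustrative assumption, not a measurement from the article.

```latex
S(s) = \frac{1}{(1 - p) + \frac{p}{s}}, \qquad
\lim_{s \to \infty} S(s) = \frac{1}{1 - p}
% Illustrative: with p = 0.01, the overall speedup is at most
% 1 / 0.99 \approx 1.0101, i.e. about 1%, however fast that part becomes.
```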
I learned that being in the opposite group of people became (or maybe always has been) somewhat unpopular, because it breaks many of the myths that we have been taught for years and on top of which many people have built their careers. This article may or may not be an example of that. I am not reading too much into it, but profiling and identifying the actual bottlenecks seems like a scarce skill nowadays.
by menaerus
3/8/2026 at 9:22:17 AM
You leveled up past a point a surprising number of people get stuck on, essentially.
I feel like the mindset you are describing is kind of this intermediate senior level. Sadly a lot of programmers can get stuck there for their whole career. Even worse when they get promoted to staff/principal level and start spreading dogma.
I 100 percent agree. If you can't show me a real world performance difference you are just spinning your wheels and wasting time.
by PacificSpecific
3/9/2026 at 10:02:32 AM
Yes, I agree, and my experience is the same: there are just too many folks getting stuck in that mindset and never leaving it. Looking into the history, I think the software engineering domain has a lot of cargo-culting, which is somewhat surprising given that people who are naturally attracted to this domain are supposed to be critical thinkers. It turns out that this may not be true most of the time. I know that I also fell afoul of that, but I learned my lesson.
by menaerus
3/8/2026 at 3:12:59 PM
On the flip side, it’s easy to get a bit stuck down the road by the mere fact that you have a singleton. Maybe you have amazing performance and very carefully managed safety, but you still have a single object that is inherently shared by all users in the same process, and it’s very very easy to end up regretting the semantic results. Been there, done that.
by amluto
3/8/2026 at 11:02:59 AM
Worse, while shipping Electron crap is the other extreme, not everything needs to be written to fit into 64 KB or a 16 ms rendering frame.
Many times taking a few extra ms, or God forbid 1 s, is more than acceptable when there are humans in the loop.
by pjmlp
3/8/2026 at 12:02:06 PM
Agreed. Strong emphasis on "profiling and identifying the actual bottleneck". Every benchmark will show a nested stack of performance offenders, but a solid interpretation requires a much deeper understanding of systems in general. My biggest aha moment years ago was when I realized that removing the function I was trying to optimize would still result in a benchmark output that shows top offenders. Without going into too many details, that minor perspective shift ended up paying dividends, as it helped me rebuild my understanding of what benchmarks tell us.
by halayli
3/9/2026 at 9:52:05 AM
Yeah ... and so it happens that this particular function in the profile is just a symptom, a single observed data point of system behavior under a given workload, and not the root cause of, let's say, a load instruction burning 90% of the CPU cycles while waiting on data from memory. It consequently gives you the wrong clue about the actual code creating that memory bus contention.
I have to say that until I had a pretty good understanding of CPU internals, the memory subsystem, the kernel, and the hardware in general, reading perf profiles was just a fun exercise that gave me almost no meaningful results.
by menaerus
3/8/2026 at 1:21:27 PM
> Then there's the static initialization order fiasco
One of the reasons I hate constructors and destructors.
Explicit init()/deinit() functions are much better.
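A rough sketch of what explicit init/deinit could look like in C++, assuming a hypothetical `Service` type configured from a hypothetical `Config` struct. Nothing is constructed until `Init()` runs, and teardown happens at a point the caller chooses. This version is not thread-safe; it assumes `Init()` and `Deinit()` are called from a single thread, for example at the top and bottom of `main()`.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical configuration type, for illustration only.
struct Config {
    std::string name;
};

class Service {
public:
    explicit Service(const Config& cfg) : name_(cfg.name) {}

    // Construct the instance from runtime configuration.
    static void Init(const Config& cfg) {
        assert(!Slot().has_value() && "Init() called twice");
        Slot().emplace(cfg);
    }

    // Destroy the instance at a point the caller controls.
    static void Deinit() { Slot().reset(); }

    static bool IsInitialized() { return Slot().has_value(); }

    static Service& Get() {
        assert(Slot().has_value() && "Get() called before Init()");
        return *Slot();
    }

    const std::string& Name() const { return name_; }

private:
    // The storage is a block-local static, so the empty optional exists
    // before any caller can observe it; the Service inside it is only
    // constructed by Init().
    static std::optional<Service>& Slot() {
        static std::optional<Service> slot;
        return slot;
    }

    std::string name_;
};
```

A caller would typically run `Service::Init(Config{"primary"})` early in `main()`, use `Service::Get()` afterwards, and call `Service::Deinit()` before exit.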
by cv5005
3/8/2026 at 12:40:34 PM
The fact that he calls the generated code good/bad without discussing the semantic differences suggests that the original author doesn't really know what he is talking about. That seems problematic to me, as he is selling an online C++ course.
by Rexxar
3/8/2026 at 6:12:57 AM
[dead]
by alex_dev42
3/8/2026 at 6:38:45 AM
Yes, definitely not dismissing the lock overhead, but I wanted to bring attention to the implicit false equivalence made in the post. That said, I am surprised the lock check was showing up and not the logging/formatting functions.
by halayli
3/8/2026 at 6:09:54 AM
[flagged]
by csegaults
3/8/2026 at 6:28:29 AM
A real human. Threads can exist before main() starts. For example, you can include another TU which happens to launch a thread and call Instance(). Singletons used to be a headache before C++11, and it was common (maybe still is) to see macros in projects that expand to a singleton class definition to avoid common pitfalls.
by halayli
3/8/2026 at 8:46:43 AM
In fact, Windows 10+ now uses a thread pool during process init well before main is reached.
https://web.archive.org/web/20200920132133/https://blogs.bla...
by MaulingMonkey
3/8/2026 at 6:14:08 AM
It's a bit contrived, but a global with a nontrivial constructor can spawn a thread that uses another global, and without synchronization the thread can see an uninitialized or partially initialized value.
by platinumrad
3/8/2026 at 7:47:35 AM
[flagged]
by jibal