alt.hn

3/31/2025 at 7:36:56 AM

How to Secure Existing C and C++ Software Without Memory Safety [pdf]

https://arxiv.org/abs/2503.21145

by aw1621107

3/31/2025 at 2:35:15 PM

I think this paper overestimates the benefit of what I call isoheaps (partitioning allocations by type). I wrote the WebKit isoheap implementation so it’s something I care about a lot.

Isoheaps can mostly neutralize use after free bugs. But that’s all they do. Moreover they don’t scale well. If you isoheap a select set of stuff then it’s fine, but if you try to deploy isoheaps to every allocation you get massive memory overhead (2x or more) and substantial time overhead too. I know because I tried that.
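
A minimal sketch of the idea (one allocation pool per type, using a trivial free list; WebKit's real implementation is of course far more involved than this):

  #include <cstdlib>

  template <typename T>
  class IsoHeap {
  public:
      void* allocate() {
          if (free_list_) {                          // reuse memory only within this type's pool
              void* p = free_list_;
              free_list_ = *static_cast<void**>(p);
              return p;
          }
          return std::malloc(sizeof(T) < sizeof(void*) ? sizeof(void*) : sizeof(T));
      }
      void deallocate(void* p) {
          // A freed slot never migrates to another type's pool, so a dangling T*
          // can only ever alias another T; that is what blunts most use-after-free abuse.
          *static_cast<void**>(p) = free_list_;
          free_list_ = p;
      }
  private:
      void* free_list_ = nullptr;
  };

  // One pool per type: a stale Widget* can never end up pointing into, say, a String.
  template <typename T>
  IsoHeap<T>& iso_heap_for() { static IsoHeap<T> h; return h; }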

If an attacker finds a type confusion or heap buffer overflow then isoheaps won’t prevent the attacker from controlling heap layout. All it takes is that they can confuse an int with a ptr and game over. If they can read ptr values as ints then they can figure out how the heap is laid out (no matter how weirdly you laid it out). If they can also write ptr values as ints then they control the whole heap. At that point it doesn’t even matter if you have control flow integrity.

To defeat attackers you really need some kind of 100% solution where you can prove that the attacker can’t use a bug with one pointer access to control the whole heap.

by pizlonator

3/31/2025 at 6:31:52 PM

Yes, having coded quite a few years in C++ (on the Firefox codebase) before migrating to Rust, I believe that many C and C++ developers mistakenly assume that

1/ memory safety is the unachievable holy grail of safety;

2/ there is a magic bullet somewhere that can bring about the benefits of memory safety, without any of the costs (real or expected).

In practice, the first assumption is wrong because memory-safety is just where safety _starts_. Once you have memory-safety and type-safety, you can start building stuff. If you have already expended all your cognitive budget on reaching this point, you have lost.

As for the magic bullets, all those I've seen suggested are of the better-than-nothing variety rather than the it-just-works variety they're often touted as. Doesn't mean that there won't ever be a solution, but I'm not holding my breath.

And of course, I've seen people claim more than once that AI will solve code safety & security. So far, that's not quite what's written on the wall.

by Yoric

4/1/2025 at 8:43:09 AM

Well, GC is very close to that magic bullet (by comparison, spatial safety via bounds checking is easy). It does have some costs of course, especially in a language like C++ that is GC-hostile.

by gpderetta

4/1/2025 at 10:25:07 AM

C++ isn't hostile toward garbage collection — it's more the programmers using C++ who are. C++ is the only language that can have an optional, totally pause-less, concurrent GC engine (SGCL). No other programming language, not even Java, offers such a collector.

by pebal

4/1/2025 at 2:43:39 PM

This is false.

Lots of pauseless concurrent GCs have shipped for other languages. SGCL is not special in that regard. Worse, SGCL hasn’t been shown to actually avoid disruptions to program execution while the shipping concurrent GCs for Java and other languages have been shown to really avoid disruptions.

(I say disruptions, not pauses, because avoiding “pauses” where the GC “stops” your threads is only the first step. Once you tackle that you have to fix cases of the GC forcing the program to take bursts of slow paths on pointer access and allocation.)

SGCL is a toy by comparison to other concurrent GCs. For example, it has hilariously complex pointer access costs that serious concurrent GCs avoid.

by pizlonator

4/1/2025 at 6:32:11 PM

There isn’t a single truly pause-less GC for Java — and I’ve already proven that to you before. If such a GC exists for any other language, name it.

And no, SGCL doesn’t introduce slow paths, because mutators never have to synchronize with the GC. Pointer access is completely normal — unlike in other languages that rely on mechanisms like read barriers.

Loosen up, seriously.

by pebal

4/1/2025 at 7:30:28 PM

> There isn’t a single truly pause-less GC for Java — and I’ve already proven that to you before. If such a GC exists for any other language, name it.

You haven't proven that. If you define "pause" as "the world stops", then no, state of the art concurrent GCs for Java don't have that. If you define "pause" as "some thread might sometimes take a slow path due to memory management" then SGCL has those, as do most memory management implementations (including and especially malloc/free).

> And no, SGCL doesn’t introduce slow paths, because mutators never have to synchronize with the GC. Pointer access is completely normal — unlike in other languages that rely on mechanisms like read barriers.

The best concurrent GCs have no read barriers, only extremely cheap write barriers.

You have allocation slow paths, at the very least.
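
To make "extremely cheap write barrier" concrete, here is a generic card-marking sketch of the kind of barrier many generational/concurrent collectors use; it is illustrative only (the card size and table are invented for the example) and is not any particular collector's implementation:

  #include <cstddef>
  #include <cstdint>

  struct Object;                                       // opaque GC-managed object

  constexpr std::size_t kCardShift = 9;                // 512-byte cards, arbitrary here
  static std::uint8_t card_table[1 << 20];             // one dirty byte per card

  inline void write_ref(Object* holder, Object** field, Object* value) {
      *field = value;                                  // the store itself
      // Mark the holder's card dirty so the collector rescans it later; the
      // mutator-side cost of the barrier is a single extra byte store.
      // (A real collector biases the index by the heap base address.)
      card_table[reinterpret_cast<std::uintptr_t>(holder) >> kCardShift] = 1;
  }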

by pizlonator

4/1/2025 at 8:14:12 PM

First, there are no Java GCs that completely eliminate stop-the-world pauses. ZGC and Shenandoah reduce them to very short, sub-millisecond windows — but they still exist. Even the most concurrent collectors require STW phases for things like root scanning, final marking, or safepoint synchronization. This is documented in OpenJDK sources, benchmarks, and even in Oracle’s own whitepapers. Claiming Java has truly pause-less GC is simply false.

Second, you’re suggesting there are moving GCs that don’t use read barriers and don’t stop mutator threads at all. That’s technically implausible. Moving collectors by definition relocate objects, and unless you stop the world or have some read barrier/hazard indirection, you can’t guarantee pointer correctness during concurrent access. You must synchronize with the mutator somehow — either via stop-the-world, read barriers, or epoch/hazard-based coordination. It’s not magic, it’s basic memory consistency.

SGCL works without moving anything. That’s why it doesn’t need synchronization, read barriers, or even slow-path allocation stalls. That’s not a limitation — that’s a design goal. You can dislike the model, but let’s keep the facts straight.

by pebal

4/1/2025 at 10:35:56 AM

It is hostile in the sense that it allows hiding and masking pointers, so it is hard to have an exact moving GC.

SGCL, as impressive as it is, AFAIK requires pointers to be annotated, which is problematic for memory safety, and I'm not sure that it is a moving GC.

by gpderetta

4/1/2025 at 11:04:17 AM

SGCL introduces the `tracked_ptr` smart pointer, which is used similarly to `shared_ptr`. The collector doesn't move data, which makes it highly efficient and — perhaps surprisingly — more cache-friendly than moving GCs.

by pebal

4/1/2025 at 2:44:36 PM

Based on what data?

Folks who make claims about the cache friendliness of copying GCs have millions of lines of credible test code that they’ve used to demonstrate that claim.

by pizlonator

4/1/2025 at 6:11:54 PM

Compaction doesn't necessarily guarantee cache friendliness. While it does ensure contiguity, object layout can still be arbitrary. True cache performance often depends on the locality of similar objects — for example, memory pools are known for their cache efficiency. It's worth noting that Go deliberately avoids compaction, which suggests there's a trade-off at play.

by pebal

4/1/2025 at 7:31:45 PM

I'm not saying that compaction guarantees cache friendliness.

I'm saying you have no evidence to suggest that not compacting is better for cache friendliness. You haven't presented such evidence.

by pizlonator

4/1/2025 at 8:03:34 PM

As I mentioned earlier, take a look at Golang. It's newer than Java, yet it uses a non-moving GC. Are you assuming its creators are intentionally making the language slower?

by pebal

4/1/2025 at 4:47:09 PM

I feel that the lack of GC is one of the key differentiators that remain to C++. If a group of C++ developers were to adopt a GC, they'd be well on their way to abandoning C++.

by Yoric

3/31/2025 at 9:10:47 AM

Short paper, so can be easily summarized. The claim is that security can be improved by these compiler and hardware assisted measures:

    - Stack Integrity
    - Control-Flow Integrity
    - Heap Data Integrity
    - Pointer Integrity and Unforgeability
They cite the deployment of these measures on recent Apple hardware as evidence of their effectiveness.

by pjc50

3/31/2025 at 9:39:30 AM

From the cited paper:

"These four types of integrity, do not establish memory safety, but merely attempt to contain the effects of its absence; therefore, attackers will still be able to change software behavior by corrupting memory."

and the paper then goes on to say, about Apple's implementation of the cited techniques:

"This intuition is borne out by experience: in part as a result of Apple’s deployment of these defenses since 2019, the incidence of RCE attacks on Apple client software has decreased significantly—despite strong attack pressure—and the market value of such attacks risen sharply."

"Decreased significantly" is not "eliminated"; indeed, you could paraphrase this as "the combination of these techniques has already been shown to be insufficient for security guarantees".

Which is not to say that these mitigations are a bad idea; but I think their benefits are significantly over-sold in the paper.

by lambdaone

3/31/2025 at 11:34:14 AM

I said this in another comment, but an easy way to measure the efficacy is to look at the economy surrounding the zero day markets. Long story short, over the years it has become increasingly more expensive to both produce/supply and acquire exploits. This is attributable to the increasing mitigations and countermeasures, which include the ones in this document.

by commandersaki

3/31/2025 at 11:52:27 AM

I don't think it is significantly oversold, although it is definitely overselling a bit:

"More disconcertingly, in some cases, attacks may be still be possible despite the above protections."

"We should prepare for a case where these defenses will fail to protect against a specific vulnerability in some specific software".

My main concern with the paper is that there is no careful analysis showing that the 4 techniques they propose are really sufficient to cover the majority of RCE exploits. Having said that, I don't dispute that having them would raise the bar for attackers a lot.

by MattPalmer1086

3/31/2025 at 12:08:41 PM

The memory protection strategies this paper argues for are fine. If we can recompile legacy software to gain better protection against stack and heap exploits that's a clear win.

As the paper points out memory safety is not a great concern on phones because applications are sandboxed. And that's correct. If an application is stuck in a sandbox it doesn't matter what that process does within its own process space. Smartphones taught us what we already knew: process isolation works.

Then the paper observes that memory safety is still a significant problem on the server. But instead of pointing out the root cause -- the absence of sandboxing -- the authors argue that applications should instead be rewritten in go or rust! This is absurd. The kernel already provides strong memory protection guarantees for each process. The kernel also provides hard guarantees for access to devices and the file system. But server software doesn't take advantage of any of these guarantees. When a server process intermixes data of multiple customers and privilege levels then any tiny programming mistake (regardless of memory safety) can result in privilege escalation or catastrophic data leaks. What use is memory safety when your go program returns the wrong user's data because of an off-by-one error? You don't need a root exploit if your process already has "root access" to the database server.

If we want to be serious about writing secure software on the server we have to start taking advantage of the process isolation the kernel provides. The kernel can enforce that a web request from user A cannot return data from user B because the process simply cannot open any files that belong to the wrong user. This completely eliminates all memory safety concerns. But today software on the server emulates what the kernel already does with threading, scheduling, and memory protection, except poorly and in userspace and without any hardware guarantees. Effectively all code runs as root in ring 0. And we're surprised that security continues to plague our industry?
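
As a concrete illustration of that kind of kernel-enforced isolation, here is a rough sketch assuming a Unix server with one system account per tenant and a worker process forked per request; the account names, file paths, and the fork-per-request model are assumptions for the example, not something the paper prescribes:

  #include <pwd.h>
  #include <sys/types.h>
  #include <sys/wait.h>
  #include <unistd.h>
  #include <cstdio>

  void handle_request_as(const char* tenant, const char* path) {
      pid_t pid = fork();
      if (pid == 0) {                                        // child: handles one request
          struct passwd* pw = getpwnam(tenant);              // hypothetical per-tenant account
          if (!pw || setgid(pw->pw_gid) != 0 || setuid(pw->pw_uid) != 0)
              _exit(1);                                      // refuse to run without dropping privileges
          // From here on the kernel enforces isolation: open() on a file owned by a
          // different tenant fails with EACCES, no matter what bugs the handler has.
          FILE* f = std::fopen(path, "r");
          if (f) { /* ... serve the request ... */ std::fclose(f); }
          _exit(f ? 0 : 2);
      }
      int status = 0;
      waitpid(pid, &status, 0);                              // parent: reap the worker
  }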

by gizmo

3/31/2025 at 12:31:06 PM

> Then the paper observes that memory safety is still a significant problem on the server. But instead of pointing out the root cause -- the absence of sandboxing -- the authors argue that applications should instead be rewritten in go or rust! This is absurd. The kernel already provides strong memory protection guarantees for each process. The kernel also provides hard guarantees for access to devices and the file system. But server software doesn't take advantage of any of these guarantees. When a server process intermixes data of multiple customers and privilege levels then any tiny programming mistake (regardless of memory safety) can result in privilege escalation or catastrophic data leaks. What use is memory safety when your go program returns the wrong user's data because of an off-by-one error? You don't need a root exploit if your process already has "root access" to the database server.

Yes, because servers are inherently multi-tenant, you can't entirely avoid the risks. Process isolation can't help you, even if you had the resources to fork off for every single request. If you have a database pool in your process, you can go and access other people's data. There is never going to be a case where having an RCE on a server isn't a serious issue.

Also, neither process isolation nor memory safety can guarantee you don't simply return the wrong data for a given customer, so that point is really neither here nor there.

(I'd also argue that memory safety clearly matters on mobile platforms still anyway. Many of the exploit chains that break kernel protections still rely on exploiting memory bugs in userland first before they can climb their way up. There's also other risks to this. Getting an RCE into someone's Signal process is an extremely dangerous event for the user.)

by jchw

3/31/2025 at 2:19:30 PM

> Also, neither process isolation nor memory safety can guarantee you don't simply return the wrong data for a given customer, so that point is really neither here nor there.

The fact that there can be security issues at the application level is no reason to add memory safety issues to them!

You may well have the resources to fork a process per customer, not have a database pool, not cache, etc. It's a trade-off with resources needed and performance.

> Also, neither process isolation nor memory safety can guarantee you don't simply return the wrong data for a given customer, so that point is really neither here nor there.

You should fix the problems you can, even if there are problems you cannot.

by graemep

3/31/2025 at 2:35:28 PM

> The fact that there can be security issues at the application level is no reason to add memory safety issues to them!

> You may well have the resources to fork a process per customer, not have a database pool, not cache, etc. It's a trade-off with resources needed and performance.

This feels like one of those weird things that wind up happening in discussions about a lot of things, including say, renewable energy. People get so hung up about certain details that the discussion winds up going into a direction that really doesn't make sense, and it could only go there incrementally because it would've been very obvious if you followed the entire chain.

Please bear with me. Consider the primary problem in the first place:

The problem is that we have a bunch of things that are not memory safe, already. Rewriting all of those things is going to take a lot of time and effort.

All of these ideas about isolating requests are totally fine, honestly. If you can afford those sorts of trade-offs in your design, then go hog wild. I find many of these architectural decisions to be a bit pointless for reasons I'm happy to dig into in more detail if people really care, but the important thing to understand is that I'm not telling you to not try to isolate requests. I'm just saying that in practice, there are no "legacy" servers that apply this degree of request isolation. They were architected with the assumption that the process would not get RCE'd, and therefore it would require a full rearchitecting to make them safe in the face of that sort of bug.

But when we talk about architectures where we avoid connection pooling and enforce security policy entirely in the database layer, that essentially means writing new servers. And if you're writing new servers, why in the world would you go through all of this effort when you don't have to? I'll grant you that the Rust borrow checker has a decently steep learning curve, but I don't expect it to be a serious barrier to experienced C++ programmers if they are really trying to learn and not just looking for reasons to not have to. C++ is not an easy language, either.

It would be absolutely insane to perform all of this per-request isolation voodoo and then not start on a clean slate with a memory safe language in the first place. You may as well take both at that point. Fuck it, run each request inside a small KVM context! And in fact, you can do that. Microsoft did that[1], but note that even in their press release for Hyperlight, they are showing it being used in Rust code, because that's the logical order of operations: If you're truly serious about security, and you're building something from scratch, start by preventing vulnerabilities as far left in the process as possible; and you can't go further left than having a programming language that can prevent certain bugs entirely by-design.

> You should fix the problems you can, even if there are problems you cannot.

That goes both ways.

[1]: https://opensource.microsoft.com/blog/2024/11/07/introducing...

by jchw

3/31/2025 at 2:56:33 PM

> All of these ideas about isolating requests are totally fine, honestly. If you can afford those sorts of trade-offs in your design, then go hog wild.

I think far fewer people make some of these trade offs than could afford to.

I think there is a more fundamental problem with databases and data in general. A lot of what servers do involves shared data - think of the typical CRUD app where multiple people can read or modify the same data, and that is the valuable data.

> If you're truly serious about security, and you're building something from scratch, start by preventing vulnerabilities as far left in the process as possible; and you can't go further left than having a programming language that can prevent certain bugs entirely by-design.

Agreed.

by graemep

3/31/2025 at 12:54:00 PM

Databases also have access controls! Which developers don't use.

If you have 10,000 tenants on the same server you can simply have 10,000 database users. And that simple precaution will provide nearly perfect protection against cross-tenant data leaks.

by gizmo

3/31/2025 at 1:18:21 PM

The database was merely an example, there are other shared resources that are going to run into the same problem, like caches. Still, it's weird to imply that database users are meant to map to application users without any kind of justification other than "you can do it" but OK, let's assume that we can. Let's consider Postgres, which will have absolutely no problem creating 10,000 users (not sure about 100,000 or 1,000,000, but we can put that aside anyhow.) You basically have two potential options here:

- The most logical option is to continue to use database pooling as you currently do, and authenticate as a single user. Then, when handling a user request, you can impersonate a specific database user. Only problem is, if you do this, the protection you get is entirely discretionary: the connection is still absolutely authenticated to a user that can do more, and all you have to do is "reset role" and go on your way. So you can do this, but it doesn't help you with server exploits.

- The other way to handle this is by having each request get a separate database connection which is actually authenticated to a specific user. That will work and provide the database level guarantees. However, for obvious reasons, you definitely can't share a global database pool with this approach. That's a problem, because each Postgres connection will cost 5-10 MiB or so. If you had 10,000 active users, you would spend 50-100 GiB on just per-connection resources on your database server box. This solution scales horribly even when the scale isn't that crazy.

And this is all assuming you can actually get the guarantees you need just using the database layer. To that I say, good luck. You'll basically need to do most of your logic in the database instead of the application layer and make use of features like row-level security. You can do this, but it's an extremely limiting architecture, not the least of which because databases are hard to scale except vertically. If you run into any scenario where you outgrow a single database cluster, everything here goes out the window.

Needless to say, nobody does this, and they're totally right. Having a database assist in things like authorization and visibility is not a terrible idea or anything, but all of this taken together is just not very persuasive.

And besides. Postgres itself, along with basically all of the other major databases, are also not necessarily memory-safe. Having external parties have access to your database connection pretty much puts you back at square one for defenses against potential unknown memory safety bugs, making this entire exercise a bit pointless...

by jchw

3/31/2025 at 3:06:49 PM

Most server software today intermingles data that should be strictly isolated because it's convenient. I don't buy your arguments about efficiency either, because despite having no data isolation web software is still comically (tragically) slow.

by gizmo

3/31/2025 at 3:29:52 PM

Three things, very briefly:

- Scaling is different than efficiency.

- Even if you can afford to eat CPU costs, memory limitations are inflexible.

- Again, this moves the problem to the database. The database is still written in C, and still deals with multi-tenant workloads in the same process, and you assume your application server might get RCE'd, which means now someone has a connection to your database, already authenticated, and it may very well have memory safety or other bugs. You can't escape this problem by moving it.

by jchw

3/31/2025 at 4:18:36 PM

The kernel knows about system-local users, but not the remote ones. Servers may need to access data of multiple users at once, so it's not as simple as some setuid+chroot CGI for every cookie received. Kernels like Linux are not designed for that.

Maybe it would be more feasible with some capability-based kernel, but you'd inherently have a lot of logic around user accounts, privileges, and queries. You end up involving the kernel in what is row-level database security. That adds a lot of complexity to the kernel, which also makes the isolation itself have more of the attack surface.

OTOH you can write your logic in a memory-safe language today. The VM/runtime/guaranteed-safe-subset is your "kernel" that protects the process from getting hijacked — an off-by-one error can't cause arbitrary code execution. The VM/runtime itself can still have vulnerabilities, but that just becomes analogous to kernel vulnerabilities.

by pornel

3/31/2025 at 8:20:13 PM

> That adds a lot of complexity to the kernel, which also makes the isolation itself have more of the attack surface.

Not if you remove auth from the kernel: https://doc.cat-v.org/plan_9/4th_edition/papers/auth The Plan 9 kernel is very small and portable, which demonstrates that you don't need complexity to do distributed auth properly. The current OS hegemony is incredibly dated design-wise because their kernels were all designed to run on a single machine.

> OTOH you can write your logic in a memory-safe language today.

Memory safety is not security.

by MisterTea

4/1/2025 at 3:01:07 AM

> Not if you remove auth from the kernel

The factotum looks very much like a microservice or a database with stored procedures handling access control, but of course Plan 9 makes it a file system instead of some RPC. It's a sensible design, but if IPC is the solution, then you don't even need Plan 9 for it.

> Memory safety is not security.

I didn't say it was. However, it is an isolation barrier for the memory-safe code. It's roughly equivalent to process isolation, but in userland. Instead of an MMU you have bounds checks in software.

Kernels implement process isolation cheaply with the help of hardware, but that isn't the only way to achieve the same effect. It can be emulated in software. When the code is memory safe, it can't be made to execute arbitrary logic that isn't in the program's code. If the program attempts some out-of-bounds access, it will be caught with userland checks instead of a page fault, but in either case it won't end up with an illegal memory access.
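
A tiny sketch of what those userland bounds checks look like in practice (purely illustrative; in a memory-safe language the compiler or runtime inserts the equivalent check for you, whereas nothing forces C++ code to go through a wrapper like this):

  #include <cstddef>
  #include <stdexcept>

  template <typename T>
  class checked_span {
  public:
      checked_span(T* data, std::size_t len) : data_(data), len_(len) {}
      T& operator[](std::size_t i) {
          // The software "MMU": an off-by-one index throws instead of silently
          // reading or writing a neighboring object.
          if (i >= len_) throw std::out_of_range("checked_span: index out of range");
          return data_[i];
      }
      std::size_t size() const { return len_; }
  private:
      T* data_;
      std::size_t len_;
  };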

by pornel

4/1/2025 at 5:15:38 AM

> but of course plan9 makes it a file system instead of some RPC.

Actually, Plan 9 does IPC via 9P, the RPC-based file system protocol. The protocol serves a tree of named, byte-addressable objects which are composable per process. The Plan 9 kernel is a VFS multiplexer. It *only* speaks 9P. All disk file systems, e.g. ext4, are served via user-space micro services the Bell Labs people called file servers. Unlike clunky Unix and its copies there are no special files or character files nor ioctl(). It's all via the foundational concept of how people organize resources into "files" and folders (directories) via path names. All this is transparent over the network by default.

The reality is the OS is a very portable, lightweight, channel-based container host, the container being the process. Each has its own namespace, which means its own collection of mounted resources composed of 9P mounts and binds organized into a tree of named objects. Those objects are protected by Unix-like permissions (user/group/everyone-RWX) served from yet another micro service using the same protocol. A process can rfork() more with flags to share resources and control its file system view to only what the child container needs to see. Those containers then fork off more. You can keep firing off boxes with CPU and RAM and whatever else is hanging off via PXE and instantly have access to that compute and those resources. 9P is architecture independent, so file servers running on ARM are no different from any other arch like x86, MIPS, RISC-V, etc.; anyone can mount any other hardware. It's a lightweight, cloud-ready micro service host that was started in the '80s by the same people who made Unix and C. It's friggin wild.

I highly encourage people try to really understand how it works. It's pretty damn eye opening and refreshing. It sorta blew my mind when I saw the process as the container that can fork off more and more with the ability to control the system view of each one. And you can understand the code. Like all of it.

by MisterTea

3/31/2025 at 6:14:34 PM

> Maybe it would be more feasible with some capability-based kernel, but you'd inherently have a lot of logic around user accounts, privileges, and queries. You end up involving the kernel in what is row-level database security. That adds a lot of complexity to the kernel, which also makes the isolation itself have more of the attack surface.

Microkernels/exokernels sacrifice some performance to bring reliable kernels that allow for a reliable userspace.

by vacuity

3/31/2025 at 12:49:28 PM

> doesn't matter what that process does within its own process space

Re. phones -- you assume that a process hacks another process. But there might be a vulnerability within the process itself, corrupting its own memory. Sandboxing doesn't help.

by deepsun

3/31/2025 at 1:06:36 PM

What matters is that a random app cannot access sensitive data like your passwords, sessions, email. On iOS you can run anything from the app store and it's fine. On Windows any .exe you run can cause havoc.

by gizmo

3/31/2025 at 1:24:18 PM

My point is that memory corruption can happen in the password app itself, not in a random app.

by deepsun

3/31/2025 at 2:53:11 PM

Who cares if my password app has memory corruption? There are no negative consequences. Worst case I have to re-download my password database file.

by gizmo

3/31/2025 at 12:26:11 PM

> memory safety is not a great concern on phones because applications are sandboxed. And that's correct. If an application is stuck in a sandbox it doesn't matter what that process does within its own process space. Smartphones taught us what we already knew: process isolation works.

I thought we learned that this doesn't work after the iOS 0-click exploit chain running "sandboxed" app code in the kernel.

by VWWHFSfQ

3/31/2025 at 12:51:06 PM

Good job sandbox escapes and local root exploits never exist!

by IshKebab

3/31/2025 at 12:59:11 PM

1. The point is you don't need root when the unprivileged process already has access to all data. There is nothing more to be gained.

2. Good luck breaking out of your AWS instance into the hypervisor.

by gizmo

3/31/2025 at 2:38:14 PM

> 2. Good luck breaking out of your AWS instance into the hypervisor.

It doesn't take luck. Just skill, resources, and motivation. Like we already saw happen to the iOS "sandbox".

by VWWHFSfQ

3/31/2025 at 9:26:43 AM

> For high assurance, these foundations must be rewritten in memory-safe languages like Go and Rust [10]; however, history and estimates suggest this will take a decade or more [31].

The world runs on legacy code. CISA is correct that rewrites are needed for critical software [1][2] but we know how rewrites tend to go, and ROI on a rewrite is zero for most software, so it will take far more than a decade if it happens at all. So score one for pragmatism with this paper! Hope CISA folks see it and update their guidance.

[1] https://www.cisa.gov/news-events/news/urgent-need-memory-saf... [2] https://www.cisa.gov/resources-tools/resources/case-memory-s...

by cadamsdotcom

3/31/2025 at 11:57:31 AM

I don't see what updates you expect on CISA guidance, because the very documents you reference already acknowledge that rewriting all code is not a viable strategy in the general case and that other options for improvement exist.

For example, they recommend evaluating hardware backed solutions such as CHERI

> There are, however, a few areas that every software company should investigate. First, there are some promising memory safety mitigations in hardware. The Capability Hardware Enhanced RISC Instructions (CHERI) research project uses modified processors to give memory unsafe languages like C and C++ protection against many widely exploited vulnerabilities. Another hardware assisted technology comes in the form of memory tagging extensions (MTE) that are available in some systems. While some of these hardware-based mitigations are still making the journey from research to shipping products, many observers believe they will become important parts of an overall strategy to eliminate memory safety vulnerabilities.

And acknowledge that strategies will need to be adjusted for every case

> Different products will require different investment strategies to mitigate memory unsafe code. The balance between C/C++ mitigations, hardware mitigations, and memory safe programming languages may even differ between products from the same company. No one approach will solve all problems for all products.

However, and that's where the meat of the papers is: They require you to acknowledge that there is a problem and do something about it

> The one thing software manufacturers cannot do, however, is ignore the problem. The software industry must not kick the can down the road another decade through inaction.

and the least you can do is make it a priority and make a plan

> CISA urges software manufacturers to make it a top-level company goal to reduce and eventually eliminate memory safety vulnerabilities from their product lines. To demonstrate such a commitment, companies can publish a “memory safety roadmap” that includes information about how they are modifying their software development lifecycle (SDLC) to accomplish this goal.

It's clearly not the case that these papers say "Rewrite all in Rust, now!". They do strongly advocate in favor of using memory safe languages for future development and I believe that's the rational stance to take, but they appear well grounded in their stance on existing software.

by Xylakant

3/31/2025 at 12:53:21 PM

Google has shown that you get the biggest benefit by writing new code in memory-safe languages, so it's not like security doesn't drastically improve until you've rewritten everything.

by IshKebab

3/31/2025 at 11:06:47 AM

I don't know if the ROI on rewrites is zero; following Prossimo's work, I'm seeing lots of performance and maintainability improvements over existing software.

by berratype

3/31/2025 at 11:32:09 AM

> Hope CISA folks see it and update their guidance.

cope > hope @ CISA right now:

https://www.darkreading.com/cyberattacks-data-breaches/cisa-...

by DyslexicAtheist

3/31/2025 at 1:15:10 PM

I wish HN readers were not so afraid to openly discuss this more, but it is a much bigger issue than even what that article lets on (which is a good article, I might add). Cuts to critical agencies like this aren't about efficiency at all; they're about hamstringing roadblocks that might be in your way. That means we can expect one of two things in the near future of the US:

1. Enemies of the US gov't exploit the weakness, possibly conspiring with the people who created the weakness to do so.

2. A near or complete collapse of the government as, lo and behold, it is discovered that none of them actually knew what they were doing, regardless of the confidence in which they said otherwise.

Either way, we, the people trying to keep industries moving and bring home food to put on the table, will suffer.

by 0xEF

3/31/2025 at 10:11:12 AM

There is actually an interesting niche that one can carve out when dealing with an attacker who has a memory corruption primitive, but this paper is a bit too simple to explore that space. Preventing RCE is too broad of a goal; attackers on the platforms listed continue to bypass implementations of the mitigations presented and achieve some form of RCE. The paper suggests these are because of implementation issues, and some are clearly bugs in the implementation, but many are actually completely novel and unaddressed workarounds that require a redesign of the mitigation itself.

For example, “heap isolation” can be done by moving allocations away from each other such that a linear overflow will run into a guard page and trap. Is it an implementation bug or a fundamental problem that an attacker can then poke bytes directly into a target allocation rather than linearly overwriting things? Control flow integrity has been implemented but attackers then find that, in a large application, calling whole functions in a sequence can lead to the results they want. Is this a problem with CFI or that specific implementation of CFI?

One of the reasons that memory safety is useful is that it’s a lot easier to agree on what it is and how to achieve it, and with that what security properties it should have. Defining the security properties of mitigations is quite a bit harder. That isn’t to say that they’re not useful, or can’t be analyzed, but generally the result is not actually denial of RCE.
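
For readers who want the guard-page variant mentioned above made concrete, a minimal POSIX-only sketch (illustrative only, not how any particular allocator such as PartitionAlloc or kalloc_type does it): the object is placed flush against an inaccessible page, so a linear overflow faults immediately, while an attacker who can write at an arbitrary offset into the target allocation is unaffected, which is exactly the ambiguity described above.

  #include <sys/mman.h>
  #include <unistd.h>
  #include <cstddef>

  void* guarded_alloc(std::size_t size) {
      const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
      const std::size_t data_pages = (size + page - 1) / page;
      const std::size_t total = (data_pages + 1) * page;           // +1 trailing guard page
      void* base = mmap(nullptr, total, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
      if (base == MAP_FAILED) return nullptr;
      char* bytes = static_cast<char*>(base);
      mprotect(bytes + data_pages * page, page, PROT_NONE);        // linear overruns now trap
      return bytes + data_pages * page - size;                     // object ends at the guard
  }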

by saagarjha

3/31/2025 at 9:12:41 AM

This looks really useful. Doesn't fix the problem of memory corruption but mostly seems to limit the ability to convert that into remote code execution. And all the techniques are already in widespread use, just not the default or used together.

I would not be surprised if attackers still manage to find sneaky ways to bypass all the 4 protections, but it would certainly raise the bar significantly.

by MattPalmer1086

3/31/2025 at 9:55:06 AM

It's already been demonstrated to be insufficient; otherwise Apple software would now be impregnable. That's not to say these protections are a bad idea; they should be universal as they substantially reduce the existing attack surface - but the paper massively over-sells them as a panacea.

by lambdaone

3/31/2025 at 11:04:34 AM

Most server software these days is already written in memory-safe languages yet still has security vulnerabilities. If you reduce the vulnerabilities due to memory safety issues by, say, 90%, then there's nothing special about them anymore. On the other hand, memory-safe languages also don't fully eliminate memory safety issues as they depend on operations or components that may not themselves be fully memory-safe (BTW, Rust has this problem more than other memory-safe languages).

So a "panacea" in this context doesn't mean making the stack memory-safe (which nobody does, anyway) but rather making the vulnerabilities due to memory safety no more common or dangerous than other causes of vulnerabilities (and remember that rewriting software in a new language also introduces new vulnerability risks even when it reduces others). Whether these techniques actually accomplish that or not is another matter (the paper makes empirical claims without much evidence, just as some Rust fans do).

by pron

3/31/2025 at 12:07:21 PM

> BTW, Rust has this problem more than other memory-safe languages

Do you have a citation there?

I've run into a ton of memory safety issues in Go because two goroutines concurrently modifying a map or pointer is a data race, which leads to memory unsafety... and Go makes it wildly easy to write such data races. You need to manually add mutexes everywhere for Go to be memory safe.

Rust, on the other hand, I've yet to run into an actual memory safety issue. Like, I'm at several hundred in Go, and 0 in rust.

I'm curious why my experience is so different.

by TheDong

3/31/2025 at 4:48:23 PM

The Rust compiler assumes that unsafe code still follows the borrow rules and optimizes code based on that. So if that is not the case, the consequences can be bad. For example, there was a bug in the Rust OpenSSL bindings where OpenSSL had not documented the proper lifetime of some of the function arguments/return types. As a result, unsafe code containing the call to the OpenSSL C implementation violated the Rust memory model, leading to crashes.

But I agree that such problems are rare in Rust and Go races are much more problematic.

by fpoling

3/31/2025 at 12:24:10 PM

In Rust it is more common (and/or more necessary) than in Java to either write unsafe code or call out to unsafe code written in some other language.

BTW, I wouldn't call a language that violates memory safety without an explicit unsafe operation a memory-safe language.

by pron

3/31/2025 at 12:15:48 PM

I think the key is "as they depend on operations or components that may not themselves be fully memory-safe", indicating Rust is used a fair bit more to do interop with memory-unsafe languages, unlike Go or Java.

by commandersaki

3/31/2025 at 12:18:30 PM

I sense OP's referring to raw pointer operations permitted by the unsafe dialect of the language.

by meltyness

4/1/2025 at 4:58:24 AM

What vulnerabilities does Rust introduce over C/C++?

by kobebrookskC3

3/31/2025 at 3:06:40 PM

Was discussing this paper with a few colleagues who work in this area, and concluded that this paper seems like an odd combination of:

- The author citing their own research. (Ok, all researchers do this)

- Mildly scolding the industry for not having applied their research. It's "pragmatic" after all.

The elephant in the room is that these approaches have been widely deployed and their track record is pretty questionable. iPhone widely deploys PAC and kalloc_type. Chrome applies CFI and PartitionAlloc. Android applies CFI and Scudo. Yet memory safety exploitation still regularly occurs against these targets. Is it harder because of these technologies? Probably. But if they're so effective, why are attackers still regularly successful at exploiting memory safety bugs? And what's the cost of applying these? Does my phone's battery die sooner? Is it slower? So now your phone/browser are slower AND still exploitable.

by linux_security

3/31/2025 at 9:23:28 AM

>A Pragmatic Security Goal

>Remote Code Execution (RCE) attacks where attackers exploit memory-corruption bugs to achieve complete control are a very important class of potentially-devastating attacks. Such attacks can be hugely disruptive, even simply in the effects and economic cost of their remediation [26]. Furthermore, the risk of such attacks is of special, critical concern for server-side platform foundations [10]. Greatly reducing the risk of RCE attacks in C and C++ software, despite the presence of memory-corruption bugs, would be a valuable milestone in software security especially if such attacks could be almost completely prevented. We can, therefore, aim for the ambitious, pragmatic goal of preventing most, or nearly all, possibilities of RCE attacks in existing C and C++ software without memory safety. Given the urgency of the situation, we should only consider existing, practical security mechanisms that can be rapidly deployed at scale.

I don't know if it's obvious to anyone else that this is AI-written or if it's just me/if I'm mistaken

by sebstefan

3/31/2025 at 9:32:08 AM

I am not sure, and it may be this person's culture/background, but I do know that at a college/uni, your advisors/reviewers would tell you not to do the adjective/drama stuff, as it adds no real value to a scientific/technical paper.

e.g. potentially-devastating, hugely disruptive, special critical, greatly reducing, valuable milestone, almost completely, ambitious pragmatic, most or nearly all, existing practical.

by readingnews

3/31/2025 at 9:29:50 AM

It’s not obvious to me. I cannot say one way or the other.

by dgellow

3/31/2025 at 6:25:59 PM

I've always favored a large public/private investment into open-source tools like Coverity, PVS Check, and RV-Match. Put extra effort into suppressing false positives and autofixing simple problems. Companies like Apple had enough money to straight up buy the vendors of these tools.

I'd also say, like CPAChecker and Why3, they should be designed in a flexible way where different languages can easily be added. Also, new passes for analyzers. Then, just keep running it on all the C/C++ code in low-false-positive mode.

On top of this, there have been techniques to efficiently do total memory safety. Softbound + CETS was an example. We should invest in more of those techniques. Then, combine the analyzers with those tools to only do runtime checks on what couldn't be proven.

by nickpsecurity

3/31/2025 at 9:48:51 AM

> However, their use is the exception, not the rule, and their use—in particular in combination—requires security expertise and investment that is not common. For them to provide real-world, large-scale improvements in the security outcomes of using C and C++ software, there remains significant work to be done. In particular, to provide security benefits at scale, for most software, these protections must be made an integral, easy-to-use part of the world-wide software development lifecycle. This is a big change and will require a team effort.

That's the core problem.

The mechanisms mentioned are primarily attack detection and mitigation techniques rather than prevention mechanisms. Bugs can't be exploited as easily, but they still exist in the codebase. We're essentially continuing to ship faulty software while hoping that tooling will protect us from the worst consequences.

Couldn't one argue that containers and virtual machines also protect us from exploiting some of these memory safety bugs? They provide isolation boundaries that limit the impact of exploits, yet we still consider them insufficient alone.

It's definitely a step in the right direction, though.

The paper mentions Rust, so I wanted to highlight a few reasons why we still need it for people who might mistakenly think this approach makes Rust unnecessary:

  - Rust's ownership system prevents memory safety issues at compile time rather than trying to mitigate their effects at runtime  
  - Rust completely eliminates null pointer dereferencing  
  - Rust prevents data races in concurrent code, which the paper's approach doesn't address at all  
  - Automatic bounds checking for all array and collection accesses prevents buffer overflows by design  
  - Lifetimes ensure pointers are never dangling, unlike the paper's approach which merely tries to make dangling pointers harder to exploit
So, we still need Rust, and we should continue migrating more code to it (and similar languages that might emerge in the future). The big idea is to shift bug detection to the left: from production to development.

by mre

3/31/2025 at 10:39:33 AM

We're essentially continuing to ship faulty software while hoping that tooling will protect us from the worst consequences.

Yet one way to measure how well these mitigations and countermeasures are working is to look at the cost of the zero-day market. The trend continues upwards into the stupidly expensive realm due to needing multiple chains and such to attack software. However, I'm not discounting that software now developed in memory-safe languages already contributes to this.

Here is one of the references indicating this in the article: https://techcrunch.com/2024/04/06/price-of-zero-day-exploits...

by commandersaki

3/31/2025 at 10:32:11 AM

Or Java, or Scala, or Go, or whathaveyou. This is about existing software.

by tgv

3/31/2025 at 2:32:30 PM

While all of the languages you mention are memory safe (as is almost every programming language released after 1990), none of them solve all of the safety problems mentioned above, in particular, the two points:

  - Rust completely eliminates null pointer dereferencing  
  - Rust prevents data races in concurrent code, which the paper's approach doesn't address at all  
Scala comes closest to solving these points, since it has optional features (in Scala 3) to enable null safety or you could build some castles in the sky (with Scala 2) to avoid using null and make NPEs more unlikely. The same goes for concurrency bugs: you can use alternative concurrency models that make data races harder (e.g. encapsulate all your state inside Akka actors).

With Go and Java, no dice. These languages lack the expressive power (by design! since they are touted as "simple" languages) to do anything that will greatly reduce these types of bugs, without resorting to external tools (e.g. Java static analyzers + annotations or race condition checkers).

In short, Rust is one of the only mainstream languages that absolutely guarantees safety from race conditions, and complete null safety (most other languages that provide good null safety mechanisms like C#, Kotlin and TypeScript are unsound due to reliance on underlying legacy platforms).

by unscaled

3/31/2025 at 8:27:23 PM

Nil dereferencing in those languages doesn’t make them unsafe. It throws an exception or panics. And Java has some non-null annotation, IIRC.

Still, none of this is relevant to existing software written in C. This is not about a rewrite.

And if it were, Rust doesn’t offer perfect safety, as many tasks almost demand unsafe code. Whereas that doesn’t happen in Go, Scala, etc. Every situation requires its own approach.

by tgv

3/31/2025 at 4:31:48 PM

How is NPE a safety issue?

by eklavya

3/31/2025 at 12:21:00 PM

> We're essentially continuing to ship faulty software while hoping that tooling will protect us from the worst consequences.

It is the consequences that give rise to the cost of a bug or the value of preventing it.

> Couldn't one argue that containers and virtual machines also protect us from exploiting some of these memory safety bugs? They provide isolation boundaries that limit the impact of exploits, yet we still consider them insufficient alone.

No, that's not the same because sandboxing only limits the impact of vulnerabilities to whatever the program is allowed to do in the first place, and that is, indeed, insufficient. The mechanisms here reduce the impact of vulnerabilities to less than what the program is allowed to do. To what extent they succeed is another matter, but the two are not at all comparable.

> The big idea is to shift bug detection to the left: from production to development.

This has been the idea behind automated tests since the practice first gained popularity. But it's important to understand that it works due to a complicated calculus of costs. In principle, there are ways to eliminate virtually all bugs with various formal methods, yet no one is proposing that this should be the primary approach in most situations because despite "shifting left" it is not cost effective.

Everyone may pick their language based on their aesthetic preference and attraction to certain features, but we should avoid sweeping statements about software correctness and the best means to achieve it. Once there are costs involved (and all languages that prevent certain classes of bugs exact some cost) the cost/benefit calculus becomes very complex, with lots of variables. Practical software correctness is an extremely complicated topic, and it's rare we can make sweeping universal statements. What's important is to obtain empirical data and study it carefully.

For example, in the 1970s, the prediction by the relevant experts was that software would not scale without formal proof. Twenty years later, those predictions were proven wrong [1] as unsound (i.e. without absolute guarantees) software development techniques, such as code review and automated tests, proved far more effective than anticipated, while sound techniques proved much harder to scale beyond a relatively narrow class of program properties.

Note that this paper also makes some empirical claims without much evidence, so I take its claims about the effectiveness of these approaches with the same scepticism as I do the claims about the effectiveness of Rust's approach.

[1]: https://6826.csail.mit.edu/2020/papers/noproof.pdf

by pron

3/31/2025 at 1:38:07 PM

> Everyone may pick their language based on their aesthetic preference and attraction to certain features, but we should avoid sweeping statements about software correctness and the best means to achieve it. Once there are costs involved (and all languages that prevent certain classes of bugs exact some cost) the cost/benefit calculus becomes very complex, with lots of variables. Practical software correctness is an extremely complicated topic, and it's rare we can make sweeping universal statements.

Thank you. I feel like this perspective is forever being lost in these discussions -- as if gaining the highest possible level of assurance with respect to security in a critical system were a simple matter of choosing a "safe language" or flipping some switch. Or conversely, avoiding languages that are "unsafe."

It is never this simple. Never. And when engineers start talking this way in particular circumstances, I begin to wonder if they really understand the problem at hand.

by sramsay

4/1/2025 at 5:07:55 AM

Does making software secure scale? I see exploits for Android/iOS even though they spend millions if not billions on securing it. Which unsound techniques make exploits unfeasible? I'm not even looking for a guarantee, just an absence in practice.

by kobebrookskC3

4/1/2025 at 11:55:59 AM

Well, software today is much bigger and of higher quality than was thought possible in the seventies. That's not to say that it can scale indefinitely, but my point was that unsound methodologies (i.e. not formal proofs) work much better than expected, and the software correctness world has moved from the seventies' "soundness is the only way" to "software correctness is a complex game of costs and benefits, a combination of sound and unsound techniques is needed, and we don't know of an approach that is universally better than others; there are too many variables".

by pron

3/31/2025 at 2:59:39 PM

OT: Are there any memory safe languages that are fast and support goto?

I'm writing something that needs to implement some tax computations and I want to implement them to follow as closely as possible the forms that are used to report those computations to the government. That way it is easy to be sure they are correct and easy to update them if the rules change.

The way those forms work is something like this:

  1. Enter your Foo: _________
  2. Enter your Bar: _________
  3. Add line 1 and line 2: ________
  4: Enter your Spam: _______
  5: Enter the smaller of line 1 and 4: _____
  6: If line 5 is less than $1000 skip to line 9
  7: Enter the smaller of line 2 and $5000: _____
  8: If line 7 is greater than line 4 skip to 13
  ...
With goto you can write code that exactly follows the form:

  Line1: L1 = Foo;
  Line2: L2 = Bar;
  Line3: L3 = L1 + L2;
  Line4: L4 = Spam;
  Line5: L5 = min(L1, L4);
  Line6: if (L5 < 1000) goto Line9;
  Line7: L7 = min(L2, 5000);
  Line8: if (L7 > L4) goto Line13;
  ...
For some forms an

  if (X) goto Y
    ....
  Y:
can be replaced by

  if (!X) {
     ...
  }
because nothing before that has a goto into the body of the if statement. But some forms do have things jumping into places like that. Also jumping out of what would be such a body into the body of something later.

Writing those without goto tends to require duplicating code. The duplication in the source code could be eliminated with a macro system but don't most memory safe languages also frown on macro systems?

Putting the duplicate code in separate functions could also work but often those sections of code refer to things earlier in the form so some of the functions might need a lot of arguments. However the code then doesn't look much like the paper form so it is harder to see that it is correct or to update it when the form changes in different years.

by tzs

4/1/2025 at 3:26:56 AM

Rust has macros. It also has labeled blocks that you can break out of, which are similar to goto except with more nesting required. You could plausibly reduce the nesting with a macro though.

In most languages I'd just solve this by ending each block with a call to the next block. The "too many arguments required" problem can be addressed with closures.

by ameliaquining

3/31/2025 at 6:12:39 PM

> OT: Are there any memory safe languages that that are fast and support goto?

Irreducible control flow is a pain for static analysis.

> Writing those without goto tends to require duplicating code

This feels like a regular old state machine to me, which obviously is nice to write with goto, but isn't required.
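
A minimal sketch of that state-machine shape, using the hypothetical Foo/Bar/Spam form from the comment above (lines 9 onward and the final result are elided, just as in the original example):

  #include <algorithm>

  double compute_form(double Foo, double Bar, double Spam) {
      double L1 = 0, L2 = 0, L3 = 0, L4 = 0, L5 = 0, L7 = 0;
      int line = 1;
      while (true) {
          switch (line) {
              case 1: L1 = Foo;                  line = 2; break;
              case 2: L2 = Bar;                  line = 3; break;
              case 3: L3 = L1 + L2;              line = 4; break;
              case 4: L4 = Spam;                 line = 5; break;
              case 5: L5 = std::min(L1, L4);     line = 6; break;
              case 6: line = (L5 < 1000) ? 9 : 7;          break;  // "skip to line 9"
              case 7: L7 = std::min(L2, 5000.0); line = 8; break;
              case 8: line = (L7 > L4) ? 13 : 9;           break;  // "skip to line 13"
              // case 9 ... case 13: remaining lines of the form go here
              default: return L3;                                  // placeholder result
          }
      }
  }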

by steveklabnik

3/31/2025 at 3:07:06 PM

Maybe you could build a domain-specific language.

by returningfory2

4/1/2025 at 9:02:03 AM

This paper would have been really compelling in 2005-2010, but in 2025 there's too much evidence that these approaches do not result in C++ that is secure. The author cites a number of projects that have broadly applied these techniques, like Chrome and iOS, but these code bases continue to be exploited regularly despite these protections. If you actually look at where those projects are investing, it's on moving to Rust/Swift.

by cartalk

3/31/2025 at 4:48:32 PM

The paper significantly overstates the scope of PAC (pointer authentication codes) on Apple platforms. To quote the paper:

> This is effectively what is done in Apple software, which uses special ARM hardware support to also check pointer integrity at runtime—i.e., ensure each pointer access uses pointers of the right type[]. Apple uses this further to enforce a form of stack integrity, control-flow integrity, and heap integrity

In reality, the compiler only automatically applies PAC to code pointers: stack return addresses, function pointers, and C++ vtable method pointers. You can also manually apply PAC to other pointers using attributes and intrinsics; this is used by components like the Objective-C runtime and the memory allocator. But only a tiny amount of code does this.

PAC is nevertheless a powerful mitigation. But let’s see how it stacks up against the paper’s claims:

- Heap integrity:

Since Apple’s implementation of PAC leaves most data pointers unsigned, it has little direct relevance to heap integrity. The paper seems to want to sign all pointers. You could theoretically implement a compiler feature to use PAC instructions for all pointers, but I don’t think anyone (not just Apple) has implemented such a thing. It would probably come with high performance and compatibility costs.

- Stack integrity:

The paper defines this as “attackers cannot change the arguments, local variables, or return sites”. PAC makes it difficult to change return sites, but does nothing to prevent changing arguments and local variables (unless you're limited to a linear overwrite). Again, it's theoretically possible to use PAC instructions to secure those things: there is a technique to make a single signature that combines multiple pointer-sized values, so you could try to make one signature that covers the whole set of local variables and other stack bits. But nobody does this, so the compatibility and performance costs are unknown. Even SafeStack (which the paper also cites) does not fully protect local variables, though it gets closer.

- Control-flow integrity:

The paper mentions “type signatures”, but Apple’s PAC-based CFI does not validate type signatures for C function pointers, only vtables and Objective-C isa pointers. Other CFI implementations do validate C function pointer type signatures, like Android, though this seems to come at the cost of a slower pace of adoption.

More importantly, attackers have demonstrated the ability to get around CFI, by substituting valid but irrelevant function pointers to achieve “jump-oriented programming” (JOP). Project Zero recently published a blog post explaining a (semi-recent) iOS exploit that used this technique:

https://googleprojectzero.blogspot.com/2025/03/blasting-past...

I’m not sure whether type signature validation would have prevented this particular exploit, but many C function pointers have pretty simple signatures (yet may do wildly different things), so the benefit is somewhat limited.

by comex

3/31/2025 at 11:23:04 AM

Not to be harsh, but this article is messy, and I find myself agreeing with the comments regarding hyperbole and AI-writing. It's a directionless nothingburger that hasn't really made any effort to look into recent research on memory allocation and hardening.

Some examples:

In the heap section, the article only cites ±20-year-old papers and mixes user- and kernel-space allocators such as Apple's kernel kalloc_type and one for Chrome. Not to mention that the author talks about implementing heap regions per object as if that won't introduce significant memory and performance overhead; it is completely unrealistic in a commodity OS, let alone in a server setting.

The pointer integrity section ends with the statement "The above protections, in combination, can prevent complete software control by attackers able to corrupt memory." - which is not true (see a recent CVE I found after a quick search [1]), even for relatively hardened Apple products that use pointer authentication and a type-segregated allocator, kalloc_type.

Additionally, literally the next sentence in the subsequent section contradicts the previous statement (!): "...therefore, attackers will still be able to change software behavior by corrupting memory". Really dampens the credibility.

If anyone is interested in some recent contributions in the space, I've been looking into these recently (in no particular order): SeaK [2], ViK [3] and Safeslab [4].

[1] https://www.cvedetails.com/cve/CVE-2025-24085/ [2] https://www.usenix.org/conference/usenixsecurity24/presentat... [3] https://dl.acm.org/doi/10.1145/3503222.3507780 [4] https://dl.acm.org/doi/10.1145/3658644.3670279

by xianga

3/31/2025 at 1:24:13 PM

commenting before reading it but I guess memory arena???

by tonyhart7

3/31/2025 at 9:09:01 AM

Looks like they're finally coming to terms with C++'s flaws.

by m00dy

3/31/2025 at 9:25:20 AM

"they"

by michaelsshaw

3/31/2025 at 10:09:34 AM

C and C++ code is paradigmatic in being susceptible to CLI security vulnerabilities.

Object-oriented languages typically work in the set A-Z, with limited characters, parameters, etc...

Whereas Wittgenstein's concept of the private language is internal and discursive in Skinnerite probabilistic capacities.

by awaymazdacx5