4/19/2026 at 6:52:05 AM
> Apple Silicon changes the physics. The CPU and GPU share the same physical memory (Apple's Unified Memory Architecture) ... no bus!

Beware the reality distortion field: this is of course how it's worked on most x86 machines for a long time, and also on most Macs back when they were using Intel chips.
by fulafel
4/19/2026 at 7:00:24 AM
Why did all my x86 onboard iGPUs reserve a fixed amount of RAM at boot, inaccessible to the OS? Why do dGPUs bring their own VRAM, and how can it be directly manipulated from the CPU without copying?

by littlecranky67
4/19/2026 at 10:02:00 AM
Correct me if I'm wrong, but that reserved memory is for the framebuffer? The iBoot bootloader also reserves some memory for the framebuffer.

dGPUs bring their own VRAM because it's a different type of memory, allowing them to get higher performance than they could with DDR. The M4 Max requires 128GB of LPDDR5X to reach its ~500GB/s bandwidth. The RX Vega 64 had that same bandwidth in 2017 with just 8GB of HBM2.
by ben-schaaf
4/19/2026 at 10:51:42 AM
Nope, the reserved memory is what's available to use from the various APIs (VK, GL, etc). More recently there's OS support for flexible on-demand allocation by the GPU driver.

Of course, the APIs have allowed you to make direct use of pointers to CPU memory for something like a decade. However, that requires maintaining two separate code paths, because doing so while running on a dGPU is _extremely_ expensive.
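A minimal Metal sketch of what that branching tends to look like (the `makeGPUBuffer` helper name and the managed-mode fallback are my illustrative assumptions, not anyone's production code):

```swift
import Metal

// Sketch: zero-copy wrap on unified-memory devices; explicit
// device-side copy on dGPUs, where shaders reading host memory
// over the bus would be extremely slow.
// `base` must be page-aligned and `length` a multiple of the page
// size for the bytesNoCopy path to succeed.
func makeGPUBuffer(device: MTLDevice,
                   base: UnsafeMutableRawPointer,
                   length: Int) -> MTLBuffer? {
    if device.hasUnifiedMemory {
        // UMA: alias the existing host allocation directly; no copy.
        return device.makeBuffer(bytesNoCopy: base,
                                 length: length,
                                 options: .storageModeShared,
                                 deallocator: nil)
    } else {
        // dGPU (e.g. Intel Macs): managed storage keeps a separate
        // device-side copy that Metal synchronizes -- data is copied.
        return device.makeBuffer(bytes: base,
                                 length: length,
                                 options: .storageModeManaged)
    }
}
```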
by fc417fc802
4/19/2026 at 6:19:03 PM
As someone who's worked on GPU drivers for shared-memory systems for over 15 years, supporting hardware that was put on the market over 20 years ago: they've "always" (in my experience) been able to dynamically assign memory pages to the GPU.

The "reserved" memory is more about the guaranteed minimum to allow the thing to actually light up. Sometimes specific hardware blocks had more limited requirements (e.g. the display block might require contiguous physical addresses, as might the MMU data/page tables themselves), so we would reserve a chunk to ensure those allocations could actually be satisfied. But that tended to be a small proportion of the total "GPU memory used".
Sure, sharing the virtual address space is less well supported, but the total amount of memory the GPU can use is flexible at runtime.
by kimixa
4/19/2026 at 7:11:00 AM
To the first question: blame Windows, I guess. But even on older chips, GPU code could access memory allocated on the CPU side, so this didn't cap the amount of data your GPGPU code could crunch.

by fulafel
4/19/2026 at 6:09:01 PM
I remember this was mostly a BIOS setting for how much memory to allocate to the iGPU - and once set in the BIOS, that memory was not accessible to the underlying OS (besides GPU I/O).

by littlecranky67
4/19/2026 at 7:45:16 PM
Yes, but this was to appease Windows - probably older versions and/or 32-bit versions of it.

by fulafel
4/19/2026 at 11:46:17 PM
Agree - maybe "changes the physics" was too strong; shared CPU/GPU memory is not new. What's different is the combination of:
1. UMA memory (and yes, iGPUs had this pre-M1)
2. enough bandwidth / GPU throughput for local inference
3. a straightforward `makeBuffer(bytesNoCopy:)` path (sketched below)
So, the novelty isn't the shared memory itself, but the whole chain lining up to make the Wasm linear memory -> Metal-buffer approach practical + performant enough.
(and not saying there's some Apple Silicon magic here either ... it'd work anywhere there was UMA and no-copy host-pointer path)
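For the curious, a minimal sketch of that no-copy path, assuming macOS Metal. The page-aligned allocation here is just a stand-in for a Wasm linear memory region:

```swift
import Darwin
import Metal

// Page-aligned allocation standing in for a Wasm linear memory;
// makeBuffer(bytesNoCopy:) requires page alignment and a length
// that is a multiple of the page size.
let pageSize = Int(getpagesize())
let length = 16 * pageSize
var raw: UnsafeMutableRawPointer?
posix_memalign(&raw, pageSize, length)

guard let device = MTLCreateSystemDefaultDevice(), let base = raw else {
    fatalError("no Metal device / allocation failed")
}

// On UMA the GPU reads the very same pages the CPU writes: no blit,
// no staging buffer. The deallocator frees the memory when the
// MTLBuffer is released.
let buffer = device.makeBuffer(bytesNoCopy: base,
                               length: length,
                               options: .storageModeShared,
                               deallocator: { ptr, _ in free(ptr) })

// CPU-side writes through `base` are now visible to GPU kernels
// that bind `buffer` -- the zero-copy chain described above.
```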
by agambrahma