alt.hn

3/31/2026 at 1:29:53 PM

Zml-smi: universal monitoring tool for GPUs, TPUs and NPUs

https://zml.ai/posts/zml-smi/

by steeve

4/5/2026 at 11:30:39 AM

Look into all-smi (https://github.com/lablup/all-smi). It supports every GPU imaginable, including Apple Silicon, as well as many AI accelerator cards.

by serialx

3/31/2026 at 1:52:20 PM

Renaming fopen64 to intercept library calls feels like a brittle hack masquerading as "sandboxing." Why not just upstream this hardware support to nvtop instead of fragmenting the ecosystem?

by mrflop

3/31/2026 at 2:03:09 PM

Sadly, sandboxing is something that can't be upstreamed. This way, the sandboxing stays in zml instead of requiring a patch to Mesa.

As for nvtop: great program, but it was missing a few features we needed (such as sandboxing).

by steeve

4/5/2026 at 7:53:25 AM

It looks cool, and I was excited to get monitoring for the NPU on my Ryzen AI 395+. Unfortunately, the NPU does not show up. NPU support in Linux really seems to be an afterthought.

by pstuart

4/5/2026 at 7:55:48 AM

Weird, because we tried it. It doesn’t show anything?

We use amdsmi to get the metrics. I’ll investigate.

by steeve

4/5/2026 at 7:56:53 AM

If this logic were pushed into nvtop, wouldn't the codebase become unmaintainable? Each vendor's interception method is going to be different.

by marwanet

4/5/2026 at 8:24:41 AM

[dead]

by nareyko

4/5/2026 at 2:15:58 PM

Is it capable of exposing metrics in Prometheus format?
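
For context, here is a minimal sketch of what "Prometheus format" means in practice: the plain-text exposition format that a scraper reads over HTTP. The function name `render_prometheus` and the metric names below are hypothetical and only illustrate the format; they are not zml-smi's actual output or API.

```python
def escape_label(value):
    # Label values must escape backslash, double quote, and newline.
    return str(value).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_prometheus(samples):
    """Render gauge samples in the Prometheus text exposition format.

    `samples` is a list of (name, help_text, labels, value) tuples.
    Samples sharing a metric name are assumed to be adjacent in the list.
    """
    lines = []
    seen = set()
    for name, help_text, labels, value in samples:
        if name not in seen:
            # HELP and TYPE metadata are emitted once per metric name.
            lines.append(f"# HELP {name} {help_text}")
            lines.append(f"# TYPE {name} gauge")
            seen.add(name)
        if labels:
            label_str = ",".join(
                f'{key}="{escape_label(val)}"' for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical metric names, for illustration only.
    print(render_prometheus([
        ("accel_utilization_ratio", "Device utilization (0-1).", {"device": "0"}, 0.5),
        ("accel_memory_used_bytes", "Device memory in use.", {"device": "0"}, 1024),
    ]), end="")
```

Serving that text from an HTTP endpoint (conventionally `/metrics`) is enough for a Prometheus server to scrape it.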

by imcritic

4/5/2026 at 2:27:55 PM

consider it done

by steeve

4/5/2026 at 2:51:20 PM

It would be nice to have CPU usage added, so I have everything in one place.

Currently I use btop, which shows basic GPU load along with CPU, network, etc.

by synergy20

4/5/2026 at 9:49:18 AM

"NPU" seems to refer to Trainium only?

by 152334H