5/11/2026 at 5:09:28 PM
This is amazing... I've been working with custom CUDA kernels and https://crates.io/crates/cudarc for a long time, and this honestly looks like it could be a near drop-in replacement. I'm especially curious how build times would compare. Most Rust CUDA crates obviously rely on calling CMake or nvcc, which can make compilation painfully slow. Coincidentally, just last week I was profiling build times and found that tools like sccache can dramatically reduce rebuild times by caching artifacts, but you still end up paying for expensive custom nvcc invocations (e.g. Candle by Hugging Face calls a custom nvcc command in its kernel compilation): https://arpadvoros.com/posts/2026/05/05/speeding-up-rust-whi...
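(For anyone curious, wiring sccache into a Rust build is just a matter of setting the `RUSTC_WRAPPER` environment variable; this is a minimal sketch assuming sccache is already installed:)

```shell
# Route all rustc invocations through sccache so unchanged crates
# are served from the cache on rebuilds.
export RUSTC_WRAPPER=sccache
cargo build --release

# Inspect cache hit/miss counts to see how much it's helping.
sccache --show-stats
```

Note this only caches the Rust side; as the linked post says, custom nvcc invocations in build scripts are outside sccache's reach unless you wrap them separately.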
by arpadav
5/11/2026 at 6:00:19 PM
Cudarc slaps!
> Most Rust CUDA crates obviously rely on calling CMake or nvcc, which can make compilation painfully slow.
I anecdotally haven't hit this; see the `cuda_setup` crate I made to handle the build scripts. It's a simple `build.rs` which only recompiles if the kernel file changes, and the compile time is tiny (compared to the Rust CPU-side code).
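(The pattern being described is roughly the following; this is a hypothetical sketch in the spirit of that approach, not `cuda_setup`'s actual code, and the paths and nvcc flags are illustrative. It requires the CUDA toolkit on PATH:)

```rust
// build.rs: only re-run nvcc when the kernel source actually changes.
use std::process::Command;

fn main() {
    // Cargo re-runs this build script only if kernel.cu changes;
    // otherwise the cached PTX from the previous build is reused.
    println!("cargo:rerun-if-changed=src/kernel.cu");

    let out_dir = std::env::var("OUT_DIR").unwrap();
    let status = Command::new("nvcc")
        .args(["--ptx", "src/kernel.cu", "-o"])
        .arg(format!("{out_dir}/kernel.ptx"))
        .status()
        .expect("failed to run nvcc; is the CUDA toolkit installed?");
    assert!(status.success(), "nvcc returned a non-zero exit code");
}
```

The `cargo:rerun-if-changed` directive is what keeps rebuilds cheap: the expensive nvcc call is skipped entirely on the common path where only host-side Rust changed.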
by the__alchemist
5/11/2026 at 6:03:48 PM
I'll have to check this out, thanks!
by arpadav
5/11/2026 at 5:55:29 PM
Do other people agree cuda-oxide looks like a near drop-in replacement for cudarc? That would be amazing, but IMO it's probably complementary rather than a replacement.
I am curious what distinguishes cuda-oxide, beyond it being totally under NVIDIA's control.
by jauntywundrkind
5/11/2026 at 6:02:49 PM
Perhaps not drop-in, but all my workflows with cudarc have always been "I make a CUDA kernel, I use cudarc for FFI to said kernels, I call via Rust", which for this case is pretty analogous.
Briefly looking at the repo, the main workflow seems to be using rustc-codegen-cuda to convert Rust -> MIR -> pliron IR -> LLVM IR -> PTX, which is embedded in the host binary; cuda-core then loads the embedded PTX onto the GPU at runtime.
But if you aren't directly writing CUDA kernels and just want to call existing kernels or access other CUDA driver APIs, then cudarc is the lighter-weight option? Or just use one of the sub-crates in this repo, like cuda-core, for those APIs.
by arpadav
5/12/2026 at 1:55:21 AM
Hi, author of cuda-oxide here. Yes, I think that's basically the right framing: cudarc and cuda-oxide sit at different points in the stack.
cudarc is a host-side CUDA API for Rust: loading modules, managing contexts/streams/events/memory, launching kernels, and accessing CUDA libraries/driver APIs. If your workflow is "I already have CUDA C++/PTX/CUBIN kernels and want to call them from Rust", cudarc is a very natural fit.
cuda-oxide is focused on the other side of the problem: writing the GPU kernel itself in Rust and compiling it through rustc/MIR into GPU code. The generated PTX is then embedded in the host binary and loaded at runtime by our host-side pieces.
We include cuda-core/cuda-host because we need an end-to-end path for “write Rust kernel, build it, launch it”, but that doesn’t mean the generated PTX is tied forever to our launcher. We’d like the PTX from cuda-oxide to be usable from other host-side CUDA APIs too, including cudarc, and we’re exploring ways to make that interop smoother.
So the short version is: cudarc is about driving CUDA from Rust; cuda-oxide is about generating CUDA device code from Rust. They’re complementary rather than replacements for each other.
We also have a short ecosystem note in the book that talks about cudarc: https://nvlabs.github.io/cuda-oxide/appendix/ecosystem.html#...
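(For readers unfamiliar with the host-side workflow being described, this is a rough sketch of loading PTX and launching a kernel with cudarc; the API names are from around cudarc 0.12 and vary between versions, the module/kernel names are illustrative, and it of course needs a CUDA GPU to actually run:)

```rust
// Sketch: host-side kernel launch with cudarc. The PTX could come
// from nvcc today, or in principle from a cuda-oxide-style codegen.
use cudarc::driver::{CudaDevice, LaunchAsync, LaunchConfig};
use cudarc::nvrtc::Ptx;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let dev = CudaDevice::new(0)?;

    // Load a PTX module and register the kernel we want to call.
    // "my_module" / "my_kernel" are placeholder names.
    dev.load_ptx(Ptx::from_file("kernel.ptx"), "my_module", &["my_kernel"])?;
    let f = dev.get_func("my_module", "my_kernel").unwrap();

    // Move inputs to the device, launch, copy the result back.
    let x = dev.htod_copy(vec![1.0f32; 1024])?;
    let mut y = dev.alloc_zeros::<f32>(1024)?;
    unsafe { f.launch(LaunchConfig::for_num_elems(1024), (&mut y, &x, 1024usize))? };
    let _y_host = dev.dtoh_sync_copy(&y)?;
    Ok(())
}
```

The interop goal mentioned above would mean the `Ptx::from_file(...)` input could be cuda-oxide's generated PTX, with cudarc driving everything else.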
by nihalpasham
5/11/2026 at 6:04:14 PM
I am observing the same from the article... is it heavily inspired by cudarc, i.e. is this intentional, or are we reading too much into this, given cudarc is a light abstraction over the CUDA API?
by the__alchemist