When using torch.compile() inside a CUDA Graph capture context, the compilation process fails because it attempts to access CUDA RNG state, which is prohibited during graph capture. This prevents ...
When using torch._inductor.aoti_compile_and_package on a model that contains both CPU and CUDA operations, the compiler incorrectly generates a call to the CUDA-specific kernel ...
NVIDIA's latest update to Compute Sanitizer introduces compile-time instrumentation to improve memory safety in CUDA C++ applications, reducing false negatives and enhancing bug detection. NVIDIA has ...