: The nvlink device linker can now produce PTX (Parallel Thread Execution) code as output. This lets applications benefit from Link Time Optimization (LTO) while keeping device code forward compatible, since the driver can JIT-compile PTX for newer GPU architectures.

: Performance enhancements for linear algebra operations.

For the full changelog, see the NVIDIA CUDA 12.6 Release Notes.

Download the exe installer from developer.nvidia.com.