Resources for Torch Internals (Autograd and More), CUDA, Compilers, and So On
06 Mar 2024
< Table of Contents >
- Torch Internals (Autograd and so on)
- GPU Programming, CUDA, Compiler and Fused Kernels
- Fault Tolerance
- Network Topology
- Others
Torch Internals (Autograd and so on)
- PyTorch internals from ezyang’s blog
- Overview of PyTorch Autograd Engine
- How Computational Graphs are Constructed in PyTorch
- How Computational Graphs are Executed in PyTorch
- PyTorch Autograd Explained - In-depth Tutorial from Elliot Waite
- PyTorch Hooks Explained - In-depth Tutorial from Elliot Waite
- Accelerating PyTorch with CUDA Graphs
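
As a quick orientation before reading the autograd resources above, here is a minimal pure-Python sketch of the general idea they cover: a graph is recorded during the forward pass and then walked in reverse topological order to accumulate gradients. This is not PyTorch's actual engine. The `Value` class and its `parents` and `backward_fn` attributes are hypothetical names chosen for illustration only.

```python
class Value:
    """A scalar that remembers which operation produced it (illustrative only)."""

    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0
        self.parents = parents    # graph edges, recorded during the forward pass
        self.backward_fn = None   # propagates this node's grad to its parents

    def __add__(self, other):
        out = Value(self.data + other.data, parents=(self, other))

        def backward_fn():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out.backward_fn = backward_fn
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, parents=(self, other))

        def backward_fn():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out.backward_fn = backward_fn
        return out

    def backward(self):
        # Build a topological order so each node runs after all of its consumers.
        order, seen = [], set()

        def visit(node):
            if id(node) in seen:
                return
            seen.add(id(node))
            for parent in node.parents:
                visit(parent)
            order.append(node)

        visit(self)
        self.grad = 1.0  # seed: d(output)/d(output) = 1
        for node in reversed(order):
            if node.backward_fn is not None:
                node.backward_fn()


x = Value(2.0)
y = Value(3.0)
z = x * y + x   # forward pass builds the graph
z.backward()    # backward pass walks it in reverse
print(z.data, x.grad, y.grad)
```

Here `x` is used twice, so its gradient is the sum of both paths (`y + 1`). The same accumulation behavior is what the graph-execution posts above describe in far more detail for the real engine.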
GPU Programming, CUDA, Compiler and Fused Kernels
- How do Graphics Cards Work? Exploring GPU Architecture from Branch Education
- Understanding GPU Playlist from Simon Oz
- How to Optimize a CUDA Matmul Kernel for cuBLAS-like Performance: a Worklog
- cuda-mode
- NVIDIA/TransformerEngine
- openai/blocksparse
- openai/triton
Fault Tolerance
- Multi-Datacenter Training: OpenAI’s Ambitious Plan To Beat Google’s Infrastructure
- torchrun (Elastic Launch)