This tutorial shows how to use profiling tools such as nsys (NVIDIA Nsight Systems), rocprof, and the PyTorch profiler in a simple transformers training loop. We will cover how to use the PyTorch profiler to ...
This tutorial assumes that the cluster is configured to run one job per Graphics Processing Unit (GPU). Users can submit job arrays, which are the default way to submit jobs for distributed ...
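As a rough preview of the PyTorch profiler workflow mentioned above, the sketch below wraps a few training steps in `torch.profiler.profile`. It is an illustration under stated assumptions, not the tutorial's actual training loop: the small feed-forward model, the random batches, the `profile_training_steps` helper, and the `train_step` label are all hypothetical stand-ins chosen to keep the example short. It falls back to CPU-only profiling when no GPU is visible.

```python
import torch
from torch import nn
from torch.profiler import ProfilerActivity, profile, record_function


def profile_training_steps(num_steps: int = 3):
    """Run a few training steps of a toy model under the PyTorch profiler."""
    torch.manual_seed(0)
    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Hypothetical stand-in for the transformer model used in the tutorial.
    model = nn.Sequential(
        nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10)
    ).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()

    # Always record CPU activity; add GPU kernels only when a GPU is present.
    activities = [ProfilerActivity.CPU]
    if device == "cuda":
        activities.append(ProfilerActivity.CUDA)

    with profile(activities=activities, record_shapes=True) as prof:
        for _ in range(num_steps):
            # Random data in place of a real dataset (assumption).
            inputs = torch.randn(32, 64, device=device)
            targets = torch.randint(0, 10, (32,), device=device)
            # Label the step so it shows up as its own row in the summary.
            with record_function("train_step"):
                optimizer.zero_grad()
                loss = loss_fn(model(inputs), targets)
                loss.backward()
                optimizer.step()
    return prof


if __name__ == "__main__":
    prof = profile_training_steps()
    # Print the ten most expensive operators by total CPU time.
    print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=10))
```

Labeling each step with `record_function` makes it easier to separate time spent in the training step from data preparation when reading the summary table.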