Given a Module that has been symbolically traced, we want to demonstrate how to write a pass that splits the nodes into a set of N partitions and reassembles the graph ...
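A minimal sketch of such a pass, assuming it can be built on the split_module helper in torch.fx.passes.split_module. TinyModel, N_PARTITIONS and the contiguous-chunk assignment are hypothetical choices made for illustration, not taken from the excerpt above:

```python
import torch
from torch.fx import symbolic_trace
from torch.fx.passes.split_module import split_module


class TinyModel(torch.nn.Module):
    """Hypothetical module used only for this illustration."""

    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x):
        y = self.linear(x)
        y = torch.relu(y)
        return y + x


N_PARTITIONS = 2  # assumed value of N

model = TinyModel()
traced = symbolic_trace(model)

# Only compute nodes are partitioned; placeholders, attribute reads
# and the output node stay in the top-level graph.
compute_nodes = [
    node for node in traced.graph.nodes
    if node.op not in ("placeholder", "get_attr", "output")
]

# Assign contiguous chunks in graph (topological) order, so partition
# ids never decrease along a dependency edge.
partition_of = {
    node: i * N_PARTITIONS // len(compute_nodes)
    for i, node in enumerate(compute_nodes)
}

# split_module asks the callback for a partition id per node and
# reassembles the graph as a parent module that calls one submodule
# per partition. The default of 0 is a guard for unlisted nodes.
split = split_module(traced, model, lambda node: partition_of.get(node, 0))

x = torch.randn(3, 4)
same_output = torch.allclose(split(x), model(x))
partition_names = [name for name, _ in split.named_children()]
```

The contiguous assignment is a deliberate choice here: an arbitrary assignment such as round-robin can make two partitions depend on each other, which cannot be reassembled into a valid graph.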
For a large reduction op, Inductor splits it into a few Triton kernels. The FX graph segment returned from get_kernel_metadata does not work. For example: def simple_sum_reduction(x): return torch.sum ...