How does PyTorch implement DistributedDataParallel?

torch.nn.parallel.DistributedDataParallel() is advertised as better than torch.nn.DataParallel(), even though it looks much more complicated.

We will evaluate this claim, explain what the wrapper does, and show how it is implemented. I assume you are familiar with PyTorch’s dynamic computational graph and with the Python GIL. The PyTorch version discussed here is v1.0.1.

Initialization

Because the distributed package is multi-process by nature, each process needs to initialize it before use.
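Here is a minimal sketch of that initialization, assuming a single process on CPU with the gloo backend so that it runs without GPUs. The helper `find_free_port` and the loopback address are my own illustrative choices, not part of the original setup. A real job would run one such call per process, each with its own `rank` and the shared `world_size`.

```python
import os
import socket

import torch.distributed as dist


def find_free_port() -> int:
    """Hypothetical helper: ask the OS for an unused TCP port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]


# The default env:// rendezvous reads the master address and port
# from these environment variables.
os.environ["MASTER_ADDR"] = "127.0.0.1"
os.environ["MASTER_PORT"] = str(find_free_port())

# Every process in the job makes this call with its own rank;
# in this sketch there is only one process.
dist.init_process_group(backend="gloo", rank=0, world_size=1)

rank = dist.get_rank()
world_size = dist.get_world_size()
is_ready = dist.is_initialized()

# Release the group so the sketch leaves no state behind.
dist.destroy_process_group()
```

In practice a launcher such as torch.distributed.launch usually sets the address, port, rank, and world size for every process, so they are rarely hard-coded like this.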

DistributedDataParallel Interface

DistributedDataParallel can be seen as a multi-node enhancement of DataParallel. Indeed, its implementation reuses the same methods that DataParallel relies on.
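From the caller's side, the wrapper is used much like DataParallel: wrap the module once, then call it as usual. The sketch below assumes a recent PyTorch release in which the wrapper accepts a CPU module with the gloo backend. As far as I know, the v1.0.1 wrapper expects CUDA devices, so treat this as an illustration of the interface and not of v1.0.1 behaviour. The toy model and the temporary-file rendezvous are illustrative choices.

```python
import tempfile

import torch
import torch.distributed as dist
from torch import nn
from torch.nn.parallel import DistributedDataParallel

# One-process group that rendezvouses through a temporary file,
# so no network port is needed for the sketch.
rendezvous = tempfile.NamedTemporaryFile(delete=False)
dist.init_process_group(
    backend="gloo",
    init_method=f"file://{rendezvous.name}",
    rank=0,
    world_size=1,
)

model = nn.Linear(4, 2)  # toy model, chosen only for illustration
ddp_model = DistributedDataParallel(model)  # CPU module: no device_ids given

# Forward and backward look exactly like the unwrapped module.
# The wrapper synchronizes gradients across processes during backward().
out = ddp_model(torch.randn(8, 4))
out.sum().backward()

out_shape = tuple(out.shape)
grad_shape = tuple(model.weight.grad.shape)

dist.destroy_process_group()
```

The wrapped module shares its parameters with the original `model`, which is why the gradient can be read from `model.weight.grad` after the backward pass.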
