# PyTorch Distributed
The Kubeflow PyTorch plugin leverages the Kubeflow training operator to offer a highly streamlined interface for conducting distributed training using different PyTorch backends.
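In practice, this means a regular Flyte task becomes a distributed PyTorchJob simply by passing the plugin's task config to the `@task` decorator. The following is a minimal sketch rather than code from the examples below: the replica count and resource requests are placeholder values, and the exact config fields (e.g. `num_workers` versus a `Worker(replicas=...)` object) may differ between plugin versions.

```python
import torch
import torch.distributed as dist
from flytekit import Resources, task
from flytekitplugins.kfpytorch import PyTorch


@task(
    # Placeholder config: two worker replicas and modest resources.
    # Depending on the plugin version, workers may instead be declared
    # via a Worker(replicas=...) object.
    task_config=PyTorch(num_workers=2),
    requests=Resources(cpu="2", mem="4Gi"),
)
def all_reduce_demo() -> float:
    # The training operator injects MASTER_ADDR, MASTER_PORT, RANK, and
    # WORLD_SIZE, so the default env:// rendezvous needs no extra setup.
    dist.init_process_group(backend="gloo")
    # Each replica contributes its rank; after the all-reduce every
    # replica holds the same sum.
    value = torch.tensor([float(dist.get_rank())])
    dist.all_reduce(value)
    dist.destroy_process_group()
    return value.item()
```

At execution time, Flyte submits a PyTorchJob to the training operator, which creates one pod per replica and runs the task body in each.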
## Install the plugin
To use the PyTorch plugin, run the following command:
```shell
$ pip install flytekitplugins-kfpytorch
```
To enable the plugin in the backend, follow the instructions outlined in the {ref}`deployment-plugin-setup-k8s` guide.
## Run the example on the Flyte cluster
To run the provided examples on the Flyte cluster, use the following commands:
Distributed PyTorch training:

```shell
$ pyflyte run --remote pytorch_mnist.py pytorch_training_wf
```
PyTorch Lightning training:

```shell
$ pyflyte run --remote pytorch_lightning_mnist_autoencoder.py train_workflow
```
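The Lightning example relies on the plugin's torch elastic support: the task function is launched once per local worker process, and Lightning detects the elastic environment instead of spawning its own processes. The sketch below only illustrates the shape of such a task; the tiny model, the synthetic data, and the `nnodes`/`nproc_per_node` values are placeholders rather than the MNIST autoencoder from the example, and it assumes the unified `lightning` package is installed.

```python
import lightning as L
import torch
from torch.utils.data import DataLoader, TensorDataset
from flytekit import task
from flytekitplugins.kfpytorch import Elastic


class TinyRegressor(L.LightningModule):
    """Placeholder module, not the autoencoder from the example."""

    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(8, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.mse_loss(self.layer(x), y)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.01)


# Placeholder sizes: one node with two worker processes.
@task(task_config=Elastic(nnodes=1, nproc_per_node=2))
def train_model() -> None:
    data = TensorDataset(torch.randn(64, 8), torch.randn(64, 1))
    # devices matches nproc_per_node; Lightning reuses the elastic-launched
    # processes rather than spawning new ones.
    trainer = L.Trainer(max_epochs=1, accelerator="cpu", devices=2, strategy="ddp")
    trainer.fit(TinyRegressor(), DataLoader(data, batch_size=16))
```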