Parallax

Parallax is a tool that automatically parallelizes training of a single-GPU deep learning model correctly and efficiently in distributed multi-GPU environments. Parallax handles the complicated correctness issues of auto-parallelization and applies various optimizations to minimize the communication overhead incurred by distributed training. If you are interested, you can find the technical details of Parallax in our arXiv paper.

Parallax is currently implemented on TensorFlow v1.6 and TensorFlow v1.11. When Parallax uses the Message Passing Interface (MPI), it requires the AllReduce and AllGather operations implemented in Horovod v0.11.2. We plan to support more TensorFlow versions.

Why Parallax?

Parallax makes it easier for users to do distributed training of a deep learning model developed for a single device (e.g., a GPU or CPU). A Parallax user simply specifies a single-device model graph and a resource specification for distributed training, and Parallax does the rest! For distributed training, Parallax supports two major communication styles: Parameter Server (PS) and Message Passing Interface (MPI). Users can choose which communication method to use for training their models, or let Parallax automatically choose the one that works better.
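
As a rough illustration, here is a minimal sketch of that workflow. It assumes an entry point like parallax.parallel_run that takes the single-device graph and a resource specification file; the exact function name, arguments, and return values are assumptions, so consult the Parallax documentation for the real API.

```python
# A minimal sketch, NOT the verbatim Parallax API: it assumes an entry point of
# the form parallax.parallel_run(graph, resource_info); check the Parallax
# documentation for the exact signature and return values.
import numpy as np
import tensorflow as tf
import parallax

# 1. Build an ordinary single-device graph, exactly as for single-GPU training.
single_gpu_graph = tf.Graph()
with single_gpu_graph.as_default():
    x = tf.placeholder(tf.float32, shape=[None, 784])
    y = tf.placeholder(tf.int64, shape=[None])
    logits = tf.layers.dense(x, 10)
    loss = tf.losses.sparse_softmax_cross_entropy(labels=y, logits=logits)
    train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

# 2. Hand the graph and a resource specification to Parallax, which transforms
#    the graph and returns a session-like handle for the distributed version.
#    'resource_info' is a hypothetical path to the resource specification file.
sess, num_workers, worker_id, num_replicas_per_worker = parallax.parallel_run(
    single_gpu_graph, 'resource_info')

# 3. The training loop is unchanged from the single-device program.
for step in range(100):
    batch_x = np.random.rand(32, 784).astype(np.float32)
    batch_y = np.random.randint(0, 10, size=32)
    sess.run(train_op, feed_dict={x: batch_x, y: batch_y})
```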

Parallax Execution Model

When a client initiates a deep learning job with a single-device computation graph, resource information, and optionally a flag that indicates either synchronous or asynchronous training, Parallax transforms the computation graph by analyzing its characteristics. Then, Parallax executes the transformed graph with its optimized communication layer in the distributed environment.
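
For concreteness, the sketch below illustrates the kind of inputs a client supplies: a plain-text resource specification listing each machine and its GPUs, and a flag selecting synchronous or asynchronous training. The file format, the sync option name, and the parallel_run signature shown here are illustrative assumptions rather than the verbatim Parallax interface.

```python
# Illustrative only: a hypothetical resource specification and sync/async flag.
#
# A hypothetical 'resource_info' file might list one machine per line together
# with the GPU indices available on that machine, e.g.:
#
#   host1.example.com: 0,1,2,3,4,5
#   host2.example.com: 0,1,2,3,4,5
#
# The client then launches training with the single-device graph, the resource
# file, and (optionally) the synchronization mode. The 'sync' keyword below is
# an assumed option name.
import parallax

# single_gpu_graph is the single-device graph built as in the earlier sketch.
sess, num_workers, worker_id, num_replicas_per_worker = parallax.parallel_run(
    single_gpu_graph,      # the unmodified single-device graph
    'resource_info',       # path to the resource specification file
    sync=True)             # True: synchronous training, False: asynchronous
```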

Parallax Benchmark

To give you an idea of how well Parallax performs, we present the following chart, which shows the results of experiments done on a cluster of eight machines connected via Mellanox ConnectX-4 cards with 100Gbps InfiniBand. Each machine has six NVIDIA GeForce TITAN Xp GPU cards.

Parallax outperforms TensorFlow for both ResNet-50 and LM1B. In addition, Parallax outperforms Horovod for LM1B.

Troubleshooting

See the Troubleshooting page and submit a new issue or contact us if you cannot find an answer.

Contact us

To contact us, send an email to [email protected].

License

Apache License 2.0
