Q3 Upstreaming Refactor Design Doc

Problem 1: nGraph supports more hardware backends than MXNet

We’re starting to run into an issue: nGraph supports multiple backends for the same hardware, and many more hardware targets than MXNet does. I’m starting to think that co-opting MXNet’s CPU and GPU contexts is the wrong way to integrate nGraph. Instead, we should have an nGraph context that takes a string like CPU, IE:CPU, or GPU:2 to select the proper nGraph backend. That way we only need to maintain one context instead of many, and we can have a per-backend mechanism for selecting the default MXNet kernels to fall back on. https://github.com/NervanaSystems/ngraph-mxnet/issues/289
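One way to picture the proposed context string is a small parser plus a per-backend fallback table. The sketch below is illustrative only: NGraphBackend, parse_ngraph_backend, fallback_device, and FALLBACK_CONTEXT are hypothetical names, not part of MXNet or nGraph. Splitting a string such as IE:CPU into a backend name and a qualifier is an assumption about how these strings would be interpreted.

```python
from typing import NamedTuple, Optional


class NGraphBackend(NamedTuple):
    """Parsed form of a backend string (hypothetical structure)."""
    name: str                 # text before the first ':', e.g. "CPU", "IE", "GPU"
    qualifier: Optional[str]  # text after the first ':', e.g. "CPU" or "2"; None if absent


# Hypothetical table: which stock MXNet device type supplies the fallback
# kernels for operators that a given nGraph backend cannot run.
FALLBACK_CONTEXT = {"CPU": "cpu", "IE": "cpu", "GPU": "gpu"}


def parse_ngraph_backend(spec: str) -> NGraphBackend:
    """Split a string such as "CPU", "IE:CPU", or "GPU:2" into its parts."""
    name, sep, qualifier = spec.strip().partition(":")
    if not name or (sep and not qualifier):
        raise ValueError(f"malformed nGraph backend string: {spec!r}")
    return NGraphBackend(name, qualifier if sep else None)


def fallback_device(backend: NGraphBackend) -> str:
    """Return the MXNet device type used for fallback kernels (assumes CPU if unknown)."""
    return FALLBACK_CONTEXT.get(backend.name, "cpu")
```

With a single table keyed on the backend name, adding a new nGraph backend would mean adding one entry rather than defining a new MXNet context.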

Problem 1a: Testing with nGraph Context

The main issue with this idea is that validation and model testing become more complicated. Existing scripts would default to the plain CPU context rather than nGraph, so we’d need to make custom test_ngraph_* scripts that import the unit tests and call set_default_context to use nGraph. https://github.com/NervanaSystems/ngraph-mxnet/issues/311
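The wrapper-script pattern can be sketched without MXNet installed. Everything below is a stand-in: set_default_context and default_context are local functions that only mimic the idea of a module-level default context, test_elementwise_add is a placeholder for an imported unit test, and the "ngraph:CPU" string is a hypothetical context value, since no nGraph context exists yet.

```python
# Stand-in for a test-utility module: tests consult a module-level default context.
_default_context = "cpu"


def set_default_context(ctx):
    """Change the context that subsequently run tests will pick up."""
    global _default_context
    _default_context = ctx


def default_context():
    """Return the context tests should allocate their arrays on."""
    return _default_context


def test_elementwise_add():
    """Placeholder for an existing unit test; it reports the context it ran under."""
    ctx = default_context()
    assert 1 + 2 == 3  # the real test body would build arrays on ctx
    return ctx


def run_with_context(ctx, tests):
    """What a test_ngraph_* script would do: switch the default, then rerun the same tests."""
    previous = default_context()
    set_default_context(ctx)
    try:
        return {test.__name__: test() for test in tests}
    finally:
        set_default_context(previous)  # restore so other scripts are unaffected


if __name__ == "__main__":
    print(run_with_context("ngraph:CPU", [test_elementwise_add]))
```

The unit tests themselves stay unchanged. Only the thin wrapper knows about nGraph, which keeps the extra maintenance cost to one small script per test module.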

Problem 2: MXNet has asked acceleration libraries to use a unified Subgraph interface

To support this, the first stage is to stop creating a new NNVM operator for every graph and instead emit a stateful Subgraph Operator node back to NNVM to run. https://github.com/NervanaSystems/ngraph-mxnet/issues/324

The second stage will be to use the fusion passes provided in the subgraph branch to call into the bridge, instead of relying on a modified GraphExecutor, CachedOp, etc. https://github.com/NervanaSystems/ngraph-mxnet/issues/288

Problem 3: MXNet has plans to deprecate GNU make in favor of CMake

Extend the current CMake build system to support nGraph: https://github.com/NervanaSystems/ngraph-mxnet/issues/307

Problem 4: Versioning and Packaging

MXNet is a big fan of including third-party libraries as submodules, so we need to add nGraph as a submodule. This also gives us direct control over the nGraph version. https://github.com/NervanaSystems/ngraph-mxnet/issues/308

It would be nice if the bridge could also be a submodule to ease development. That's trickier, because we include a lot of MXNet headers. Not sure it's possible. https://github.com/NervanaSystems/ngraph-mxnet/issues/325
