torch::deploy¶
torch::deploy is a system that allows you to run multiple embedded Python
interpreters in a C++ process without a shared global interpreter lock. For more
information on how torch::deploy works internally, please see the related
arXiv paper.
Warning
This is a prototype feature. Only Linux x86 is supported, and the API may change without warning.
Getting Started¶
Installing torch::deploy¶
torch::deploy is not yet built by default in our binary releases, so to get
a copy of libtorch with torch::deploy enabled, follow the instructions for
building PyTorch from source.
When running setup.py, you will need to set USE_DEPLOY=1, for example:
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
export USE_DEPLOY=1
python setup.py bdist_wheel
python -mpip install dist/*.whl
Creating a model package in Python¶
torch::deploy can load and run Python models that are packaged with
torch.package. You can learn more about torch.package in the
torch.package documentation.
For now, let’s create a simple model that we can load and run in torch::deploy.
from torch.package import PackageExporter
import torchvision
# Instantiate some model
model = torchvision.models.resnet.resnet18()
# Package and export it.
with PackageExporter("my_package.pt") as e:
    e.intern("torchvision.**")
    e.extern("sys")
    e.save_pickle("model", "model.pkl", model)
Now, there should be a file named my_package.pt in your working directory.
Note
Currently, torch::deploy supports only the Python standard library and
torch as extern modules in torch.package. In the future we plan
to transparently support any Conda environment you point us to.
Loading and running the model in C++¶
Let’s create a minimal C++ program that loads the model.
#include <torch/deploy.h>
#include <torch/script.h>

#include <iostream>
#include <memory>

int main(int argc, const char* argv[]) {
  if (argc != 2) {
    std::cerr << "usage: example-app <path-to-package>\n";
    return -1;
  }

  // Start an interpreter manager governing 4 embedded interpreters.
  torch::deploy::InterpreterManager manager(4);

  try {
    // Load the model from the torch.package.
    torch::deploy::Package package = manager.loadPackage(argv[1]);
    torch::deploy::ReplicatedObj model = package.loadPickle("model", "model.pkl");
  } catch (const c10::Error& e) {
    std::cerr << "error loading the model\n";
    return -1;
  }

  std::cout << "ok\n";
}
This small program introduces many of the core concepts of torch::deploy.
An InterpreterManager abstracts over a collection of independent Python
interpreters, allowing you to load balance across them when running your code.
Using the InterpreterManager::loadPackage method, you can load a
torch.package from disk and make it available to all interpreters.
Package::loadPickle allows you to retrieve specific Python objects
from the package, like the ResNet model we saved earlier.
Finally, the model itself is a ReplicatedObj. This is an abstract handle to
an object that is replicated across multiple interpreters. When you interact
with a ReplicatedObj (for example, by calling forward), it will select
a free interpreter to execute that interaction.
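If you instead want several operations to happen on the same interpreter (for example, to keep intermediate Python state alive between calls), the handle can hand out an explicit session. The sketch below reuses the model handle from the program above and is only illustrative: the acquireSession, InterpreterSession::self, and toIValue names are assumptions based on torch/csrc/deploy/deploy.h at the time of writing and may differ in your build.
// Illustrative sketch reusing the model handle from the program above. The
// acquireSession/self/toIValue names are assumptions based on
// torch/csrc/deploy/deploy.h and may change while torch::deploy is a prototype.
{
  torch::deploy::InterpreterSession session = model.acquireSession();
  // session.self is the unpickled model object living on this one interpreter;
  // calling it runs forward on that same interpreter every time.
  torch::deploy::Obj result = session.self({torch::ones({1, 3, 224, 224})});
  std::cout << result.toIValue().toTensor().sizes() << '\n';
}  // the interpreter is released when the session goes out of scope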
Building and running the application¶
Assuming the above C++ program was stored in a file called example-app.cpp, a minimal CMakeLists.txt file would look like:
cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
project(deploy_tutorial)
find_package(Torch REQUIRED)
add_executable(example-app example-app.cpp)
target_link_libraries(example-app "${TORCH_LIBRARIES}")
set_property(TARGET example-app PROPERTY CXX_STANDARD 14)
The last step is configuring and building the project. Assuming that our code directory is laid out like this:
example-app/
    CMakeLists.txt
    example-app.cpp
We can now run the following commands to build the application from within the
example-app/ folder:
mkdir build
cd build
# Point CMake at the built version of PyTorch we just installed.
SITE_PACKAGES="$(python -c 'from distutils.sysconfig import get_python_lib; print(get_python_lib())')"
cmake -DCMAKE_PREFIX_PATH="$SITE_PACKAGES/torch" ..
cmake --build . --config Release
Now we can run our app:
./example-app /path/to/my_package.pt
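If everything worked, the program should print ok, since that is all our example does after successfully loading the model.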
Executing forward in C++¶
Once you have your model loaded in C++, it is easy to execute it:
// Create a vector of inputs.
std::vector<torch::jit::IValue> inputs;
inputs.push_back(torch::ones({1, 3, 224, 224}));
// Execute the model and turn its output into a tensor.
at::Tensor output = model(inputs).toTensor();
std::cout << output.slice(/*dim=*/1, /*start=*/0, /*end=*/5) << '\n';
Notably, the model’s forward function executes in Python, in an embedded
CPython interpreter. Because the model is a ReplicatedObj, you can call
model() from multiple threads, and the forward calls will run on multiple
independent interpreters with no shared global interpreter lock.
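As a rough sketch of what that enables, the snippet below calls the same ReplicatedObj from several worker threads. Only the model(inputs) call is taken from the example above; the runConcurrently helper, the thread count, and the input shape are arbitrary choices for illustration, and the snippet assumes the includes from the earlier program plus <thread> and <vector>.
#include <thread>
#include <vector>

// Hypothetical helper for illustration: run forward from several threads at once.
// Each call is dispatched to whichever embedded interpreter is free, so the
// threads do not serialize on a shared global interpreter lock.
void runConcurrently(torch::deploy::ReplicatedObj& model) {
  std::vector<std::thread> workers;
  for (int i = 0; i < 4; ++i) {
    workers.emplace_back([&model] {
      std::vector<torch::jit::IValue> inputs;
      inputs.push_back(torch::ones({1, 3, 224, 224}));
      // Execute forward; the output could be aggregated or logged here.
      at::Tensor output = model(inputs).toTensor();
      (void)output;
    });
  }
  for (auto& worker : workers) {
    worker.join();
  }
}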