Custom Components
This guide shows how to build a simple app, write a custom component spec for it, and launch it via two different schedulers.
See the Quickstart Guide for installation and basic usage.
Hello World
Let's start by writing a simple “Hello World” Python app. This is just a normal Python program and can contain anything you'd like.
Note
This example uses the Jupyter Notebook %%writefile magic to create local files for demonstration purposes. Under normal usage you would keep these as standalone files.
[1]:
%%writefile my_app.py
import sys
import argparse


def main(user: str) -> None:
    print(f"Hello, {user}!")


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Hello world app"
    )
    parser.add_argument(
        "--user",
        type=str,
        help="the person to greet",
        required=True,
    )
    args = parser.parse_args(sys.argv[1:])
    main(args.user)
Overwriting my_app.py
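To check the argument handling without launching a separate process, the parser can be exercised in-process. The following is a minimal sketch that restates the app above; `build_parser` and `greeting` are hypothetical helper names introduced only for this illustration and are not part of my_app.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical helper: the same parser as in my_app.py, factored out
    # so it can be called with an explicit argument list.
    parser = argparse.ArgumentParser(description="Hello world app")
    parser.add_argument(
        "--user",
        type=str,
        help="the person to greet",
        required=True,
    )
    return parser


def greeting(user: str) -> str:
    # Hypothetical helper: returns the text that my_app.py prints.
    return f"Hello, {user}!"


# Parse an explicit list instead of sys.argv[1:].
args = build_parser().parse_args(["--user", "your name"])
message = greeting(args.user)
print(message)
```

Because `--user` is declared with `required=True`, calling `parse_args([])` exits with a usage error rather than returning a namespace.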
Now that we have an app, we can write the component file for it. The component function lets us reuse and share our app in a user-friendly way.
We can use this component from the torchx CLI or programmatically as part of a pipeline.
[2]:
%%writefile my_component.py
import torchx.specs as specs


def greet(user: str, image: str = "my_app:latest") -> specs.AppDef:
    return specs.AppDef(
        name="hello_world",
        roles=[
            specs.Role(
                name="greeter",
                image=image,
                entrypoint="python",
                args=[
                    "-m", "my_app",
                    "--user", user,
                ],
            )
        ],
    )
Overwriting my_component.py
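It can help to see which command the role above describes. The sketch below uses a plain dataclass, `RoleSketch`, as a hypothetical stand-in for the fields passed to specs.Role; it is not the TorchX class. It also assumes that the process is started as the entrypoint followed by the args, which is an illustration rather than a statement about any scheduler's internals.

```python
import shlex
from dataclasses import dataclass, field
from typing import List


@dataclass
class RoleSketch:
    # Hypothetical stand-in mirroring the fields used in the component above.
    # This is NOT torchx.specs.Role.
    name: str
    image: str
    entrypoint: str
    args: List[str] = field(default_factory=list)


def command_line(role: RoleSketch) -> str:
    # Assumption: the command is the entrypoint followed by the args.
    # shlex.join quotes arguments that contain spaces.
    return shlex.join([role.entrypoint, *role.args])


role = RoleSketch(
    name="greeter",
    image="my_app:latest",
    entrypoint="python",
    args=["-m", "my_app", "--user", "your name"],
)
cmd = command_line(role)
print(cmd)
```

Keeping each argument as its own list element, as the component does, avoids quoting problems when a value such as the user name contains spaces.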
We can execute our component via torchx run. The local_cwd scheduler executes the component relative to the current directory.
[3]:
%%sh
torchx run --scheduler local_cwd my_component.py:greet --user "your name"
torchx 2024-07-17 02:04:04 INFO     Tracker configurations: {}
torchx 2024-07-17 02:04:04 INFO     Log directory not set in scheduler cfg. Creating a temporary log dir that will be deleted on exit. To preserve log directory set the `log_dir` cfg option
torchx 2024-07-17 02:04:04 INFO     Log directory is: /tmp/torchx_b44hv08a
torchx 2024-07-17 02:04:04 INFO     Waiting for the app to finish...
greeter/0 Hello, your name!
torchx 2024-07-17 02:04:05 INFO     Job finished: SUCCEEDED
local_cwd://torchx/hello_world-l72k6xzs9nl7qc
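The effect of running relative to the current directory can be approximated with the standard library alone. The sketch below is not how the local_cwd scheduler is implemented; it only assumes that the component's command, `python -m my_app --user ...`, is executed with the working directory set to the folder containing my_app.py. A temporary directory stands in for that folder.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# A copy of the app from the first cell, written out for this sketch.
APP_SOURCE = '''\
import sys
import argparse


def main(user: str) -> None:
    print(f"Hello, {user}!")


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Hello world app")
    parser.add_argument("--user", type=str, help="the person to greet", required=True)
    args = parser.parse_args(sys.argv[1:])
    main(args.user)
'''

with tempfile.TemporaryDirectory() as workdir:
    # Stand-in for the directory that the command is launched from.
    Path(workdir, "my_app.py").write_text(APP_SOURCE)
    result = subprocess.run(
        [sys.executable, "-m", "my_app", "--user", "your name"],
        cwd=workdir,  # "-m my_app" resolves the module from the working directory
        capture_output=True,
        text=True,
        check=True,
    )

print(result.stdout, end="")
```

If the working directory did not contain my_app.py, `python -m my_app` would fail to find the module, which is why the directory the command runs from matters for this scheduler.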
If we want to run in other environments, we can build a Docker image so that our component can run in Docker-enabled environments such as Kubernetes, or locally via the Docker scheduler.
Note
This requires Docker to be installed and won't work in environments such as Google Colab. If you have not done so already, follow the install instructions at https://docs.docker.com/get-docker/
[4]:
%%writefile Dockerfile.custom
FROM ghcr.io/pytorch/torchx:0.1.0rc1
ADD my_app.py .
Overwriting Dockerfile.custom
Once the Dockerfile is created, we can build our Docker image.
[5]:
%%sh
docker build -t my_app:latest -f Dockerfile.custom .
#0 building with "default" instance using docker driver
#1 [internal] load build definition from Dockerfile.custom
#1 transferring dockerfile: 158B done
#1 DONE 0.0s
#2 [internal] load metadata for ghcr.io/pytorch/torchx:0.1.0rc1
#2 DONE 0.4s
#3 [internal] load .dockerignore
#3 transferring context: 2B done
#3 DONE 0.0s
#4 [1/2] FROM ghcr.io/pytorch/torchx:0.1.0rc1@sha256:a738949601d82e7f100fa1efeb8dde0c35ce44c66726cf38596f96d78dcd7ad3
#4 DONE 0.0s
#5 [internal] load build context
#5 transferring context: 484B done
#5 DONE 0.0s
#6 [2/2] ADD my_app.py .
#6 CACHED
#7 exporting to image
#7 exporting layers done
#7 writing image sha256:593705c4d39b8ee102d0f6f22e670c233a1d92b4ad03a000c6ce1e410d793c16 done
#7 naming to docker.io/library/my_app:latest done
#7 DONE 0.0s
We can then launch it on the local Docker scheduler (local_docker).
[6]:
%%sh
torchx run --scheduler local_docker my_component.py:greet --image "my_app:latest" --user "your name"
torchx 2024-07-17 02:04:06 INFO     Tracker configurations: {}
torchx 2024-07-17 02:04:06 INFO     Checking for changes in workspace `file:///home/ec2-user/torchx/docs/source`...
torchx 2024-07-17 02:04:06 INFO     To disable workspaces pass: --workspace="" from CLI or workspace=None programmatically.
torchx 2024-07-17 02:04:06 INFO     Workspace `file:///home/ec2-user/torchx/docs/source` resolved to filesystem path `/home/ec2-user/torchx/docs/source`
torchx 2024-07-17 02:04:07 WARNING  failed to pull image my_app:latest, falling back to local: 404 Client Error for http+docker://localhost/v1.44/images/create?tag=latest&fromImage=my_app: Not Found ("pull access denied for my_app, repository does not exist or may require 'docker login': denied: requested access to the resource is denied")
torchx 2024-07-17 02:04:07 INFO     Building workspace docker image (this may take a while)...
torchx 2024-07-17 02:04:07 INFO     Step 1/4 : ARG IMAGE
torchx 2024-07-17 02:04:07 INFO     Step 2/4 : FROM $IMAGE
torchx 2024-07-17 02:04:07 INFO      ---> 593705c4d39b
torchx 2024-07-17 02:04:07 INFO     Step 3/4 : COPY . .
torchx 2024-07-17 02:04:07 INFO      ---> 0f94922fb1c1
torchx 2024-07-17 02:04:07 INFO     Step 4/4 : LABEL torchx.pytorch.org/version=0.7.0
torchx 2024-07-17 02:04:07 INFO      ---> Running in c97dda5daba1
torchx 2024-07-17 02:04:07 INFO      ---> Removed intermediate container c97dda5daba1
torchx 2024-07-17 02:04:07 INFO      ---> 2b9fac3b2ebb
torchx 2024-07-17 02:04:07 INFO     [Warning] One or more build-args [WORKSPACE] were not consumed
torchx 2024-07-17 02:04:07 INFO     Successfully built 2b9fac3b2ebb
torchx 2024-07-17 02:04:07 INFO     Built new image `sha256:2b9fac3b2ebbe5ad86995a981d772a7481555fa7e497aab8207c1aee1cb48a50` based on original image `my_app:latest` and changes in workspace `file:///home/ec2-user/torchx/docs/source` for role[0]=greeter.
torchx 2024-07-17 02:04:08 INFO     Waiting for the app to finish...
greeter/0 Hello, your name!
torchx 2024-07-17 02:04:09 INFO     Job finished: SUCCEEDED
local_docker://torchx/hello_world-nbt0qx5vk3cvzc
If you have a Kubernetes cluster, you can use the Kubernetes scheduler to launch this on the cluster instead.
$ docker push my_app:latest
$ torchx run --scheduler kubernetes my_component.py:greet --image "my_app:latest" --user "your name"
Builtins
TorchX also provides a number of builtin components with premade images. You can discover them via:
[7]:
%%sh
torchx builtins
Found 11 builtin components:
  1. dist.ddp
  2. dist.spmd
  3. metrics.tensorboard
  4. serve.torchserve
  5. utils.binary
  6. utils.booth
  7. utils.copy
  8. utils.echo
  9. utils.python
 10. utils.sh
 11. utils.touch
You can use these from the CLI, from a pipeline, or programmatically, just as you would any other component.
[8]:
%%sh
torchx run utils.echo --msg "Hello :)"
torchx 2024-07-17 02:04:11 INFO     Tracker configurations: {}
torchx 2024-07-17 02:04:11 INFO     Checking for changes in workspace `file:///home/ec2-user/torchx/docs/source`...
torchx 2024-07-17 02:04:11 INFO     To disable workspaces pass: --workspace="" from CLI or workspace=None programmatically.
torchx 2024-07-17 02:04:11 INFO     Workspace `file:///home/ec2-user/torchx/docs/source` resolved to filesystem path `/home/ec2-user/torchx/docs/source`
torchx 2024-07-17 02:04:12 INFO     Building workspace docker image (this may take a while)...
torchx 2024-07-17 02:04:12 INFO     Step 1/4 : ARG IMAGE
torchx 2024-07-17 02:04:12 INFO     Step 2/4 : FROM $IMAGE
torchx 2024-07-17 02:04:12 INFO      ---> 2fd60971a176
torchx 2024-07-17 02:04:12 INFO     Step 3/4 : COPY . .
torchx 2024-07-17 02:04:12 INFO      ---> 89d40e4a8fb5
torchx 2024-07-17 02:04:12 INFO     Step 4/4 : LABEL torchx.pytorch.org/version=0.7.0
torchx 2024-07-17 02:04:12 INFO      ---> Running in e077d6aa7bf2
torchx 2024-07-17 02:04:12 INFO      ---> Removed intermediate container e077d6aa7bf2
torchx 2024-07-17 02:04:12 INFO      ---> 89cf730f8a5a
torchx 2024-07-17 02:04:12 INFO     [Warning] One or more build-args [WORKSPACE] were not consumed
torchx 2024-07-17 02:04:12 INFO     Successfully built 89cf730f8a5a
torchx 2024-07-17 02:04:12 INFO     Built new image `sha256:89cf730f8a5a91a01dc463800aa151aa87dd965a15b0e2f331e5ccc1cd4fe0b4` based on original image `ghcr.io/pytorch/torchx:0.7.0` and changes in workspace `file:///home/ec2-user/torchx/docs/source` for role[0]=echo.
torchx 2024-07-17 02:04:13 INFO     Waiting for the app to finish...
torchx 2024-07-17 02:04:13 INFO     Job finished: SUCCEEDED
echo/0 Hello :)
local_docker://torchx/echo-jdh7s6zhqzvczc