Run on a Ray cluster
Work in progress
Remote cluster execution via Ray is functional but evolving. For the latest cluster configuration options, consult the Ray documentation.
For workloads too large for a single machine, radCAD can distribute runs across a Ray cluster (AWS, GCP, Kubernetes, …) using the RAY_REMOTE backend.
Install the Ray extension
Provision a cluster
Export your cloud credentials (AWS shown; see Ray docs for other providers):
export AWS_ACCESS_KEY_ID=***
export AWS_SECRET_ACCESS_KEY=***
Start a cluster (or connect to an existing one) using one of the configs in cluster/aws/:
# Single m5.large EC2 instance in us-west-2
ray up cluster/aws/minimal.yaml
# Verify the connection
ray exec cluster/aws/minimal.yaml 'echo "hello world"'
Run the experiment remotely
Connect to the cluster head, then select the RAY_REMOTE backend:
import ray
from radcad import Engine, Backend
# Connect to the cluster head node
ray.init(address="***:6379", _redis_password="***")
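# `experiment` is an existing radcad Experiment, defined elsewhere
experiment.engine = Engine(backend=Backend.RAY_REMOTE)
# Runs are distributed across the connected cluster
result = experiment.run()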
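The snippet above assumes an experiment object already exists. As a minimal sketch of a complete script, the following builds a toy one-variable model and runs it remotely. The update_population function, the growth_rate parameter, and the run_remote helper are hypothetical names chosen for illustration, not part of radCAD. radCAD and Ray are imported inside the helper so that the model function can be imported and tested on a machine without a cluster.

```python
def update_population(params, substep, state_history, previous_state, policy_input):
    """State update function: grow the population by a fixed rate each timestep."""
    new_value = previous_state["population"] * (1 + params["growth_rate"])
    return "population", new_value


def run_remote(address, password=None):
    """Run the toy model on an existing Ray cluster (needs radcad and ray installed)."""
    import ray
    from radcad import Backend, Engine, Experiment, Model, Simulation

    model = Model(
        initial_state={"population": 100.0},
        state_update_blocks=[
            {"policies": {}, "variables": {"population": update_population}}
        ],
        params={"growth_rate": [0.01, 0.02]},  # sweep over two parameter values
    )
    simulation = Simulation(model=model, timesteps=10, runs=2)
    experiment = Experiment(simulation)

    # With RAY_REMOTE the caller connects to the cluster before running.
    init_kwargs = {"address": address}
    if password is not None:
        init_kwargs["_redis_password"] = password
    ray.init(**init_kwargs)

    experiment.engine = Engine(backend=Backend.RAY_REMOTE)
    return experiment.run()
```

Calling run_remote with the head node address (and password, if the cluster uses one) returns the same result structure as a local run; only the engine backend differs.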
Warning
With RAY_REMOTE, you are responsible for calling ray.init(address=..., ...) to connect to the cluster before running. The local RAY backend, by contrast, initialises a local Ray instance for you.
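One way to make this responsibility explicit is a small guard that connects only when no Ray session is active. This is a sketch, not part of radCAD: ensure_connected is a hypothetical helper, while is_initialized() and init() are Ray's own calls. The Ray module is passed in as an argument purely so the logic can be exercised without a cluster.

```python
def ensure_connected(ray_module, address, **init_kwargs):
    """Connect to the Ray cluster at `address` unless a session is already active.

    Returns True if a new connection was made, False if one already existed.
    """
    if ray_module.is_initialized():
        return False
    ray_module.init(address=address, **init_kwargs)
    return True
```

In a real script, import ray and call ensure_connected(ray, "***:6379", _redis_password="***") before experiment.run().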
Tear down the cluster
# Stop the cluster started from the same config and terminate its instances
ray down cluster/aws/minimal.yaml
See also
- Choose a processing backend: the RAY (local) backend and the others.
- ExecutorRayRemote in the API reference.